New Journey of HBase in Alibaba & Cloud
1 .New Journey of HBase in Alibaba and Cloud
Eight Years Sharpening One Sword: HBase's New Journey at Alibaba and on the Cloud
Chunhui Shen and Long Cao, August 17, 2018
2 .Content
01 AliHB-Introduction of Alibaba HBase
History, Tech Overview, Open Source, Core Scenarios
02 Recent Key Challenges & Improvements
GC Trouble, Separation of Computing & Storage, Cold-Hot Data, Diagnostic System, Migration & Backup
03 HBase Ecosystem & Multi-model DB & Cloud
KV, Tabular, SQL, Graph, Time Series, Geospatial, Search, Mixed Workloads, Cloud
3 .01 AliHB-Introduction of Alibaba HBase
4 .HBase History in Alibaba
(Diagram labels: MySQL, Oracle; Data Burst; Commercial Big Data Store System; Open Source, Cassandra; Develop New; Our Choice in 2010)
• Used Version
– 0.20->0.90->0.92->0.94->0.98->1.1->2.0
– In use since 2010
• The earliest cases in 2010-2011
– Search Store
– Taobao History Order
– Alipay Risk Management
• Internal branch AliHB
• Why HBase
– Active community
– Hadoop ecosystem
– Facebook's successful case
– Google's famous paper: Bigtable
5 .Overview of AliHB
• Performance
– High-Performance Data Structure, Lock-Free, Group IO
• Feature
– SQL, Secondary Index
– Multi-Tenants, Cold-Hot Separation, Async API
• Stability
– High Availability Architecture
– Faster MTTR
– Verification in Double 11 Shopping Day
• Efficient Maintenance
– Effective Monitoring
– Full Path Trace
– No-pause migration
• 12000+ Nodes, 100+ Clusters, 200+ Million OPS, 100+ PB Data
• 20+ BU, 6000+ Users, 100+ Production Changes per Day
6 .Open Source and Community
• Contributing to open source since 2011
• 3 PMC members and 6 committers in Alibaba
• Sponsoring the Chinese HBase Technology Community
– Already organized 2 HBase Meetups
– At least one HBase-related tech article per day
– Tens of thousands of readers now, and more are coming
• Hosting HBaseCon Asia 2018
• Promoting the use of HBase through several conference talks
• Hoping more people will join the HBase community
7 .Core Scenarios in Alibaba
Scenario types: Monitor, Log, AI Storage, Recommendation, Message, Orders, Feeds, Tracking, IoT Data, Search, BI Report, …
Example businesses on Ali-HBase: Ant Intelligent Security, Wangwang (旺旺, IM), Intelligent Customer Service, Log, Alipay Bills, Cainiao Logistics
8 .02 Recent Key Challenges & Improvements
9 .GC Trouble
GC Problems Under 100GB Memory
(Diagram labels: Frequent, Very Slow, Service, Slow Request, Request Unavailable)
10 .GC Trouble
(Diagram labels: Only for offline application; Exploring a Thorough Solution; Rewriting with C++)
11 .GC Trouble
Before:
Type    Pause Time    Frequency
YGC     100ms+        Once per 5 Secs
CMS     100~500ms     Once per 5 Mins
FGC     20s-180s      Once per 7~60 Days
Improvements:
• Allocate and reclaim the major memory by HBase itself, rather than the JVM: CCSMap, BucketCacheV2
• New GC algorithm in AJDK: ZenGC
• Try best to reuse objects (in the core path) when programming
After:
Type    Pause Time    Frequency
YGC     5ms           Once per 5 Secs
CMS     100ms         Once per 5 Hours
FGC     N/A           N/A
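To compare the figures on this slide, the share of wall-clock time spent in GC pauses can be estimated as pause time divided by the interval between pauses. The following is a back-of-the-envelope sketch only: it takes the lower bound wherever the slide gives a range (for example 100 ms for "100ms+"), leaves FGC out, and the helper name `pause_fraction` is made up for this illustration.

```python
def pause_fraction(pause_ms: float, interval_s: float) -> float:
    """Fraction of wall-clock time spent paused, assuming one pause per interval."""
    return (pause_ms / 1000.0) / interval_s


# Figures taken from the slide; lower bounds are used where a range is given.
before = {
    "YGC": pause_fraction(100, 5),         # 100ms+ once per 5 secs
    "CMS": pause_fraction(100, 5 * 60),    # 100~500ms once per 5 mins
}
after = {
    "YGC": pause_fraction(5, 5),           # 5ms once per 5 secs
    "CMS": pause_fraction(100, 5 * 3600),  # 100ms once per 5 hours
}

for name in ("YGC", "CMS"):
    print(f"{name}: {before[name]:.4%} -> {after[name]:.4%}")
```

Under these assumptions the young-generation pause share drops from about 2% of wall-clock time to about 0.1%.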
12 .GC Trouble New BucketCache in HBase-2.0 CCSMap in HBase-3.0
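The GC slides describe HBase allocating and reclaiming its major memory itself and reusing objects on the core path. The general idea can be illustrated with a small pool of fixed-size buffers that are handed out and returned instead of being created per request. This is a toy sketch in Python, not the BucketCache or CCSMap implementation; the class `BufferPool` and its methods are hypothetical names.

```python
class BufferPool:
    """Hypothetical fixed-size buffer pool: buffers are allocated once and reused."""

    def __init__(self, buffer_size: int, count: int) -> None:
        self._buffer_size = buffer_size
        self._free = [bytearray(buffer_size) for _ in range(count)]
        self.allocations = count  # number of buffers ever created

    def acquire(self) -> bytearray:
        if self._free:
            return self._free.pop()
        # Pool exhausted: fall back to a fresh allocation.
        self.allocations += 1
        return bytearray(self._buffer_size)

    def release(self, buf: bytearray) -> None:
        # Hand the buffer back for reuse instead of leaving it to the collector.
        self._free.append(buf)


pool = BufferPool(buffer_size=64 * 1024, count=4)
for _ in range(1000):
    buf = pool.acquire()
    buf[:5] = b"hello"  # pretend to fill the buffer with request data
    pool.release(buf)
print(pool.allocations)
```

Because every buffer is released before the next one is acquired, the 1000 iterations are served by the four buffers created up front, so the allocation count stays at 4.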
13 .Separation of Computing & Storage
Localized Deployment
– Low IO latency with Short-Circuit Read
– Unbalanced storage space, especially between clusters
– Difficult to raise both CPU and disk utilization, especially with many different scenarios
– Cluster scaling is slow because of DataNode decommissioning
14 .Separation of Computing & Storage
Shared-Storage Deployment
– Big shared storage, more balanced
– Compute nodes can scale independently
– Storage nodes can scale independently
– Auto-scaling becomes feasible
– Smart scheduling between clusters, based on load statistics
– Compute resources can be shared with other applications
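Once compute nodes scale independently of storage, sizing the compute tier from load statistics becomes a simple calculation. The sketch below shows one possible form of that calculation; the function name, the per-node capacity, and the 70% target utilization are all assumptions made for illustration and do not come from the talk.

```python
import math


def desired_compute_nodes(total_ops: float, ops_per_node: float,
                          target_utilization: float = 0.7,
                          min_nodes: int = 1) -> int:
    """Compute nodes needed to serve total_ops at the target utilization.

    Storage is shared, so only the compute tier is sized here.
    """
    if ops_per_node <= 0 or not 0 < target_utilization <= 1:
        raise ValueError("ops_per_node must be > 0 and target_utilization in (0, 1]")
    needed = math.ceil(total_ops / (ops_per_node * target_utilization))
    return max(min_nodes, needed)


# Hypothetical load: 1,000,000 ops/s, each node handling 50,000 ops/s at full load.
print(desired_compute_nodes(1_000_000, 50_000))
```

A scheduler could run this per cluster on recent load statistics and move compute nodes from clusters that need fewer nodes to clusters that need more.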
15 .Heterogeneous Cold-Hot Storage
• HBase is capable of holding data for its whole life cycle
• But in most cases, like monitor, trace, order, logistics
– Recently generated data is accessed often but occupies very little storage space
– Historical data is rarely visited but occupies a lot of storage space
• Common solution
– Cold storage system for historical data
– Hot storage system for recent data
– Move the data from the hot storage system to the cold storage system periodically
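The common solution described on this slide, a periodic job that moves aged data from the hot system to the cold one, can be sketched as follows. This is a minimal in-memory illustration, not how AliHB implements tiering; the class `TieredStore`, its methods, and the age threshold are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class TieredStore:
    """Hypothetical two-tier store; each entry is (write timestamp, value)."""
    hot: Dict[str, Tuple[float, bytes]] = field(default_factory=dict)
    cold: Dict[str, Tuple[float, bytes]] = field(default_factory=dict)

    def put(self, key: str, value: bytes, ts: float) -> None:
        self.hot[key] = (ts, value)
        self.cold.pop(key, None)  # a fresh write supersedes any archived copy

    def get(self, key: str) -> Optional[bytes]:
        # Check the hot tier first, then fall back to the cold tier.
        entry = self.hot.get(key) or self.cold.get(key)
        return entry[1] if entry else None

    def archive(self, now: float, max_hot_age: float) -> int:
        """Move entries older than max_hot_age from hot to cold; return the count."""
        old_keys = [k for k, (ts, _) in self.hot.items() if now - ts > max_hot_age]
        for k in old_keys:
            self.cold[k] = self.hot.pop(k)
        return len(old_keys)


store = TieredStore()
store.put("order-1", b"shipped", ts=0)
store.put("order-2", b"pending", ts=90)
moved = store.archive(now=100, max_hot_age=30)  # the periodic job
print(moved, sorted(store.hot), sorted(store.cold))
```

Reads go through a single `get`, so callers do not need to know which tier holds a row; only the archive job is aware of the age threshold.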
16 .Heterogeneous Cold-Hot Storage
• Easy To Use
• Auto Tiered
• Heterogeneous
• Read Optimization
17 .Diagnostic System
12000+ Nodes, 100+ Clusters, 6000+ Users
“Request Rush?” -> Monitor
“Big Region?” -> Web UI
“Full Disk?” -> df
“Bad Disk?” -> tsar, dmesg
……
HBase Diagnostic Center
1. The unified entrance for troubleshooting
2. Experience/Solution => Function of Diagnostic System
18 .Diagnostic System
1. One extra server for all
2. No Agent
3. Adding rule dynamically
4. Runtime information
5. Check all components
6. Only 10 seconds for a diagnosis
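Adding rules dynamically and evaluating them against runtime information could look roughly like the sketch below. It is an illustration of the rule-registry idea only, not the Diagnostic Center's design; the class `DiagnosticCenter`, the metric names, and the thresholds are hypothetical, while the rule names are borrowed from the rule list in this deck.

```python
from typing import Callable, Dict, List

Metrics = Dict[str, float]
Rule = Callable[[Metrics], bool]  # returns True when the problem is detected


class DiagnosticCenter:
    """Hypothetical rule registry evaluated against a snapshot of runtime metrics."""

    def __init__(self) -> None:
        self._rules: Dict[str, Rule] = {}

    def add_rule(self, name: str, rule: Rule) -> None:
        # Rules can be registered at any time, without restarting anything.
        self._rules[name] = rule

    def diagnose(self, metrics: Metrics) -> List[str]:
        # Return the names of all rules that fire for this snapshot.
        return [name for name, rule in self._rules.items() if rule(metrics)]


center = DiagnosticCenter()
center.add_rule("Insufficient disk space", lambda m: m.get("disk_used_ratio", 0.0) > 0.9)
center.add_rule("Too many files", lambda m: m.get("store_files", 0) > 1000)

snapshot = {"disk_used_ratio": 0.95, "store_files": 12, "replication_lag_s": 300}
print(center.diagnose(snapshot))

# A rule added later is picked up by the next diagnosis.
center.add_rule("Replication Delay", lambda m: m.get("replication_lag_s", 0) > 60)
print(center.diagnose(snapshot))
```

Keeping each rule as a small function over a metrics snapshot is one way to turn an operator's experience into a function of the diagnostic system, as the previous slide puts it.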
19 .Diagnostic System
Shared on Apsara HBase
50+ Rules, 80%+ Accuracy
Rule categories: HBase, ZK/HDFS, Hardware
Example rules: Compaction Stuck; ZK Unavailable; Insufficient disk space; Block Miss; Balance Abnormal; Slow Disk; NameNode Abnormal; Table Abnormal; Bad Disk; Full capacity of datanode; Region Offline; Too much TCP error; Inconsistent state between two namenodes; Replication Delay; Slow ping; Too many files; CPU hang; Too much Xceivers; High Meta Load; Load too high; Disk not mounted; Multi Assign; Port is unreachable; ……
20 .Migration & Backup
21 .Migration & Backup
Independent of HBase
• Almost no impact on service
• Easy to upgrade
• Supports multiple versions
• Supports non-HBase targets
Second-level RPO, minute-level RTO
22 .03 HBase Ecosystem & Multi-model DB & Cloud
23 .Popularity changes per DB category
24 .Ranking scores per category in percent
25 .Data size per day
26 .All in one
Key Value, Relational, Document, Graph, Time Series, Geospatial
Tabular NoSQL
27 .All in one
Key Value: HBase
Relational: Phoenix/AntsDB
Document: HBase
Graph: JanusGraph
Time Series: OpenTSDB
Geospatial: GeoMesa
All on top of HBase (Tabular NoSQL)
28 .Multi-model - Native Or Layer
(Diagram labels: HBase Ecosystem, DataStax, CosmosDB, Neo4j, InfluxDB, CockroachDB, PG; layers: Multi-model, KV\Index, Storage)
29 .HBase Meet Cloud – Benefits
Benefit areas: Cloud Native, New Hardware, Flexibility, Cost Savings (TCO)
New Hardware: RDMA, Flash, GPU, Non-volatile memory
Other items listed: End up paying for features; Fast Add/Remove Resource; Flexibility; Insight; self-driven; Fix bugs in time; Reduce human; Self-driven; ……