HBase Throughput Improvement in Practice

1. Lift the Ceiling of Throughputs
   Yu Li, Lijin Bin
   {jueding.ly, tianzhao.blj}@alibaba-inc.com

2. Agenda
   - What/Where/When
     - History of HBase in Alibaba Search
   - Why
     - Throughput means a lot
   - How
     - Lift the ceiling of read throughput
     - Lift the ceiling of write throughput
   - About the future

3. HBase in Alibaba Search
   - HBase has been the core storage in the Alibaba search system since 2010
   - History of versions used online
     - 2010~2014: 0.20.6 → 0.90.3 → 0.92.1 → 0.94.1 → 0.94.2 → 0.94.5
     - 2014~2015: 0.94 → 0.98.1 → 0.98.4 → 0.98.8 → 0.98.12
     - 2016: 0.98.12 → 1.1.2
   - Cluster scale and use case
     - Multiple clusters, the largest with more than 1,500 nodes
     - Co-located with Flink/YARN, serving over 40 million ops/s throughout the day
     - Main source/sink for the search and machine learning platforms

4. Throughput means a lot
   - Machine learning generates huge workloads
     - Both read and write, with no upper limit
     - Both IO and CPU bound
   - Throughput decides the speed of ML processing
     - Higher throughput means more iterations per unit of time
   - The speed of processing decides the accuracy of the decisions made
     - Recommendation quality
     - Fraud detection accuracy

5. Lift the ceiling of read throughput
   - NettyRpcServer (HBASE-17263)
     - Why Netty?
       - Enlightened by real-world suffering (HBASE-11297)
       - Better thread model and performance (see the sketch below)
     - Effect
       - Online RT under high pressure: 0.92 ms → 0.25 ms
       - Throughput almost doubled
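For context, here is a minimal, self-contained sketch of the Netty boss/worker event-loop model that a Netty-based RPC server builds on: a small boss group accepts connections while a fixed worker group multiplexes the IO of all connections on its event loops. This is illustrative only and is not taken from NettyRpcServer; the class name, port number, and thread counts are assumptions.

```java
// Illustrative sketch of the Netty event-loop thread model (not HBase's NettyRpcServer code).
import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.ChannelFuture;
import io.netty.channel.ChannelInitializer;
import io.netty.channel.EventLoopGroup;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioServerSocketChannel;

public class NettyThreadModelSketch {
  public static void main(String[] args) throws Exception {
    EventLoopGroup boss = new NioEventLoopGroup(1);    // accepts new connections
    EventLoopGroup workers = new NioEventLoopGroup(8); // handles IO for all accepted channels
    try {
      ServerBootstrap b = new ServerBootstrap()
          .group(boss, workers)
          .channel(NioServerSocketChannel.class)
          .childHandler(new ChannelInitializer<SocketChannel>() {
            @Override
            protected void initChannel(SocketChannel ch) {
              // A real RPC server would add request decoders and hand decoded calls
              // to a handler pool here; the pipeline is left empty in this sketch.
            }
          });
      ChannelFuture f = b.bind(16020).sync();          // port chosen arbitrarily for the sketch
      f.channel().closeFuture().sync();
    } finally {
      boss.shutdownGracefully();
      workers.shutdownGracefully();
    }
  }
}
```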

7. Lift the ceiling of read throughput (cont'd)
   - RowIndexDBE (HBASE-16213)
     - Why
       - Seeking to the target row during random reads is one of the main CPU consumers
       - All DBEs except Prefix Tree use sequential search
     - How
       - Add a row index to each HFileBlock so seeks can use binary search (sketch below)
     - Effect
       - Less CPU and higher throughput; for KeyValues smaller than 64 B, throughput increased by more than 10%
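To make the "row index for binary search" idea concrete, a minimal sketch follows. It assumes a hypothetical in-memory index of per-row offsets inside a block; the class and field names are illustrative and are not the actual RowIndex encoder/seeker classes (Arrays.compareUnsigned requires Java 9+).

```java
// Sketch of the RowIndexDBE idea: binary-search an index of row offsets inside a block
// instead of scanning cells sequentially from the block start. Illustrative only.
import java.util.Arrays;

public class RowIndexedBlockSketch {
  private final byte[][] rowKeys; // first row key of each indexed entry, sorted
  private final int[] offsets;    // offset of that row's first cell within the block

  public RowIndexedBlockSketch(byte[][] rowKeys, int[] offsets) {
    this.rowKeys = rowKeys;
    this.offsets = offsets;
  }

  /** Returns the block offset at which to start scanning for the given row. */
  public int seekOffset(byte[] row) {
    int lo = 0, hi = rowKeys.length - 1, ans = 0; // default: scan from the block start
    while (lo <= hi) {
      int mid = (lo + hi) >>> 1;
      // Lexicographic byte comparison, matching HBase row ordering.
      int cmp = Arrays.compareUnsigned(rowKeys[mid], row);
      if (cmp <= 0) {
        ans = offsets[mid]; // candidate: last indexed row <= target so far
        lo = mid + 1;       // keep looking to the right for a closer candidate
      } else {
        hi = mid - 1;
      }
    }
    return ans; // caller scans forward from here instead of from the block start
  }
}
```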

8. Lift the ceiling of read throughput (cont'd)
   - End-to-end read path offheap (see the sketch after this slide)
     - Why
       - Advanced disk IO capability causes quicker cache eviction
       - Suffering from GC caused by on-heap copies
     - How
       - Backport the end-to-end read-path offheap work to branch-1 (HBASE-17138)
       - For more details, see Anoop/Ram's session
     - Effect
       - Throughput increased by 30%
       - Much more stable, fewer spikes
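As a rough illustration of why avoiding on-heap copies helps, the hypothetical sketch below serves a slice of a cached block straight from a direct (off-heap) ByteBuffer to a channel. It shows the underlying NIO idea only and is not HBase's actual read path; the class name and buffer size are assumptions.

```java
// Sketch of the off-heap serving idea: cached block bytes stay in a direct ByteBuffer
// and are written to the client channel without ever materializing a byte[] on the heap.
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;

public class OffheapServeSketch {
  // Block bytes live outside the Java heap; only the small ByteBuffer object is on-heap.
  private final ByteBuffer offheapBlock = ByteBuffer.allocateDirect(64 * 1024);

  /** Writes [offset, offset + length) of the cached block straight to the client channel. */
  public void serve(WritableByteChannel clientChannel, int offset, int length) throws IOException {
    ByteBuffer slice = offheapBlock.duplicate(); // independent position/limit, shared memory
    slice.position(offset).limit(offset + length);
    while (slice.hasRemaining()) {
      clientChannel.write(slice);                // no intermediate on-heap copy, nothing for GC to trace
    }
  }
}
```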

9. Lift the ceiling of read throughput (cont'd)
   - End-to-end read path offheap: before/after comparison charts

10. Lift the ceiling of write throughput
   - MVCC pre-assign (HBASE-16698, HBASE-17509/17471)
     - Why
       - Issue located from real-world suffering: no more active handlers
       - MVCC is assigned after the WAL append
       - WAL append is designed to be sequential at the region-server level, so throughput is limited
     - How
       - Assign the mvcc before the WAL append while still guaranteeing the append order (sketch below)
         - Originally designed to use a lock inside FSHLog (HBASE-16698)
         - Improved by generating the sequence id inside the existing MVCC lock (HBASE-17471)
     - Effect
       - SYNC_WAL throughput improved by 30%; ASYNC_WAL even more (>70%)
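A minimal sketch of the pre-assign idea, under the assumption that one shared lock orders both sequence-id generation and queuing for the WAL append; the class and method names are hypothetical and this is not the HBASE-17471 patch.

```java
// Sketch: assign the sequence id before the WAL append, inside the same critical section
// that enqueues the edit, so queue (append) order can never diverge from sequence-id order.
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class MvccPreAssignSketch {
  private long nextSeqId = 1;
  private final Object mvccLock = new Object();
  // Consumed in order by the single WAL append thread.
  private final BlockingQueue<Long> walAppendQueue = new LinkedBlockingQueue<>();

  /** Called by a write handler before the WAL append. */
  public long beginWrite() {
    synchronized (mvccLock) {
      long seqId = nextSeqId++;   // sequence id assigned up front, not after the append
      walAppendQueue.add(seqId);  // enqueued under the same lock: queue order == seq id order
      return seqId;
    }
  }

  /** Called by the WAL thread: takes edits strictly in sequence-id order. */
  public long nextToAppend() throws InterruptedException {
    return walAppendQueue.take();
  }
}
```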

11. Lift the ceiling of write throughput (cont'd)
   - Refine the write path (experimenting)
     - Why
       - Far from fully using the IO capacity of new hardware such as PCIe-SSD
       - WAL sync is IO-bound, while RPC handling is CPU-bound
         - Write handlers should be non-blocking: do not wait for the sync
         - Respond asynchronously
       - WAL append is sequential, while region puts are parallel
         - Unnecessary context switches
       - WAL append is IO-bound, while MemStore insertion is CPU-bound
         - Possible to parallelize?

12. Lift the ceiling of write throughput (cont'd)
   - Refine the write path (experimenting)
     - How (see the pipeline sketch below)
       - Break the write path into 3 stages
         - Pre-append, sync, post-sync
         - Buffer/queue between stages
       - Handlers only handle the pre-append stage; responses are sent in the post-sync stage
       - Bind regions to specific handlers
         - Reduce unnecessary context switches
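A compact sketch of the three-stage split described above, using plain BlockingQueues between stages; the class names, batching policy, and callback shape are assumptions, not the experimental patch itself.

```java
// Sketch of a 3-stage write pipeline: handler threads only do pre-append work and enqueue,
// a single sync thread batches WAL syncs, and a responder thread acknowledges clients,
// so no handler ever blocks on disk.
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class StagedWritePathSketch {
  static final class WriteOp {
    final Runnable respond;            // callback that sends the RPC response
    WriteOp(Runnable respond) { this.respond = respond; }
  }

  private final BlockingQueue<WriteOp> toSync = new LinkedBlockingQueue<>();
  private final BlockingQueue<WriteOp> toRespond = new LinkedBlockingQueue<>();

  /** Stage 1 (handler thread): build the WAL edit, insert into MemStore, enqueue, return immediately. */
  public void preAppend(WriteOp op) {
    toSync.add(op);
  }

  /** Stage 2 (single sync thread): drain a batch and issue one WAL sync for all of it. */
  public void syncLoop() throws InterruptedException {
    List<WriteOp> batch = new ArrayList<>();
    while (true) {
      batch.add(toSync.take());        // block until at least one op is pending
      toSync.drainTo(batch);           // group whatever else is queued into the same sync
      // wal.sync() would go here in a real implementation
      toRespond.addAll(batch);
      batch.clear();
    }
  }

  /** Stage 3 (responder thread): acknowledge clients once their edits are durable. */
  public void respondLoop() throws InterruptedException {
    while (true) {
      toRespond.take().respond.run();
    }
  }
}
```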

13. Lift the ceiling of write throughput (cont'd)
   - Refine the write path (experimenting)
     - Effect (lab data)
       - Throughput tripled: 140K → 420K with PCIe-SSD
     - TODO
       - PCIe-SSD IO utilization currently only reaches 20%, so there is much more room to improve
       - Integration with write-path offheap: more to expect
       - Upstream the work after it is verified online

14. About the Future
   - HBase is still a kid: only 10 years old
     - More ceilings to break
       - Improving, but still a long way to go
       - Far from fully utilizing the hardware capability, whether CPU or IO
     - More scenarios to try
       - Embedded mode (HBASE-17743)
     - More to expect
       - 2.0 coming, 3.0 in the plan
       - Hopefully more community involvement from Asia
         - More upstream, less private

15. Q & A
    Thank you!