HBase Optimization and Practice at Alibaba

1. HBase: Recent Improvement And Practice At Alibaba
Allan Yang (Alibaba / HBase Committer), Han Yang (Alibaba)
Confidential & Proprietary

2. Agenda
• HBase at Alibaba
– Typical scenarios
• Running Architecture
– Range data copy
– Dual Service
• SQL
– Performance and feature improvements

3. HBase At Alibaba
• Workloads: Log, Chat, Monitor, Trade, IoT, Logistics, Search, …
• 100 million+ TPS
• 10,000+ nodes
• PB+ data

4. Double 11 Festival
[Diagram: data sources (Trade, Order, Log, …), message middleware, real-time computing, HBase, and a data service for low-latency queries]
• 1 GB/s+ throughput
• 1 million+ TPS

5. Risk Management of Ant Financial
[Diagram: events (People, Action, ENV, Method, Time) reach HBase through real-time import; the Risk Console issues real-time queries; incremental export feeds offline computing, whose daily results are imported back]
• 10 TB+ data per day
• Expire data based on:
– TTL
– Version
– Low value column

6. Deployment Architecture
[Diagram: the client reaches two HBase clusters through Dual Service; the clusters are linked by async replication and range data copy; one cluster's HDFS keeps 2 replicas, the other keeps 1 replica]

7. Range Data Copy
[Diagram: Master, ZK, RS, HDFS, Cluster2 RS; labeled steps: "Split copy job to sub-tasks", "Grab", "Write", "Bulkload"]
• A feature provided inside HBase, fully distributed, no MR
• On the fly, no need to stop service
• Recoverable from all kinds of errors and disasters

8. Range Data Copy Scenarios
• Data center relocation (IDC1 to IDC2)
• Historical data movement
• Data recovery
Other solutions
1. CopyTable
• Too slow (scans the table)
• Needs an MR role
2. Snapshot (Backup in HBase 2.0)
• Needs the table to be disabled on restore
• No control of data range
• Not designed for data migration

9. Dual Service – Why?
• Sources of glitches: region split, balance, RS down, …, GC, network, HDFS
Possible solution: Region Replicas (HBASE-10070)
• Need internal replication
• Need triple disk space
• Replica region is not writable

10. Dual Service
• Take advantage of slave cluster
• No extra resources needed
[Diagram: a request goes to the master cluster first; after the glitch timeout it is also sent to the slave cluster; an async processor selects the first return and hands it to the callback as the response. The master replicates to the slave.]
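The request flow on this slide can be sketched with plain `CompletableFuture`s. This is only an illustration under stated assumptions: `DualServiceSketch`, `dualGet`, and the two suppliers are hypothetical names, not the Alibaba client API, and the real client presumably handles retries and cancellation that are left out here.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical sketch: ask the master first, ask the slave after a glitch timeout. */
class DualServiceSketch {
    // Daemon timer thread so the sketch never keeps the JVM alive.
    private static final ScheduledExecutorService TIMER =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "glitch-timer");
            t.setDaemon(true);
            return t;
        });

    static <T> CompletableFuture<T> dualGet(Supplier<CompletableFuture<T>> master,
                                            Supplier<CompletableFuture<T>> slave,
                                            long glitchTimeoutMs) {
        CompletableFuture<T> result = new CompletableFuture<>();
        CompletableFuture<T> m = master.get();
        // A successful master response wins immediately.
        m.whenComplete((v, e) -> { if (e == null) result.complete(v); });
        // If nothing has returned after the glitch timeout, also ask the slave.
        TIMER.schedule(() -> {
            if (result.isDone()) return;
            CompletableFuture<T> s = slave.get();
            s.whenComplete((v, e) -> {
                if (e == null) {
                    result.complete(v);           // first successful return wins
                } else {
                    // Slave failed: surface the master's error if it fails too.
                    m.whenComplete((mv, me) -> {
                        if (me != null) result.completeExceptionally(me);
                    });
                }
            });
        }, glitchTimeoutMs, TimeUnit.MILLISECONDS);
        return result;
    }
}
```

Calling `dualGet(masterCall, slaveCall, 40)` with a master call that stalls returns the slave's answer shortly after the 40 ms mark, while a fast master answer is returned without the slave ever being contacted.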

11. Dual Service - Benchmark
• Call a request with RT > 50 ms a "spike"
• Set Glitch Timeout = 40 ms (call the slave if the request is still running after 40 ms)
• Spike rate
– Before Dual Service: requests with RT > 50 ms / total requests
– After Dual Service: (proportion of requests > 40 ms in master) * (proportion of requests > 10 ms in slave)
Spike rate without Dual Service: 0.047095%
Spike rate with Dual Service: 0.001714%
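Written as a formula, the "after" estimate multiplies two tail probabilities, which implicitly assumes that master and slave latencies are independent (an assumption of this note, not a claim from the slide). The symbols $T_{\text{master}}$ and $T_{\text{slave}}$ are introduced here for illustration only. The second expression is the ratio of the two reported spike rates.

```latex
\[
P_{\text{spike}}^{\text{dual}} \approx
P(T_{\text{master}} > 40\,\text{ms}) \times P(T_{\text{slave}} > 10\,\text{ms}),
\qquad
\frac{0.047095\%}{0.001714\%} \approx 27.5
\]
```

In other words, the reported spike rate drops by a factor of roughly 27.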

12. Why SQL?
Easy and quick to use HBase
1. Schema / data type: rich data typing, rowkey construction
2. Semantics: basic/complex queries
3. Optimize: query with index, optimize transparently
4. Target: existing tools

13. Phoenix
[Diagram: the Phoenix JDBC Driver sits on the HBase Client and talks to the ZooKeeper Service, the HBase Master Service, and three RegionServers, each running a Phoenix Coprocessor; HDFS is underneath]

14. Performance against HBase API
• Single row select/Scan
• Single row upsert/Put
[Chart: latency in ms]
Read: Phoenix 2.3, HBase API 0.7
Write: Phoenix 2.6, HBase API 1.5

15. Why is UPSERT much slower?
[Diagram: UPSERT statement → UpsertCompiler → MutationState → Table#batch(mutations) → data table region; the "Update meta cache" step hits the RS hosting the meta region and costs around 1 ms]

16. What is Meta Cache?
• Meta data of a Phoenix table in each client
• Schema
– Columns, types, properties
• Indexes
– Add new index
– Drop an index

17. Meta Cache Update Policy
• Initialize on first use
• Update meta periodically (PHOENIX-2520)
• Update meta on errors
• Version for each meta update
– Request with meta version
– Server always has the latest version
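The versioned policy in the last bullet can be sketched as follows. This is an in-memory illustration only: `MetaCacheSketch`, `Server`, `Client`, and `StaleMetaException` are hypothetical names, not Phoenix or alihb-sql classes, and the periodic refresh is left out. The point is that a request carrying the cached version needs no separate meta lookup unless the server reports that the version is stale.

```java
import java.util.HashMap;
import java.util.Map;

class MetaCacheSketch {
    /** Immutable snapshot of table metadata with an increasing version. */
    static final class TableMeta {
        final long version;
        final String schema;
        TableMeta(long version, String schema) { this.version = version; this.schema = schema; }
    }

    static final class StaleMetaException extends RuntimeException {}

    /** Server side: always holds the latest version of each table's meta. */
    static final class Server {
        private final Map<String, TableMeta> latest = new HashMap<>();
        int metaLookups = 0;   // counts round trips spent fetching meta

        void alter(String table, String schema) {
            TableMeta old = latest.get(table);
            latest.put(table, new TableMeta(old == null ? 1 : old.version + 1, schema));
        }

        TableMeta fetchMeta(String table) { metaLookups++; return latest.get(table); }

        String upsert(String table, long clientVersion, String row) {
            // The request carries the client's meta version; reject it if stale.
            if (clientVersion != latest.get(table).version) throw new StaleMetaException();
            return "ok:" + row;
        }
    }

    /** Client side: fetch meta on first use, refresh only when the server rejects. */
    static final class Client {
        private final Server server;
        private final Map<String, TableMeta> cache = new HashMap<>();
        Client(Server server) { this.server = server; }

        String upsert(String table, String row) {
            TableMeta meta = cache.computeIfAbsent(table, server::fetchMeta);
            try {
                return server.upsert(table, meta.version, row);
            } catch (StaleMetaException e) {
                meta = server.fetchMeta(table);   // update meta on error, then retry once
                cache.put(table, meta);
                return server.upsert(table, meta.version, row);
            }
        }
    }
}
```

With this shape, repeated upserts against an unchanged table cost one meta lookup in total, and a schema change costs one extra lookup on the first request that follows it.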

18. Lift UPSERT performance
Reduce 38% latency
[Chart: write latency in ms] Phoenix 2.6, alihb-sql 1.6, HBase API 1.5

19. Lift SELECT performance
[Diagram: JDBC Driver → QueryCompiler → QueryPlan → Parallel Scan → Spooling]
• -1 ms: update meta cache once
• -0.5 ms: use small scan
• setCaching
• Parallel Scan: use single scan
• Spooling: no prefetch

20. Parallel or Single?
• Use small scan properly
• Use parallel scan only when necessary
Scenario: Phoenix's plan / alihb-sql's plan
– full table scan: parallel big scan / single big scan
– single row select: parallel big scan / single small scan
– single region range scan: single big scan / single small scan
– cross region range scan: parallel big scan / single big scan
– aggregation: parallel big scan / parallel big scan
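The plan table on this slide can be encoded directly as a lookup. The sketch below is only a restatement of the table in code; `ScanPlanSketch` and its enum names are hypothetical and do not correspond to Phoenix or alihb-sql planner classes.

```java
class ScanPlanSketch {
    enum Scenario {
        FULL_TABLE_SCAN, SINGLE_ROW_SELECT, SINGLE_REGION_RANGE_SCAN,
        CROSS_REGION_RANGE_SCAN, AGGREGATION
    }

    enum Plan { PARALLEL_BIG_SCAN, SINGLE_BIG_SCAN, SINGLE_SMALL_SCAN }

    /** Phoenix column of the table: parallel everywhere except a single-region range scan. */
    static Plan phoenixPlan(Scenario s) {
        return s == Scenario.SINGLE_REGION_RANGE_SCAN
            ? Plan.SINGLE_BIG_SCAN
            : Plan.PARALLEL_BIG_SCAN;
    }

    /** alihb-sql column of the table: parallel only for aggregation. */
    static Plan alihbSqlPlan(Scenario s) {
        switch (s) {
            case SINGLE_ROW_SELECT:
            case SINGLE_REGION_RANGE_SCAN:
                return Plan.SINGLE_SMALL_SCAN;
            case AGGREGATION:
                return Plan.PARALLEL_BIG_SCAN;
            default:
                return Plan.SINGLE_BIG_SCAN;   // full table and cross-region range scans
        }
    }

    public static void main(String[] args) {
        for (Scenario s : Scenario.values()) {
            System.out.println(s + ": " + phoenixPlan(s) + " / " + alihbSqlPlan(s));
        }
    }
}
```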

21. Lift SELECT performance
select * from tt where a = 10;
Reduce 65% latency
[Chart: read latency in ms] Phoenix 2.3, alihb-sql 0.8, HBase API 0.7

22. Improved performance
[Chart: latency in ms]
Read: Phoenix 2.3, alihb-sql 0.8 (65% reduction), HBase API 0.7
Write: Phoenix 2.6, alihb-sql 1.6 (38% reduction), HBase API 1.5
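The two percentages follow from the chart values, reading them as 2.3 ms and 0.8 ms for reads and 2.6 ms and 1.6 ms for writes (Phoenix and alihb-sql respectively):

```latex
\[
\text{Read: } \frac{2.3 - 0.8}{2.3} \approx 65\%, \qquad
\text{Write: } \frac{2.6 - 1.6}{2.6} \approx 38\%
\]
```

After the change, alihb-sql sits about 0.1 ms above the raw HBase API figure for both reads (0.8 vs 0.7) and writes (1.6 vs 1.5).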

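23. Secondary Indexing
[Diagram: hosted on RegionServer1 and RegionServer2]
Data Table (pk, col):
– Region1: (1, a), (2, d), (3, a)
– Region2: (4, f), (5, c)
Local Index (col, pk), kept per region:
– Region1: (a, 1), (a, 3), (d, 2)
– Region2: (c, 5), (f, 4)
Global Index (col, pk):
– (a, 1), (a, 3), (c, 5), (d, 2), (f, 4)
Example queries:
select * from tt where (pk between 1 and 3) and col = 3;
select * from tt where col = 3;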
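The difference between the two index layouts can be reproduced from the slide's five rows. This is a minimal in-memory sketch: `SecondaryIndexSketch` and its helpers are hypothetical names, and the region boundaries (pk 1 to 3 and pk 4 to 5) are read off the diagram. A global index sorts all rows by column value, while a local index sorts only the rows of its own region.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

class SecondaryIndexSketch {
    /** The data table from the slide: pk -> col. */
    static NavigableMap<Integer, String> dataTable() {
        NavigableMap<Integer, String> rows = new TreeMap<>();
        rows.put(1, "a"); rows.put(2, "d"); rows.put(3, "a");
        rows.put(4, "f"); rows.put(5, "c");
        return rows;
    }

    /** Builds index entries "col pk", ordered by col and then by pk. */
    static List<String> buildIndex(Map<Integer, String> rows) {
        List<Map.Entry<Integer, String>> entries = new ArrayList<>(rows.entrySet());
        entries.sort((x, y) -> {
            int byCol = x.getValue().compareTo(y.getValue());
            return byCol != 0 ? byCol : x.getKey().compareTo(y.getKey());
        });
        List<String> index = new ArrayList<>();
        for (Map.Entry<Integer, String> e : entries) {
            index.add(e.getValue() + " " + e.getKey());
        }
        return index;
    }

    public static void main(String[] args) {
        NavigableMap<Integer, String> rows = dataTable();
        // Global index: one sorted structure over the whole table.
        System.out.println("global : " + buildIndex(rows));
        // Local index: one sorted structure per region of the data table.
        System.out.println("region1: " + buildIndex(rows.subMap(1, true, 3, true)));
        System.out.println("region2: " + buildIndex(rows.subMap(4, true, 5, true)));
    }
}
```

The three lists match the Local Index and Global Index columns of the slide, which is why a query restricted to one region's key range can be served from that region's local index alone.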

24. How Does Global Index Work?
DataTable RS (Write RPC → Handler):
1. preBatch: read data table, build index updates
2. syncLog: data table edits, index edits
3. postBatch: commit index updates
IndexTable RS (Write RPC → Handler):
1. syncLog: index edits
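The three stages on the data table side can be sketched with in-memory structures. This is an illustration under assumptions, not Phoenix's coprocessor code: `GlobalIndexWriteSketch` and its fields are hypothetical, the WAL is a plain list, and the RPC to the index table's RegionServer is replaced by a direct update.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

class GlobalIndexWriteSketch {
    final Map<Integer, String> dataTable = new TreeMap<>();   // pk -> col
    final TreeSet<String> indexTable = new TreeSet<>();       // entries "col pk"
    final List<String> wal = new ArrayList<>();               // stand-in for syncLog

    void put(int pk, String col) {
        // preBatch: read the current row and build the index updates.
        String old = dataTable.get(pk);
        String indexDelete = (old == null) ? null : old + " " + pk;
        String indexPut = col + " " + pk;

        // syncLog: the data table edit and the index edits are logged together.
        wal.add("data put " + pk + "=" + col);
        if (indexDelete != null) wal.add("index delete " + indexDelete);
        wal.add("index put " + indexPut);
        dataTable.put(pk, col);

        // postBatch: commit the index updates (a remote write in the real system).
        if (indexDelete != null) indexTable.remove(indexDelete);
        indexTable.add(indexPut);
    }
}
```

Reading the old row in preBatch is what allows the stale index entry to be removed when a column value changes; without it the index would keep pointing at the old value.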

25. Consistency: Index Updates Failure
DataTable RS (Write RPC → Handler):
1. preBatch: read data table, build index updates
2. syncLog: data table edits, index edits
3. postBatch: commit index updates
IndexTable Region/RS: Not Available

26. Solution I: Disable Index writing
DataTable RS (Write RPC → Handler):
1. preBatch: read data table, build index updates
2. syncLog: data table edits, index edits
3. postBatch: commit index updates
IndexTable Region/RS: Not Available

27. Solution I: Disable Index writing
• Update meta (may cause chain collapse)
– Set index state to DISABLE
– Set index disable timestamps
• Queries degenerate to full table scans over the data table

28. Solution I: What If Update Meta Failed?
[Diagram: three Write RPC handlers; labeled steps: preBatch, syncLog, postBatch, Update Index, Update Meta, Abort]

29. Solution II: Disable Data Table Writing
DataTable RS (Write RPC → Handler):
1. preBatch: read data table, build index updates
2. syncLog: data table edits, index edits
3. postBatch: commit index updates, raise exception
IndexTable Region/RS: Not Available
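The two failure policies, Solution I and Solution II, can be contrasted in one small model. Everything here is a hypothetical sketch: `IndexFailurePolicySketch`, its `Policy` enum, and the `indexTableAvailable` switch are illustration names. As a simplification, a rejected write under Solution II is not applied to the data table at all; the slide raises the exception at the postBatch stage, and how already-logged edits are handled is outside this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

class IndexFailurePolicySketch {
    enum Policy { DISABLE_INDEX, BLOCK_DATA_WRITES }   // Solution I, Solution II
    enum IndexState { ACTIVE, DISABLE }

    final Policy policy;
    final Map<Integer, String> dataTable = new TreeMap<>();            // pk -> col
    final Map<String, TreeSet<Integer>> indexTable = new TreeMap<>();  // col -> pks
    IndexState indexState = IndexState.ACTIVE;
    long indexDisableTimestamp = 0L;
    boolean indexTableAvailable = true;   // flipped by the caller to simulate an outage

    IndexFailurePolicySketch(Policy policy) { this.policy = policy; }

    /** Insert-only write; updates of existing rows are left out to keep this short. */
    void put(int pk, String col, long now) {
        if (indexState == IndexState.ACTIVE && !indexTableAvailable) {
            if (policy == Policy.BLOCK_DATA_WRITES) {
                // Solution II: fail the write so data and index cannot diverge.
                throw new IllegalStateException("index table not available, write rejected");
            }
            // Solution I: mark the index disabled and remember when.
            indexState = IndexState.DISABLE;
            indexDisableTimestamp = now;
        }
        dataTable.put(pk, col);
        if (indexState == IndexState.ACTIVE) {
            indexTable.computeIfAbsent(col, k -> new TreeSet<>()).add(pk);
        }
    }

    /** Uses the index while it is ACTIVE, otherwise falls back to a full table scan. */
    List<Integer> selectByCol(String col) {
        if (indexState == IndexState.ACTIVE) {
            return new ArrayList<>(indexTable.getOrDefault(col, new TreeSet<>()));
        }
        List<Integer> pks = new ArrayList<>();
        for (Map.Entry<Integer, String> e : dataTable.entrySet()) {
            if (e.getValue().equals(col)) pks.add(e.getKey());
        }
        return pks;
    }
}
```

The trade-off shows up in the two branches: Solution I keeps writes available at the cost of slower queries until the index is rebuilt, while Solution II keeps queries fast at the cost of rejecting writes during the outage.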

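To give HBase practitioners and enthusiasts a community where they can freely exchange HBase-related technology, HBase engineers from Alibaba, Xiaomi, Huawei, NetEase, JD.com, Didi, Zhihu, and other companies jointly initiated the China HBase Technology Community.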