BDS: A data synchronization platform for HBase

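This talk was given by Xiong Jianan (熊嘉男), who leads the data pipeline work at Ali-HBase. It covers the design of cross-cluster HBase data migration in the cloud. For community HBase users, the best current solution for cross-cluster migration is to combine snapshot and replication, which handle the full data and the incremental data respectively.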

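Alibaba's BDS follows a similar idea: multiple workers copy HFiles concurrently to migrate the full data. Note that this process does not depend on a Yarn cluster, and BDS can control the migration rate of the whole process by dynamically adjusting the workers. During migration it also takes the locality of the target cluster into account as far as possible, which makes it a very friendly solution for cloud users.
For the incremental data produced while the full data is being migrated, BDS scans the HLog files directly and writes the incremental HLog entries to the peer cluster. The whole process accesses HDFS directly and is decoupled from the source HBase cluster.
For cloud users, this approach can be used for both data migration and data backup. Packaging this capability as a standalone system makes for a genuinely user-friendly experience.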


1.THE COMMUNITY EVENT FOR APACHE HBASE™

2.BDS: A data synchronization platform for HBase. 熊嘉男 (侧田), Ali-HBase data pipeline lead

3.Requirement

4.
• Does HBase support cross-version migration without downtime?
• Does HBase support data backup to OSS or other storage?
• Does HBase support replicating incremental data to MQ, ES, Solr?
• Can incremental data be replicated from RDS to HBase?
• Can HBase data be archived to a Spark cluster for offline analysis?
• HBase High Availability
• ……

5.Challenges

6. HBase clusters Migration
Migration Step
• Table Structure transformation
• Real-time data replication
  • Client double write
  • HBase Replication
• Full data migration
  • DataX
  • CopyTable
  • Create Snapshot & Export Snapshot
• Data consistency verification
Defect
• Cross-Version Migration compatibility issues
• Impact on Business
• Lack of integrated solutions

7. Heterogeneous Data Transmission
• Heterogeneous full data migration
  • DataX
  • Sqoop
• Heterogeneous Real-time Data Replication
• HBase Real-time Data export
  • Custom Replication Endpoint
  • Custom Replication Sink

8.BDS

9. High-Level Architecture
• Master & Slave
• Stateless Slave
• Plug-in mode
• Higher scalability and better performance

10.Technical Detail

11. HBase full data migration [diagram]

12. HBase full data migration
• Avoid the impact on business
  • Only access HDFS
  • Dynamic migration rate
  • Decoupled from HBase
• One-click migration
  • Create table automatically
  • Perceive changes in region
  • Perceive HFiles compaction
• Efficient
  • 100MB/s (single node)
  • Higher scalability
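The "dynamic migration rate" point can be illustrated with a small throttle whose rate is changed while a copy is running. This is only a sketch under stated assumptions: the slides do not describe how BDS limits its copy rate, and the `Throttle` class, its methods, and the injectable `clock` and `sleep` parameters are hypothetical names chosen for illustration.

```python
import time


class Throttle:
    """Pace a byte stream to a target rate that can be changed at runtime.

    Hypothetical helper for illustration; not part of BDS or HBase.
    """

    def __init__(self, bytes_per_sec, clock=time.monotonic, sleep=time.sleep):
        self.rate = float(bytes_per_sec)
        self._clock = clock
        self._sleep = sleep
        # Earliest time at which the next chunk may be sent.
        self._next_free = clock()

    def set_rate(self, bytes_per_sec):
        """Change the target rate; applies to chunks acquired from now on."""
        self.rate = float(bytes_per_sec)

    def acquire(self, nbytes):
        """Wait until a chunk of nbytes may be sent; return the seconds waited."""
        now = self._clock()
        wait = max(0.0, self._next_free - now)
        if wait > 0:
            self._sleep(wait)
        # Reserve the time slot this chunk occupies at the current rate.
        start = max(now, self._next_free)
        self._next_free = start + nbytes / self.rate
        return wait
```

A copy loop would call `acquire(len(chunk))` before writing each chunk, and an operator-facing control could call `set_rate` to slow the migration down during business peaks.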

13. Data localization rate
• Data migration takes the issue of data localization rates into account
• Avoid low localization rate after data migration
[Diagram: a RegionServer hosting a Region reads HFiles on DataNode1 and DataNode2; local read vs. remote read]

14. File split
• Migration will split HFiles according to the partitions of the original and target tables
• Increase the speed of bulkload
[Diagram: HFile1 to HFile4 are loaded into Region1 and Region2; HFile2 is split into HFile2-1 and HFile2-2]
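The file split step can be sketched as grouping the sorted row keys of one HFile by the region boundaries of the target table, so that each resulting piece falls entirely inside one region. This is a simplified illustration, not the BDS implementation: a real HFile holds cells rather than bare row keys, and the function name `split_by_regions` and its arguments are assumptions.

```python
from bisect import bisect_right


def split_by_regions(sorted_rows, split_keys):
    """Group the sorted row keys of one HFile by target region.

    split_keys holds the sorted start keys of the target regions, excluding
    the empty start key of the first region, so region i covers the range
    [split_keys[i - 1], split_keys[i]).
    Returns a dict mapping region index -> row keys that belong to it.
    """
    pieces = {}
    for row in sorted_rows:
        # Number of split keys <= row equals the index of the owning region.
        region = bisect_right(split_keys, row)
        pieces.setdefault(region, []).append(row)
    return pieces
```

With a single split key `b"m"`, a file holding rows `b"k"`, `b"m"`, `b"z"` yields two pieces, one per region, while a file holding only `b"a"` and `b"c"` stays whole.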

15. HBase Real-time Replication

16. Data pipeline [diagram]
• Using RingBuffer as a queue
• AckQueue maintains offset
• Write throughput supports dynamic configuration
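One common way for an ack queue to maintain an offset is to commit only the highest offset below which every entry has been acknowledged, so that a restart never skips an unwritten entry. The sketch below shows that technique under stated assumptions: the slides do not give the internals of the BDS AckQueue, the RingBuffer itself is not modeled, and the `track` and `ack` method names are hypothetical.

```python
from collections import OrderedDict


class AckQueue:
    """Track in-flight log entries and the offset that is safe to persist.

    Hypothetical sketch; entries are identified by their end offset in the log.
    """

    def __init__(self, start_offset=0):
        # Everything before this offset has been written to the target.
        self.committed = start_offset
        # end_offset -> acked flag, kept in the order entries were read.
        self._inflight = OrderedDict()

    def track(self, end_offset):
        """Register an entry that was read from the log and handed to a writer."""
        self._inflight[end_offset] = False

    def ack(self, end_offset):
        """Mark an entry as written; return the new committed offset."""
        if end_offset not in self._inflight:
            raise KeyError(end_offset)
        self._inflight[end_offset] = True
        # Advance over the leading run of acknowledged entries only.
        while self._inflight:
            first_end, done = next(iter(self._inflight.items()))
            if not done:
                break
            self._inflight.popitem(last=False)
            self.committed = first_end
        return self.committed
```

Acknowledgements may arrive out of order when several writers run concurrently; the committed offset moves only when the oldest outstanding entry is acknowledged.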

17. Impact on business [diagram]
HBase Replication
• Read and write affect data replication
BDS
• Decoupled from HBase
• Only access HDFS
• Data Replication is not affected by HBase crash

18. Hotspot [diagram]
HBase Replication
• Hotspot
BDS
• Round robin scheduling
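Round robin scheduling can be sketched as handing log files to workers in turn, so that a RegionServer producing many logs does not overload a single replication worker. The slide states only that BDS uses round robin scheduling; the function name, the log paths, and the worker names below are hypothetical.

```python
from itertools import cycle


def assign_round_robin(log_files, workers):
    """Assign log files to workers in turn and return worker -> list of logs."""
    if not workers:
        raise ValueError("at least one worker is required")
    assignment = {worker: [] for worker in workers}
    # cycle() repeats the worker list for as long as there are log files.
    for log, worker in zip(log_files, cycle(workers)):
        assignment[worker].append(log)
    return assignment
```

With logs `rs1/wal.1`, `rs1/wal.2`, `rs1/wal.3`, `rs2/wal.1` and workers `w1`, `w2`, the three logs of the busy server `rs1` end up spread over both workers.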

19. Replication backlog
BDS
• Add slave nodes
• Slave throughput supports dynamic configuration
[Diagram labels: add Worker nodes to increase the number of logs processed concurrently; increase AsyncWriter concurrency]

20. Operation and maintenance
BDS
• Easy to expand
• Easy to upgrade
• Monitor
• Alarm mechanism
HBase Replication
• Bug fix
• No alarm
• Configuration modification and system upgrade requires RS to restart

21.BDS in Ali-Cloud

22.Clusters Migration

23.High Availability

24.Data Backup

25.Archive data to Spark

26.RDS

27.About me

28.Thanks!