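Big data backup is critical for many applications. An engineer from Alibaba shares how AliHB performs real-time cold data backup and analyzes in depth the key problems encountered along the way.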


1. AliHB Real-Time Cold data Backup
孟庆义 (mengqingyi)

2. Contents
01 HBase Backup State
02 Alibaba’s requirements on Backup
03 AliHB Real-Time Cold data Backup
04 Future works

3. HBase Backup State

| Solution | Against hardware failure | Against user/application error | RPO | RTO |
|---|---|---|---|---|
| Snapshot | NO | YES | N/A | N/A |
| Replication | YES | NO | seconds | seconds |
| HBase Backup Restore | YES | YES | minutes | Increases with data size |
| AliHB Real-Time cold data backup | YES | YES | seconds | minutes |

4. Alibaba’s requirements for Backup
• RPO < 1 minute
• Predictable RTO for PB-scale data
• Low cost
• No impact on the online service
• Easy management

5. AliHB Real-Time Cold data Backup
• Real-time incremental backup
• Independent of HBase
  - No need for snapshots
• Stateless worker nodes
• Backup to heterogeneous storage maintained by another team

6. Backup Overview
[Diagram: Source Cluster, Backup Cluster, and Target Cluster (pangu). Full backup: Region Copy copies HFiles. Incremental backup: Log Tracker and Log Copy copy logs.]

7. Full Backup
• One copy job per table
• One copy task per region
• Challenge: a region’s file list keeps changing
  - Compaction removes old files
  - Split removes the entire region
  - Merge removes the entire region

8. Compaction
• At first we have files 1, 2, 3, 4, 5
• When copying file 4, we find it missing
• After refreshing the list we have files 1, 2, 6
• Copy file 6
[Diagram: files 1 2 3 4 5 become 1 2 6 after compaction while the copy is in progress.]

9. Split
• We are the parent region
  - If the region is found missing, reload meta and resubmit tasks
• We are the child region
  - Copy the reference file and its original file
  - If the referenced file is missing, refresh the file list and continue
• Merge works like split

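10. Algorithm
[Flowchart, from start to end:]
1. All files copied? Yes: end. No: select the next file.
2. File exists? Yes: copy the file, then go back to step 1.
3. File missing: does the region exist? Yes: refresh the file list, then go back to step 1. No: reload meta, submit a new task, and end.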
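The flow above can be sketched in Python as a retry loop. This is only an illustration under stated assumptions, not AliHB's code: `list_files`, `copy_file`, and `region_exists` are hypothetical callbacks standing in for the file system and meta lookups, `FileNotFoundError` stands in for "file missing", and `max_refreshes` is an added safety bound that the slides do not mention.

```python
from typing import Callable, List, Set


def copy_region(
    list_files: Callable[[], List[str]],  # hypothetical: current file list of the region
    copy_file: Callable[[str], None],     # hypothetical: raises FileNotFoundError if the file is gone
    region_exists: Callable[[], bool],    # hypothetical: False after a split or merge
    max_refreshes: int = 100,             # assumption: guard against refreshing forever
) -> str:
    """Copy every file of one region while the file list may change underneath.

    Returns "done" when all listed files are copied, or "resubmit" when the
    region is gone and the caller should reload meta and submit new tasks.
    """
    copied: Set[str] = set()
    files = list_files()
    refreshes = 0
    while True:
        pending = [name for name in files if name not in copied]
        if not pending:
            return "done"
        name = pending[0]
        try:
            copy_file(name)
        except FileNotFoundError:
            if not region_exists():
                return "resubmit"
            refreshes += 1
            if refreshes > max_refreshes:
                raise RuntimeError("file list keeps changing; giving up")
            # Compaction replaced some files: refresh the list and continue.
            files = list_files()
            continue
        copied.add(name)
```

With the compaction example from the slides (files 1 to 5, then 3, 4, 5 compacted into 6 after file 3 was copied), this sketch copies 1, 2, 3 and then 6. Whether the already copied file 3 is cleaned up on the target is not covered by the slides.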

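11. Incremental Backup
[Diagram: Source Cluster (HBase, HDFS) and Backup Cluster (Log Tracker, Zookeeper, Workers).]
• Log Tracker scans the logs of the source HBase
• Each new log is registered in Zookeeper as <logName, state, offset>
• Workers copy the logs from HDFS
• Latency < 10 seconds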
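One way to read the <logName, state, offset> record is that it lets any stateless worker resume a copy where the previous one stopped. The sketch below illustrates that idea only; it assumes `offset` means "bytes already copied", uses in-memory streams in place of HDFS files, and the names `LogRecord` and `copy_increment` are hypothetical.

```python
import io
from dataclasses import dataclass
from typing import BinaryIO


@dataclass
class LogRecord:
    log_name: str
    state: str   # e.g. "writing", "closed", "finished"
    offset: int  # assumption: number of bytes already copied to the backup


def copy_increment(record: LogRecord, source: BinaryIO, target: BinaryIO,
                   chunk_size: int = 64 * 1024) -> int:
    """Append to `target` whatever `source` holds beyond record.offset."""
    source.seek(record.offset)
    copied = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        target.write(chunk)
        copied += len(chunk)
    record.offset += copied  # progress lives in the record, not in the worker
    return copied


# Usage: the log grows between two copy rounds.
source = io.BytesIO(b"abcdef")
target = io.BytesIO()
record = LogRecord("wal-1", "writing", 0)
first = copy_increment(record, source, target)
source.seek(0, io.SEEK_END)
source.write(b"gh")
second = copy_increment(record, source, target)
```

Because all progress is stored in the record, a different worker could perform the second round. A real log copier would also have to deal with partially written entries, which this sketch ignores.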

12. Log Lifecycle
• Writing
  - Log Tracker periodically scans and finds new logs
• Closed
  - The log is not the latest log of its region server, or it is in “.oldlogs”
• Finished
  - A worker has copied the whole closed log
• Deleted
  - Log Tracker can no longer find the log in HBase and it is finished on the backup; the log record on the backup system is then deleted
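The four states above behave like a small state machine. The following Python sketch encodes the transitions as described on the slide; the enum, the function name, and the boolean inputs are hypothetical names chosen for illustration, not part of AliHB.

```python
from enum import Enum


class LogState(Enum):
    WRITING = "writing"
    CLOSED = "closed"
    FINISHED = "finished"
    DELETED = "deleted"


def next_state(state: LogState, *, is_latest_log: bool, in_oldlogs: bool,
               fully_copied: bool, found_in_hbase: bool) -> LogState:
    """Return the state a log record moves to after one tracker scan."""
    if state is LogState.WRITING:
        # Closed: not the latest log of its region server, or already in .oldlogs.
        if not is_latest_log or in_oldlogs:
            return LogState.CLOSED
    elif state is LogState.CLOSED:
        # Finished: a worker has copied the whole closed log.
        if fully_copied:
            return LogState.FINISHED
    elif state is LogState.FINISHED:
        # Deleted: the source log is gone and the backup copy is complete.
        if not found_in_hbase:
            return LogState.DELETED
    return state
```

Each call advances a record by at most one state, so a log that is already gone from HBase but not yet fully copied stays in "closed" rather than being dropped.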

13. Data Consistency
• Full comparison
  - Do a sample comparison
  - Sample on every region
  - Balanced sample: use the index of the largest file for each region
• Incremental comparison
  - Compare recent logs

14. Restore Scenarios
• Cluster level
  - Restore the whole cluster
• Table level
  - Restore one table or a list of tables
• Region level
  - Restore a range of data from a table
• Restore to a given point in time

15. Restore Tools
• Bulkload the full backup
  - Filter HFiles by table name and range
• Use the LogRestore tool to restore logs
  - Filter by table name
  - Filter by range
  - Filter by timestamp
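The three log filters can be illustrated with a short Python sketch. It does not reproduce the LogRestore tool, whose interface the slides do not show: `LogEntry` is a hypothetical, simplified log entry, and the range semantics (start row inclusive, stop row exclusive, timestamp as an upper bound for point-in-time restore) are assumptions.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass(frozen=True)
class LogEntry:  # hypothetical, simplified stand-in for one logged edit
    table: str
    row: bytes
    timestamp: int  # milliseconds


def filter_entries(entries: Iterable[LogEntry], table: str,
                   start_row: bytes = b"",
                   stop_row: Optional[bytes] = None,
                   max_timestamp: Optional[int] = None) -> Iterator[LogEntry]:
    """Yield the entries that pass the table, row-range, and timestamp filters."""
    for entry in entries:
        if entry.table != table:
            continue  # filter by table name
        if entry.row < start_row:
            continue  # before the range (start is inclusive)
        if stop_row is not None and entry.row >= stop_row:
            continue  # after the range (stop is exclusive)
        if max_timestamp is not None and entry.timestamp > max_timestamp:
            continue  # newer than the requested restore point
        yield entry
```

Restoring a table to a given time would then mean replaying only the entries this filter yields, in log order, on top of the bulkloaded full backup.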

16. Restore Runtime
• HFiles
  - Split by region: one region, one task
• Logs
  - Each log is a task
[Diagram: Restore Manager submits tasks to Workers, which run Bulkload and LogRestore.]

17. Real-Time Cold data Backup
[Architecture diagram. Master: Log Tracker, Backup Manager, Data Cleaner, Restore Manager. Workers: Copy Log, Copy Region, Log Restore, Bulkload.]

18. Web UI

19. Performance
• Backup system: 200 nodes
• HBase: 377 nodes
• 110 TB of data: backup in 22 minutes, restore in 53 minutes
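For a rough sense of scale, the figures above can be converted into aggregate throughput. The short Python calculation below assumes decimal units (1 TB = 1000 GB) and that the quoted times cover the full 110 TB; the per-node figure simply divides by the 200 backup nodes and is an estimate, not a number reported in the slides.

```python
def throughput_gb_per_s(terabytes: float, minutes: float) -> float:
    """Aggregate throughput in GB/s, assuming 1 TB = 1000 GB."""
    return terabytes * 1000 / (minutes * 60)


DATA_TB = 110
BACKUP_NODES = 200

backup_gb_s = throughput_gb_per_s(DATA_TB, 22)   # about 83.3 GB/s in aggregate
restore_gb_s = throughput_gb_per_s(DATA_TB, 53)  # about 34.6 GB/s in aggregate
backup_per_node_gb_s = backup_gb_s / BACKUP_NODES  # about 0.42 GB/s per backup node
```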

20. Conclusion
• AliHB Real-Time Cold data Backup
  - Real-time incremental backup keeps the latency within seconds
  - Scales out to obtain more power for restore
  - Uses fewer resources during normal backup
  - Independent of HBase, easy to deploy and upgrade

21. Future works
• Incremental restore
  - Recognize hot and cold data
  - Resume the HBase service once the hot data is restored
  - Access the cold data through reference files
  - Restore the cold data in the background
• Put log lifecycle management on HBase
  - Periodic scans of .oldlogs put pressure on the NN
  - Keep only the necessary logs in Zookeeper
• Compact HLogs into HFiles
  - Save storage space
  - Speed up restore

22. Thanks for watching

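To give HBase practitioners and enthusiasts a community where they can freely exchange HBase-related technology, HBase engineers and researchers from Alibaba, Xiaomi, Huawei, NetEase, JD.com, Didi, Zhihu, and other companies jointly founded the China HBase Technology Community.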
