HBase at Tencent

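Andrew Cheng (程广旭), an engineer at Tencent, presented how HBase is used across Tencent's businesses and the lessons learned from running it.
Tencent currently runs more than 90 HBase clusters, the largest with over 500 nodes. Several internal businesses, including Tencent Video, WeChat Pay, and Tencent Cloud, use the HBase service. He first shared experience with data migration in HBase using Replication and ExportSnapshot. In production, each business writes a large volume of data every day, and the retention period for that data is either very long or very short. Tencent therefore creates tables by day: a new table is created every day, and expired data is removed by simply dropping that day's table. He then covered bandwidth optimization.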

Traffic generated by writing to HBase has five main parts:

  • Writes (the incoming data)

  • WAL

  • Flush

  • Small (minor) Compaction

  • Major Compaction


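The following measures were used to reduce that traffic:

  • Enable CellBlock compression.

  • Enable WAL compression.

  • Increase the memstore size to reduce the number of flushes and compactions.

  • Reduce the number of compaction threads.

  • Turn off Major Compaction.

  • Create tables by day.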
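As a rough illustration of the list above, the sketch below maps each measure to a stock HBase configuration property and renders an `hbase-site.xml` fragment. The property names are standard HBase settings, but every value is an assumption chosen for illustration; the talk does not state which values Tencent uses. Creating tables by day is an application-level choice and has no property here.

```python
from xml.sax.saxutils import escape

# Illustrative values only; the talk does not give the settings Tencent used.
BANDWIDTH_SETTINGS = {
    # CellBlock (RPC payload) compression between client and RegionServer
    "hbase.client.rpc.compressor": "org.apache.hadoop.io.compress.GzipCodec",
    # WAL compression
    "hbase.regionserver.wal.enablecompression": "true",
    # Larger memstore flush size: 256 MB here (the stock default is 128 MB)
    "hbase.hregion.memstore.flush.size": str(256 * 1024 * 1024),
    # Compaction thread pools; lower whatever the cluster currently uses
    "hbase.regionserver.thread.compaction.small": "1",
    "hbase.regionserver.thread.compaction.large": "1",
    # 0 disables time-based major compaction
    "hbase.hregion.majorcompaction": "0",
}


def to_hbase_site(settings):
    """Render a dict of properties as an hbase-site.xml <configuration> block."""
    lines = ["<configuration>"]
    for name, value in settings.items():
        lines.append("  <property>")
        lines.append(f"    <name>{escape(name)}</name>")
        lines.append(f"    <value>{escape(value)}</value>")
        lines.append("  </property>")
    lines.append("</configuration>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_hbase_site(BANDWIDTH_SETTINGS))
```

With time-based major compaction disabled, expired data has to be removed some other way, which is where dropping whole per-day tables fits in.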

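Finally, he described how to share a RestServer. When every HBase cluster runs its own RestServer and a cluster receives few read requests, that cluster's RestServer resources are largely wasted. Tencent's improvement lets a single RestServer be configured to access multiple HBase clusters, with MySQL recording which tables may be accessed this way.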
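The routing idea can be sketched as a table-to-cluster lookup. This is a minimal sketch under stated assumptions, not Tencent's implementation: the `rest_routes` schema, the table names, and the cluster addresses are all hypothetical, and `sqlite3` stands in for MySQL so the example runs on its own.

```python
import sqlite3

# Hypothetical cluster addresses (ZooKeeper quorums), not from the talk.
CLUSTER_QUORUMS = {
    "cluster_a": "zk-a:2181",
    "cluster_b": "zk-b:2181",
}


def build_routing_db():
    """Create an in-memory stand-in for the MySQL table of allowed routes."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE rest_routes ("
        "table_name TEXT PRIMARY KEY, cluster TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO rest_routes VALUES (?, ?)",
        [("ads_log", "cluster_a"), ("pay_record", "cluster_b")],
    )
    conn.commit()
    return conn


def resolve_cluster(conn, table_name):
    """Return (cluster, quorum) for a table, or refuse unregistered tables."""
    row = conn.execute(
        "SELECT cluster FROM rest_routes WHERE table_name = ?", (table_name,)
    ).fetchone()
    if row is None:
        raise PermissionError(f"table {table_name!r} is not registered")
    return row[0], CLUSTER_QUORUMS[row[0]]


if __name__ == "__main__":
    conn = build_routing_db()
    print(resolve_cluster(conn, "ads_log"))
```

A shared RestServer would run a lookup like this per request and then forward to the resolved cluster, so one configuration serves every registered table.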



2.HBase At Tencent Andrew Cheng | 程广旭 Tencent | HBase Committer

3.Content 01. HBase Service In Tencent 02. Applications 03. Practices & Optimization

4.01. HBase Service In Tencent

5.HBase Story in Tencent

  • In use since 2013

  • Versions used: 0.94.17 -> 0.98.6 -> 1.2.5 -> 2.2.0 (in progress)

  • Largest cluster: more than 500 nodes

  • Scale: 90+ Clusters, 4000+ Nodes, 10PB+ Data, 3Tri+ RPD

6.Overview: HBase users come from 6 groups, with 100+ different applications

7.Architecture (diagram labels): Advertisement, Tenpay, Wepay, Game, …; Phoenix, OpenTSDB, TDBank, Spark, Toolkit, RestServer, S2Graph, Lhotse, HBase Api, ThriftServer, Kylin; Tencent HBase, Zookeeper; Deploy Center, Doss, monitoring, TNM2

8.02. Applications

9.Tencent Ads – Real-Time Logjoin System (diagram labels): Data Source (Mixer, Exposure, Click, …); Transport (TDBank); Logical (LogJoin instances); Storage (Tencent HBase, with Association Table and Flow Table); Consumer (Model learning, Freshness, Budget control, Report)

10.Tenpay - Transaction record (diagram labels): Data Source (MySQL); Binlog Parser, DBSync; Hippo; TDSort; Application (C++), Application (JAVA); Read Cache; Thrift Server; Read, Write; Storage (Tencent HBase)

11.03. Practices & Optimization

12.Practices–Data migration: business-insensitive data migration from Cluster A to Cluster B (diagram labels): add_peer, delete_snapshot, Client switch to new cluster, disable_peer, enable_peer, Check Data, Set REPLICATION_SCOPE => '1', Set REPLICATION_SCOPE => '0', snapshot, clone_snapshot, ExportSnapshot
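One plausible ordering of the operations named on this slide can be sketched as a generated command list. The commands themselves are standard HBase shell commands and the standard ExportSnapshot tool, but the ordering is an assumption (the slide's arrows are not recoverable here), and the table name, column family, peer id, and addresses are hypothetical.

```python
def migration_steps(table, family, peer_id, dest_zk, snapshot, dest_hdfs):
    """Return (cluster, command) pairs for an assumed A -> B migration order."""
    return [
        # Register cluster B as a peer, then pause shipping so edits queue up.
        ("A", f"add_peer '{peer_id}', CLUSTER_KEY => \"{dest_zk}\""),
        ("A", f"disable_peer '{peer_id}'"),
        ("A", f"alter '{table}', {{NAME => '{family}', REPLICATION_SCOPE => '1'}}"),
        # Copy the existing data with a snapshot.
        ("A", f"snapshot '{table}', '{snapshot}'"),
        ("A", "hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot "
              f"-snapshot {snapshot} -copy-to {dest_hdfs}"),
        ("B", f"clone_snapshot '{snapshot}', '{table}'"),
        # Replay the edits queued since the snapshot was taken.
        ("A", f"enable_peer '{peer_id}'"),
        # Manual steps go here: check the data, then switch clients to B.
        ("A", f"delete_snapshot '{snapshot}'"),
        ("A", f"alter '{table}', {{NAME => '{family}', REPLICATION_SCOPE => '0'}}"),
    ]


if __name__ == "__main__":
    steps = migration_steps(
        "t1", "cf", "1", "zk-b:2181:/hbase", "t1_snap", "hdfs://cluster-b/hbase"
    )
    for cluster, command in steps:
        print(f"[{cluster}] {command}")
```

Pausing the peer before taking the snapshot is what keeps the switch invisible to the business in this sketch: writes that arrive during the copy are queued on Cluster A and shipped once the peer is enabled again.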

13.Practices–Table

  • Create a table per day when the amount of data is large and the TTL is short

  • Benefit: reduces the amount of data in compaction

  • Benefit: makes expired data easy to delete
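The per-day table scheme can be sketched with two small helpers: one that names the table for a given day and one that lists the tables that have aged out. The `events` prefix and the `YYYYMMDD` suffix are assumptions for illustration; the talk does not describe the naming convention.

```python
from datetime import date, timedelta


def table_for(day, prefix="events"):
    """Name of the table that holds one day's data, e.g. events_20190720."""
    return f"{prefix}_{day:%Y%m%d}"


def expired_tables(existing, today, keep_days, prefix="events"):
    """Per-day tables that fall outside the last `keep_days` days."""
    keep = {table_for(today - timedelta(days=i), prefix) for i in range(keep_days)}
    return sorted(
        name for name in existing
        if name.startswith(prefix + "_") and name not in keep
    )


if __name__ == "__main__":
    tables = ["events_20190718", "events_20190719", "events_20190720"]
    print(expired_tables(tables, date(2019, 7, 20), keep_days=2))
```

Each expired table would then be disabled and dropped, which removes a full day of data without waiting for a compaction to rewrite anything.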

14.Optimization - Bandwidth (diagram across RS1, RS2, RS3): ① Input Data; ② Wal data to RS2 and RS3; ③ Flush data to RS2 and RS3; ④ Small compact to RS2 and RS3; ⑤ Large compact to RS2 and RS3

15.Optimization - Bandwidth

  • Enable compression of CellBlocks

  • Enable WAL compression

  • Increase the size of the memstore

  • Reduce the number of compaction threads

  • Turn off major compaction

  • Create tables by day

16.Optimization - Online filtering of dirty data

  • Problem: a large amount of data that has the same Rowkey

  • How to find the rowkeys to filter? ResponseTooSlow

  • How to set the rowkeys to filter? hbase.hregion.filter.rowkeys

  • How to refresh the rowkeys to filter? update_config

  • Flowchart: Input Data → Filter Enable? (No: Write) → Filter? (No: Write; Yes: dropped)
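The write-path check in the flowchart can be sketched as follows. This is a minimal model, not Tencent's code: the class and function names are hypothetical, and `refresh` only imitates what reloading `hbase.hregion.filter.rowkeys` through `update_config` would achieve.

```python
class RowkeyWriteFilter:
    """Drops writes whose rowkey is on a configurable block list."""

    def __init__(self, enabled=False, rowkeys=()):
        self.enabled = enabled
        self.rowkeys = frozenset(rowkeys)

    def refresh(self, enabled, rowkeys):
        # Stand-in for an online configuration reload.
        self.enabled = enabled
        self.rowkeys = frozenset(rowkeys)

    def should_write(self, rowkey):
        # Filter disabled -> write; enabled but rowkey not listed -> write.
        return not (self.enabled and rowkey in self.rowkeys)


def write_batch(write_filter, puts, store):
    """Apply (rowkey, value) puts to `store`; return how many were dropped."""
    dropped = 0
    for rowkey, value in puts:
        if write_filter.should_write(rowkey):
            store.setdefault(rowkey, []).append(value)
        else:
            dropped += 1
    return dropped


if __name__ == "__main__":
    store = {}
    write_filter = RowkeyWriteFilter()
    puts = [("hot_key", 1), ("normal_key", 2), ("hot_key", 3)]
    print(write_batch(write_filter, puts, store))  # filter disabled
    write_filter.refresh(enabled=True, rowkeys=["hot_key"])
    print(write_batch(write_filter, puts, store))  # hot_key now dropped
```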

17.Optimization - Prefix Bloom Filter(HBASE-20636)

  • Bloom filter types chosen at Create Table: ROWPREFIX_FIXED_LENGTH, ROWPREFIX_DELIMITED

  • Example rowkey: uin ts action, with the Prefix marked

  • File info: Bloom Filter

18.Optimization - Prefix Bloom Filter(HBASE-20636)

  • Write (flowchart): Input Data → Rowkey → Prefix length >= prefix_length? → Get prefix key by prefix_length → Same prefix? (No: Compute hash value, Set BloomFilter) → Last line? (Yes: Write BloomFilter information to StoreFile metadata)

  • Read (flowchart): Scan {StartKey,EndKey} → Get prefix key by prefix_length → Same prefix? → Compute hash value → Hit BloomFilter? (No: Filter StoreFile; Yes: Not Filter StoreFile)
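The two flowcharts can be approximated with a small fixed-length prefix Bloom filter. This is a simplified sketch, not the HBASE-20636 implementation: the class name, the SHA-256 based hashing, and the bit-array size are assumptions made to keep the example self-contained.

```python
import hashlib


class PrefixBloomFilter:
    """Bloom filter over fixed-length rowkey prefixes of one store file."""

    def __init__(self, prefix_length, num_bits=1 << 16, num_hashes=3):
        self.prefix_length = prefix_length
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8)
        self._last_prefix = None

    def _positions(self, prefix):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + prefix).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add_rowkey(self, rowkey):
        """Write path: record the prefix of each rowkey written to the file."""
        if len(rowkey) < self.prefix_length:
            return  # too short to carry a full prefix
        prefix = rowkey[: self.prefix_length]
        if prefix == self._last_prefix:
            return  # rows arrive sorted, so repeated prefixes are adjacent
        for pos in self._positions(prefix):
            self.bits[pos // 8] |= 1 << (pos % 8)
        self._last_prefix = prefix

    def might_contain(self, prefix):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(prefix)
        )

    def can_skip_file(self, start_key, stop_key):
        """Read path: a scan may skip the file only if both bounds share a
        prefix and the filter reports that prefix as absent."""
        n = self.prefix_length
        if len(start_key) < n or len(stop_key) < n:
            return False
        if start_key[:n] != stop_key[:n]:
            return False
        return not self.might_contain(start_key[:n])


if __name__ == "__main__":
    bloom = PrefixBloomFilter(prefix_length=5)
    for key in (b"10001_1563580800_click", b"10001_1563580900_view"):
        bloom.add_rowkey(key)
    print(bloom.can_skip_file(b"10001_0", b"10001_9"))
    print(bloom.can_skip_file(b"99999_0", b"99999_9"))
```

A Bloom filter never reports a stored prefix as absent, so skipping a file is safe; a false positive only costs an unnecessary file read.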

19.Optimization - RestServer (diagram): User → Nginx → RestServer A, RestServer B, RestServer C, RestServer D → Cluster A, Cluster B, Cluster C

20.Optimization - RestServer (diagram): User → Nginx → RestServer A, RestServer B, RestServer C (with Mysql) → Cluster A, Cluster B, Cluster C

21.Optimization - RestServer

  • Only one configuration to maintain

  • Resources are used effectively

  • User-friendly access

22.HBase Community

  • 1 Committer, 2 Contributors

  • Total commits: 80+

  • Feature: HBASE-20636 Introduce two bloom filter type : ROWPREFIX_FIXED_LENGTH and ROWPREFIX_DELIMITED

  • Feature: HBASE-19799 Add web UI to rsgroup

  • Feature: HBASE-20243 [Shell] Add shell command to create a new table by cloning the existent table

  • Feature: HBASE-19483 Add proper privilege check for rsgroup commands

  • …

