HBase Practice at Meituan
1. hosted by HBase Practice at Meituan, Gehua New Century Hotel, Beijing, China, August 17, 2018
2. Contents: 01 Multi-Tenancy; 02 Object Storage; 03 Large Query Isolation
3. 01 Multi-Tenancy: RSGroup (compute resource isolation); DNGroup (storage isolation); replication isolation
4. Multi-Tenancy: RSGroup (compute resource isolation). [Diagram: Table_A carries the table attributes DATA_BLOCK_ENCODING, BLOOMFILTER, COMPRESSION, and GROUP_NAME: Group_A. Group_A holds Table_A on two RegionServers with 64 GB each; Group_B holds Table_B on two RegionServers with 128 GB each.]
5. Multi-Tenancy: DNGroup (storage isolation). [Diagram: a path carries the attributes create time, owner, … and group_name: Group_A. Group_A consists of DataNodes on SSD machines; Group_B consists of DataNodes on SATA machines.]
6. Multi-Tenancy: replication isolation. [Diagram: a source cluster and a target cluster. In the groups that need replication, RegionServers in the source cluster run a ReplicationSource and RegionServers in the target cluster run a ReplicationSink. Other groups have their own RegionServers, each with a separate ReplicationSource and ReplicationSink.]
7. Multi-Tenancy: constraint. PeerId naming specification:
   • [GROUP]SOURCE_TARGET_INDEX
   (1) [GROUP] is a keyword
   (2) SOURCE is the group in the source cluster
   (3) TARGET is the group in the target cluster
   (4) INDEX makes the PeerId unique
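The naming rule above can be sketched as a small parser. This is only an illustration under stated assumptions: the slide does not say whether the brackets around GROUP are literal, which separator is used, or which characters a group name may contain. The sketch assumes a literal `[GROUP]` prefix, underscore separators, group names without underscores, and a numeric INDEX. The names `GroupPeerId`, `parse_group_peer_id`, and `format_group_peer_id` are hypothetical and are not part of HBase.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout: the literal keyword "[GROUP]", then SOURCE, TARGET and INDEX
# joined by underscores. Group names are assumed to contain no underscore and
# INDEX is assumed to be a non-negative integer.
PEER_ID_PATTERN = re.compile(
    r"\[GROUP\](?P<source>[^_]+)_(?P<target>[^_]+)_(?P<index>\d+)"
)


class GroupPeerId(NamedTuple):
    source: str  # group in the source cluster
    target: str  # group in the target cluster
    index: int   # disambiguates peers between the same pair of groups


def parse_group_peer_id(peer_id: str) -> Optional[GroupPeerId]:
    """Return the parsed parts, or None if the id is not group-scoped."""
    match = PEER_ID_PATTERN.fullmatch(peer_id)
    if match is None:
        return None
    return GroupPeerId(match.group("source"), match.group("target"),
                       int(match.group("index")))


def format_group_peer_id(source: str, target: str, index: int) -> str:
    """Build a peer id that follows the assumed naming layout."""
    return f"[GROUP]{source}_{target}_{index}"
```

A peer id that does not match the pattern, such as a plain `1`, yields `None`, so a caller could treat it as an ordinary peer that is not bound to any group.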
8. Multi-Tenancy: support for heterogeneous storage. [Diagram: sensitive workloads are placed on SSD nodes; general workloads in the other groups are placed on SATA nodes.]
9. Multi-Tenancy: comparison with multi-cluster deployment. Resources can be used flexibly between groups. [Diagram: Machines A, B, and C, each with storage (DN) and compute (RS) divided between Group A and Group B; the shares shown are 80% and 20%.]
10. Multi-Tenancy: some bug fixes for RSGroup
    • HBASE-18272 Fix an issue in RSGroupBasedLoadBalancer#roundRobinAssignment where BOGUS_SERVER_NAME is involved in two groups
    • HBASE-20791 RSGroupBasedLoadBalancer#setClusterMetrics should pass ClusterMetrics to its internalBalancer
11. 02 Object Storage: MOB solution; bug fixes
12. Object Storage: YARN log storage
    • Background: with 400k applications per day, 200 million log files need to be uploaded to HDFS, which puts great pressure on the NameNode (NN). The team therefore considered moving these log files into HBase.
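13. Object Storage: MOB solution
    • Data blocks do not take part in caching, which effectively avoids memory fragmentation
    • Each memstore flush produces two files
    • Only the index file is managed by the online region
    [Diagram: Client, memstore, flush; IndexFiles go to the online region and DataFiles go to an HDFS directory. The online region is annotated with "splits, compactions, and other IO-heavy operations".]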
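14. Object Storage: some problems with MOB
    • TTL cleanup is not synchronized: after a data file is cleaned up, its index file still exists
    • Once the MOB feature is enabled, a Region can only use the default compaction policy
    • Offline data is stored under a single path
    [Diagram: Client, memstore, flush; IndexFiles go to the online region and DataFiles go to an HDFS directory. The online region is annotated with "splits, compactions, and other IO-heavy operations".]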
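15. Object Storage: MOB bug fixes
    • Index files are cleaned up before data files
    • Offline files are stored under multiple paths
    [Diagram: Client, memstore, flush; IndexFiles go to the online region and DataFiles go to an HDFS directory. The online region is annotated with "splits, compactions, and other IO-heavy operations".]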
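The ordering fix on this slide can be illustrated with a toy in-memory model. It is a sketch under assumptions, not HBase's MOB cleaner: it assumes each index file references exactly one data file and that expiry is a single timestamp per data file. The function `clean_expired` and its parameters are hypothetical names chosen for the example.

```python
from typing import Dict, List, Tuple


def clean_expired(index_to_data: Dict[str, str],
                  data_expiry: Dict[str, float],
                  now: float) -> List[Tuple[str, str]]:
    """Remove expired files, index files first, and return the deletion order.

    index_to_data maps an index file name to the data file it references.
    data_expiry maps a data file name to the time at which it expires.
    Both dictionaries are modified in place.
    """
    expired_data = {name for name, expires_at in data_expiry.items()
                    if expires_at <= now}
    deletions: List[Tuple[str, str]] = []

    # Step 1: drop every index file that points at an expired data file.
    for index_name in sorted(index_to_data):
        if index_to_data[index_name] in expired_data:
            deletions.append(("index", index_name))
    for _, index_name in deletions:
        del index_to_data[index_name]

    # Step 2: only now remove the expired data files themselves.
    for data_name in sorted(expired_data):
        del data_expiry[data_name]
        deletions.append(("data", data_name))
    return deletions
```

With this order, an interruption between the two steps leaves at most an unreferenced data file, whereas the reverse order could leave an index file pointing at data that no longer exists.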
16. Object Storage: MOB bug fixes
    • HBASE-19650 ExpiredMobFileCleaner has wrong logic about TTL check
    • HBASE-19664 MOB should be compatible with other types of Compactor in addition to DefaultCompactor
17. 03 Large Query Isolation
18. Large Query Isolation: large query concepts
    • Characteristics of a large query
    (1) It is not very sensitive to latency
    (2) It may occupy a Handler thread for a long time
    • Large query types
    (1) A Scan with no startkey or endkey (such as a full table scan)
    (2) A client that calls ResultScanner.next() more than a certain threshold number of times
    (3) A client that calls a custom coprocessor which involves large queries
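The three large query types listed above can be sketched as a simple classifier. This is a minimal sketch under assumptions: the slide gives no threshold value and does not say how a coprocessor call is recognized as heavy, so `NEXT_CALL_THRESHOLD`, the `ScanRequest` fields, and `is_large_query` are hypothetical. An empty key is treated the same as a missing key.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical threshold; the slide only says "a certain threshold".
NEXT_CALL_THRESHOLD = 100


@dataclass
class ScanRequest:
    start_key: Optional[bytes]       # None or b"" means no start key
    stop_key: Optional[bytes]        # None or b"" means no end key
    next_calls: int = 0              # ResultScanner.next() calls seen so far
    heavy_coprocessor: bool = False  # custom coprocessor known to be heavy


def is_large_query(req: ScanRequest,
                   threshold: int = NEXT_CALL_THRESHOLD) -> bool:
    """Apply the three criteria from the slide; any one of them is enough."""
    unbounded_scan = not req.start_key or not req.stop_key
    too_many_next_calls = req.next_calls > threshold
    return unbounded_scan or too_many_next_calls or req.heavy_coprocessor
```

Note that criterion (2) can only be evaluated while a scan is running, so a request that starts out as a normal query may be reclassified later.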
19. Large Query Isolation: large query problems (resources may be used up). [Diagram: clients act as producers and put Calls into a queue; Handlers act as consumers. Large-query Calls occupy Handlers alongside the Handlers that serve common queries.]
20. Large Query Isolation: implementation details. [Diagram: client Calls are queued; Calls with the large-query priority are served by one set of Handlers and the remaining Calls by the general Handlers.]
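One way to picture this isolation is a scheduler with two handler pools. The slide does not spell out the mechanism, so this is a sketch under assumptions: it assumes each call has already been tagged as large or general, that each kind has its own queue, and that the number of handlers per pool is fixed. The class `IsolatingScheduler` is hypothetical and does not correspond to an HBase class.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple


class IsolatingScheduler:
    """Toy model: large queries can only use their own handlers."""

    def __init__(self, general_handlers: int, large_handlers: int) -> None:
        # Number of idle handlers in each pool.
        self.free: Dict[str, int] = {"general": general_handlers,
                                     "large": large_handlers}
        # One FIFO queue of waiting call ids per pool.
        self.queues: Dict[str, Deque[str]] = {"general": deque(),
                                              "large": deque()}

    def submit(self, call_id: str, is_large: bool) -> None:
        """Queue a call in the pool that matches its tag."""
        self.queues["large" if is_large else "general"].append(call_id)

    def dispatch(self) -> List[Tuple[str, str]]:
        """Start as many queued calls as each pool has idle handlers."""
        started: List[Tuple[str, str]] = []
        for pool, queue in self.queues.items():
            while queue and self.free[pool] > 0:
                self.free[pool] -= 1
                started.append((pool, queue.popleft()))
        return started

    def finish(self, pool: str) -> None:
        """Return one handler of the given pool to the idle set."""
        self.free[pool] += 1
```

In this model a burst of large scans queues up behind the large-query handlers only, so a point read that arrives later still finds an idle general handler.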
21. Large Query Isolation: isolation effect. Isolation limits the situations in which resources are used up.
22. Thanks