HBase Practice at Meituan
1 . hosted by HBase Practice at Meituan | Gehua New Century Hotel, Beijing, China | August 17, 2018
2 . Content
• 01 Multi-Tenancy
• 02 Object Storage
• 03 Large Query Isolation
3 . 01 Multi-Tenancy
• RSGroup - Compute Resource Isolation
• DNGroup - Storage Isolation
• Replication Isolation
4 . Multi-Tenancy: RSGroup - Compute Resource Isolation
[Diagram: Table_A carries the table attribute GROUP_NAME : Group_A (next to DATA_BLOCK_ENCODING, BLOOMFILTER and COMPRESSION) and is served by the Group_A RegionServers (64 GB each); Table_B is served by the Group_B RegionServers (128 GB each).]
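The group constraint on this slide can be illustrated with a minimal sketch: regions of a table are spread round-robin, but only over the RegionServers of the group named in the table's GROUP_NAME attribute. This is not the RSGroupBasedLoadBalancer code; the server names, the two dictionaries, and `assign_regions` are hypothetical and exist only for illustration.

```python
from itertools import cycle

# Hypothetical data mirroring the slide: group -> RegionServers, table -> group.
GROUP_SERVERS = {
    "Group_A": ["rs-a1", "rs-a2"],
    "Group_B": ["rs-b1", "rs-b2"],
}
TABLE_GROUP = {"Table_A": "Group_A", "Table_B": "Group_B"}


def assign_regions(table, regions):
    """Round-robin the table's regions over the servers of its own group only."""
    servers = GROUP_SERVERS[TABLE_GROUP[table]]
    if not servers:
        raise ValueError(f"the group of {table} has no RegionServer")
    # zip stops when the region list is exhausted, so cycle() is safe here.
    return {region: server for region, server in zip(regions, cycle(servers))}
```

With this sketch, no region of Table_A can land on a Group_B server, whatever the load on Group_A.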
5 . Multi-Tenancy: DNGroup - Storage Isolation
[Diagram: a path carries the attributes create time, owner, … and group_name : Group_A; the Group_A DataNodes are SSD machines and the Group_B DataNodes are SATA machines.]
6 . Multi-Tenancy: Replication Isolation
[Diagram: Source Cluster and Target Cluster. For Groups Need Replication, the source RegionServers run ReplicationSource and the target RegionServers run ReplicationSink. Other Groups have their own RegionServers with ReplicationSource and ReplicationSink.]
7 . Multi-Tenancy: Constraint
PeerId naming specification
• [GROUP]SOURCE_TARGET_INDEX
(1) [GROUP] is a keyword
(2) SOURCE is the group in the source cluster
(3) TARGET is the group in the target cluster
(4) INDEX is used to make the peer ID unique
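The naming rule above could be built and parsed roughly as follows. This is only a sketch under assumptions the slide does not state: `[GROUP]` is taken as a literal prefix, INDEX as a non-negative integer, and group names are assumed to contain no underscore (a name such as Group_A would need a different delimiter rule). `PeerId`, `format_peer_id` and `parse_peer_id` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class PeerId(NamedTuple):
    source: str  # group in the source cluster
    target: str  # group in the target cluster
    index: int   # disambiguates peers between the same pair of groups


# Assumption: "[GROUP]" is literal and group names contain no underscore.
_PEER_ID = re.compile(r"^\[GROUP\](?P<source>[^_]+)_(?P<target>[^_]+)_(?P<index>\d+)$")


def format_peer_id(source: str, target: str, index: int) -> str:
    return f"[GROUP]{source}_{target}_{index}"


def parse_peer_id(peer_id: str) -> Optional[PeerId]:
    """Return the parsed parts, or None for a peer ID outside the convention."""
    m = _PEER_ID.match(peer_id)
    if m is None:
        return None
    return PeerId(m["source"], m["target"], int(m["index"]))
```

A peer ID that does not match (for example a plain `1`) returns `None`, which lets a caller treat it as a peer that is not bound to any group.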
8 . Multi-Tenancy: Support Heterogeneous Storage
[Diagram: latency-sensitive services run in groups backed by SSD; general services run in Other Groups backed by SATA.]
9 . Multi-Tenancy: Comparison with multi-cluster deployment
• Resources can be used flexibly between groups
[Diagram: Storage (DN) and Compute (RS) capacity on Machine A, Machine B and Machine C divided between Group A and Group B, with 80% and 20% shares.]
10 . Multi-Tenancy: Some Bug Fixes for RSGroup
• HBASE-18272 Fix issue about RSGroupBasedLoadBalancer#roundRobinAssignment where BOGUS_SERVER_NAME is involved in two groups
• HBASE-20791 RSGroupBasedLoadBalancer#setClusterMetrics should pass ClusterMetrics to its internalBalancer
11 . 02 Object Storage
• MOB Solution
• BUGFIX
12 . Object Storage: YARN Log Storage
• Background
400k applications per day
200 million log files need to be uploaded to HDFS, which puts great pressure on the NameNode (NN)
So we considered moving these log files into HBase
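13 . Object Storage: MOB Solution
• Data blocks do not take part in caching, which effectively avoids memory fragmentation
• Each memstore flush produces 2 files; only the index file is managed online
• Annotation: IO-heavy operations such as split and compaction
[Diagram: Client -> memstore -> flush -> IndexFile (Online Region) and DataFile (HDFS DIR).]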
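14 . Object Storage: Some Problems with MOB
• TTL cleanup is out of sync: index files remain after the data files have been cleaned
• Once the MOB feature is enabled, a Region can only use the default compaction policy
• Offline data is stored under a single path
• Annotation: IO-heavy operations such as split and compaction
[Diagram: Client -> memstore -> flush -> IndexFile (Online Region) and DataFile (HDFS DIR).]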
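15 . Object Storage: MOB - BUGFIX
• Index files are now cleaned before data files
• Offline files are now stored under multiple paths
• Annotation: IO-heavy operations such as split and compaction
[Diagram: Client -> memstore -> flush -> IndexFile (Online Region) and DataFile (HDFS DIR).]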
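Why the cleaning order matters can be shown with a toy model. This is a sketch, not the MOB cleaner: it assumes, purely for illustration, that an index file and its data file share one name, and it injects a hypothetical interruption between the two steps. When the index goes first, an interrupted run leaves at worst an orphaned data file, never an index that points at missing data.

```python
def clean_expired(index_files, data_files, expired, interrupt=False):
    """Remove expired files, index first, so no index outlives its data."""
    for name in sorted(expired):
        index_files.discard(name)      # step 1: drop the index file
        if interrupt:                  # hypothetical crash between the steps
            raise RuntimeError("cleaner interrupted")
        data_files.discard(name)       # step 2: only then drop the data file


def dangling_indexes(index_files, data_files):
    """Index files whose data file no longer exists."""
    return index_files - data_files
```

Reversing the two steps would leave a dangling index under the same interruption, which is the out-of-sync state described on the previous slide.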
16 . Object Storage: MOB - BUGFIX
• HBASE-19650 ExpiredMobFileCleaner has wrong logic about TTL check
• HBASE-19664 MOB should compatible with other types of Compactor in addition to DefaultCompactor
17 . 03 Large Query Isolation
18 . Large Query Isolation: Large Query Concepts
• Characteristics of a large query
(1) It is not very sensitive to latency
(2) It may occupy a Handler thread for a long time
• Large query types
(1) A Scan with no startkey or endkey (such as a full table scan)
(2) The client calls ResultScanner.next() more times than a certain threshold
(3) The client calls a custom coprocessor that involves large queries
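The three query types above can be turned into a small classifier. This is a sketch under assumptions: the `Request` fields, the helper name, and the threshold value of 100 are all hypothetical (the slide only says "a certain threshold"), and an empty row key stands for a missing startkey or endkey.

```python
from dataclasses import dataclass

# Hypothetical value; the slide does not give the actual threshold.
NEXT_CALL_THRESHOLD = 100


@dataclass
class Request:
    is_scan: bool = True
    start_row: bytes = b""           # empty means "no startkey"
    stop_row: bytes = b""            # empty means "no endkey"
    next_calls: int = 0              # ResultScanner.next() calls so far
    large_coprocessor: bool = False  # custom coprocessor doing large queries


def is_large_query(req: Request) -> bool:
    if req.large_coprocessor:                   # type (3)
        return True
    if not req.is_scan:
        return False
    if not req.start_row or not req.stop_row:   # type (1): open-ended scan
        return True
    return req.next_calls > NEXT_CALL_THRESHOLD  # type (2)
```

A bounded scan starts out as a common query and is only reclassified once its `next_calls` count passes the threshold.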
19 . Large Query Isolation: Large Query Problems - resources may be exhausted
[Diagram: Calls from the Client pass from a Producer to a Consumer and onto a shared pool of Handlers; Large Query Calls occupy Handlers that Common Query Calls also need.]
20 . Large Query Isolation: Implementation Details
[Diagram: Calls from the Client are separated by priority into Large Query Calls and general Calls, and each class is served by its own set of Handlers.]
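The idea of giving large queries their own handlers can be sketched with two thread pools. This is an illustration of the isolation principle only, not the HBase RPC scheduler; `IsolatingScheduler`, its parameters and the pool sizes are hypothetical.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable


class IsolatingScheduler:
    """Route calls to one of two handler pools so that long-running
    large queries cannot occupy the handlers used by common queries."""

    def __init__(self, general_handlers: int = 2, large_handlers: int = 1):
        self._general = ThreadPoolExecutor(max_workers=general_handlers)
        self._large = ThreadPoolExecutor(max_workers=large_handlers)

    def dispatch(self, call: Callable[[], object], is_large: bool) -> Future:
        # Each pool has its own queue, so a backlog of large queries
        # never delays a call submitted to the general pool.
        pool = self._large if is_large else self._general
        return pool.submit(call)

    def shutdown(self) -> None:
        self._general.shutdown()
        self._large.shutdown()
```

In this sketch a large query that blocks for a long time holds only a handler from the large pool, so a common query dispatched afterwards still completes promptly.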
21 . Large Query Isolation: Isolation Effect - limits how far large queries can exhaust resources
22 . Thanks