Phoenix Best Practices at China Life Insurance Company

Yuan Li'ou (Leo Yuan), an engineer at China Life, shared Phoenix best practices at China Life Insurance Company.
China Life currently runs more than 200 nodes in total, of which the Phoenix cluster accounts for 30+. The clusters hold roughly 1,300 TB of data overall, the largest single HBase table is 30 TB, and on the order of a hundred scripts run every day.

Phoenix's application scenario at China Life: data originates in the core transaction systems, is replicated via SharePlex into Kafka, and is ingested from Kafka into the Phoenix cluster in real time; a query service then serves customer-benefit information to the mobile app. In the physical architecture, the Phoenix cluster sits at the bottom, with two links above it: one through the Phoenix Gateway, the other through the real-time query service, which is load-balanced onto a WebLogic cluster.

Yuan Li'ou described the Spark Streaming design:

(1) Control fields are added to the integrated tables, recording the update time and whether the operation was an insert or a delete.

(2) The real-time sync programs are partitioned by table name or by a statistics field.

Yuan Li'ou then covered Phoenix optimizations: the Phoenix system tables are placed in one RegionServer group and the data tables in another. Clients pull table metadata once a day, after which they no longer need to query the Phoenix system tables, which lowers the load on them. HBase-level optimizations include:

  • Region Balance By Table
  • G1GC
  • Manual MajorCompaction
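The daily metadata pull mentioned above behaves like a client-side cache with a 24-hour TTL (Phoenix exposes this knob as UPDATE_CACHE_FREQUENCY, shown later in the slides). A minimal sketch of the idea, with hypothetical names:

```python
import time

class MetadataCache:
    """Client-side table-metadata cache with a TTL, mimicking Phoenix's
    UPDATE_CACHE_FREQUENCY: within the TTL the client reuses cached
    metadata instead of querying SYSTEM.CATALOG again."""

    def __init__(self, fetch_fn, ttl_seconds=86400):  # 24 h, as in the talk
        self.fetch_fn = fetch_fn   # stand-in for a SYSTEM.CATALOG lookup
        self.ttl = ttl_seconds
        self.cache = {}            # table name -> (metadata, fetch_time)

    def get(self, table, now=None):
        now = time.time() if now is None else now
        hit = self.cache.get(table)
        if hit is not None and now - hit[1] < self.ttl:
            return hit[0]          # fresh: no SYSTEM.CATALOG round trip
        meta = self.fetch_fn(table)  # stale or missing: refetch once
        self.cache[table] = (meta, now)
        return meta
```

With a one-day TTL, each client touches the system tables at most once per table per day, which is what keeps the system-table RegionServer group lightly loaded.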



2. Phoenix Practice in China Life Insurance Co., Ltd. — Leo Yuan (yuanliou2015@e-chinalife.com)

3. Agenda: 1 Scenarios, 2 Designs, 3 Optimizations, 4 Problems, 5 Future Work

4. Scenarios - Overview: [diagram] the main processing cluster A processes basic application data, syncing data to cluster C and syncing results to the read-only cluster B and the real-time query cluster E.

5. Scenarios - Overview
  • Clusters: 4 clusters, 200+ nodes (30+ Phoenix nodes)
  • Data: 1300TB+ data; biggest table 30TB+
  • Processing: hundreds of MR/Hive/Spark jobs per day; 50TB+ incremental data for update & insert
  • Queries: ten-million level per day

6. Scenarios - Overview: [diagram] "This Is What We Do" — the Real Cust Rights View System. Step 1: a customer gets some money at a company counter; step 2: the customer queries the operation detail from the app.

7. Scenarios - Overview
  • 30,000,000+ incremental records/day
  • 700,000+ users/day
  • 10,000+ records/sec
  • 8,000+ SQLs/sec


9. Designs - Data Architecture: [diagram] data flows from the business system database through SharePlex into Kafka, then via Spark Streaming into Phoenix on HBase. The Cust View System initializes data, builds Phoenix indexes, syncs real-time data, and provides the data service to the China Life Insurance APP.

10. Designs - Development Architecture: [diagram] layers, top to bottom:
  • Ultimate Real Time Query Service: query distribution engine, privilege control (user access control), schedule (resource schedule), monitor
  • Data View (4+3 each): V_pay, V_contract, V_cust service, and V_claim tables
  • Data Integration: cross-province integration, dimension data, resource access control, job schedule, data exchange
  • Data Sync: initial data sync program, incremental data sync program, monitor
  • Data Source: business system database, real-time data link

11. Designs - Physical Architecture: [diagram] the APP server reaches an F5 load balancer in front of three Weblogic servers; behind them sit the Phoenix Gateway, a configuration DB, and the Phoenix cluster.

12. Designs - Real Time Data Sync
Sample integrated table with control columns:

  cid  c_no  type  amount  branch  syssource  updtime               incr_flag  commit_time           …
  001  001   M     100     000000  V6         2019-06-11 16:37.322  1          2019-06-11 16:37.322  …
  002  002   R     100     000001  V7         2019-06-11 16:38.689  2          2019-06-11 16:38.689  …

Design points:
1. Partition by global primary key.
2. Shield downstream consumers from upstream table-structure adjustments.
3. No effect on the normal stream process during data supplement (a separate SparkStreaming_compt job handles it alongside the main SparkStreaming job).

Regional source tables such as Contract(BeiJing) and Contract(ShangHai) (cid, c_no, type, amount, …) are merged through Kafka and Spark Streaming into the integrated table.
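The "partition by global primary key" point can be sketched as follows: hashing each record's global primary key to a Kafka partition guarantees that all changes to one row land in the same partition, so the streaming job sees them in order. The names and partition count here are illustrative, not the actual sync program:

```python
import hashlib

NUM_PARTITIONS = 12  # illustrative partition count, not from the talk

def partition_for(primary_key, num_partitions=NUM_PARTITIONS):
    """Map a global primary key (e.g. cid) to a stable Kafka partition.

    A stable hash (not Python's per-process-randomized hash()) ensures
    every change record for the same row goes to the same partition,
    so its inserts/updates/deletes are consumed in commit order."""
    digest = hashlib.md5(primary_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions
```

Because ordering is only guaranteed within a partition, any re-partitioning scheme that splits one key across partitions would break the incr_flag/commit_time sequencing shown in the table above.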

13. Designs - Ultimate Real Time Query Service: the Chinalife Insurance APP calls the service, which analyzes the request parameters against label.properties, packages the SQLs, executes them, and collects the results. Sample label definitions:

  label name     label type  isHolder  logic
  paidBonus      sum         holder    sql1,2,3
  paidMoneyList  list        holder    sql4,5
  paidExpire     sum         insured   sql6
  …              …           …         …


15. Optimizations - Sql Execution Process
Phoenix system tables involved: SYSTEM.CATALOG, SYSTEM.STATS, SYSTEM.LOG, SYSTEM.SEQUENCE.
  • DriverManager.getConnection("phoenixUrl") / con.prepareStatement(sql): 1. query the table/index metadata from the Phoenix server (SYSTEM.CATALOG); 2. determine which table or index the SQL needs to scan.
  • pstat.executeQuery(): 3. query the table/index statistics from the Phoenix server (SYSTEM.STATS); 4. generate scans based on the statistics, metadata, and SQL.
  • rs.next(): 1. parallelism is decided by phoenix.query.threadPoolSize.
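Step 4 ("generate scans based on statistics information") can be illustrated in miniature: the guideposts stored in SYSTEM.STATS split a query's key range into chunks that are scanned in parallel. A simplified sketch under the assumption that keys are plain strings (real Phoenix guideposts are row-key byte arrays):

```python
def split_scan(start_key, end_key, guideposts):
    """Split the range [start_key, end_key) into parallel scan chunks
    at every guidepost that falls strictly inside the range."""
    inner = [g for g in sorted(guideposts) if start_key < g < end_key]
    bounds = [start_key] + inner + [end_key]
    # Each adjacent pair of bounds becomes one scan, run in parallel
    # by the client thread pool (phoenix.query.threadPoolSize).
    return list(zip(bounds[:-1], bounds[1:]))
```

Denser guideposts (smaller GUIDE_POSTS_WIDTH) mean more, smaller chunks and higher parallelism, at the cost of larger statistics.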

16. Optimizations – Phoenix System Table
SYSTEM.CATALOG describes table/index meta information, such as:
  • TABLE_NAME
  • COLUMN_NAME
  • SALT_BUCKETS
  • UPDATE_CACHE_FREQUENCY
  • GUIDE_POST_WIDTH
SYSTEM.STATS describes accurate table/index statistics, such as:
  • GUIDE_POST_KEY
  • GUIDE_POSTS_WIDTH
PS: refresh the statistics with UPDATE STATISTICS TABLE_NAME.

17. Optimizations - RS Group
  • Metadata-table isolation decreases the impact of business-table queries.
[diagram] Under one HMaster, the system tables (SYSTEM:CATALOG, SYSTEM:STATS, SYSTEM:MUTEX) are placed in RS Group 1 (rs1, rs2, …), while the business tables (INCOME, CONTRACT, PERSON, …) are placed in RS Group 2 (rs3, rs4, …), instead of mixing system and business tables on the same region servers.

18. Optimizations - UPDATE_CACHE_FREQUENCY
  • Adjust this parameter to decrease hotspotting on SYSTEM.CATALOG.
  • It decides the query frequency of SYSTEM.CATALOG.
  • The default value is "Always", which causes read/write pressure on SYSTEM.CATALOG.
  • It can be set per cluster or per table:
1. "phoenix.default.update.cache.frequency": 86412345
2. create table test.test (a varchar not null primary key, b varchar) SALT_BUCKETS = 10, UPDATE_CACHE_FREQUENCY = 86400000;

19. Optimizations – Salt & Pre-Split
  • Data tables use SALT_BUCKETS; index tables use pre-split.
Data table (write path): CREATE TABLE … ( … ) SALT_BUCKETS = 60 — use a salted table to distribute data evenly.
Index table (query path): CREATE INDEX … ON … ( … ) INCLUDE ( … ) ASYNC SALT_BUCKETS = 0 SPLIT ON ( … ):
1. Create the index so that queries stop at the index table.
2. Use pre-split (SALT_BUCKETS = 0 with SPLIT ON) to reduce the chunk number when executing a query.
3. Use an async index build to avoid OOM when building the index table.

20. Optimizations - Open Offheap
  • Minimal read cost to improve query efficiency.
BucketCache configuration properties:
  • hbase.bucketcache.combinedcache.enabled
  • hbase.bucketcache.ioengine
  • hfile.block.cache.size
  • hbase.bucketcache.size
  • hbase.bucketcache.bucket.sizes
  • -XX:MaxDirectMemorySize
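As a sketch only, an hbase-site.xml fragment wiring the core properties above together; the values are illustrative assumptions, not the settings used in the talk:

```xml
<!-- hbase-site.xml: enable an off-heap BucketCache (illustrative values) -->
<property>
  <name>hbase.bucketcache.ioengine</name>
  <value>offheap</value>
</property>
<property>
  <name>hbase.bucketcache.size</name>
  <!-- off-heap cache size in MB -->
  <value>8192</value>
</property>
<property>
  <name>hfile.block.cache.size</name>
  <!-- on-heap share kept for index/bloom blocks -->
  <value>0.2</value>
</property>
```

-XX:MaxDirectMemorySize must also be raised in the region server's JVM options so that direct memory covers the bucket cache.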

21. Optimizations - Other Configurations
  • Region Balance By Table
  • G1GC
  • Manual MajorCompaction


23. Problems - The current-time function causes query performance degradation
  • CURRENT_TIME() leads to frequent client–cluster interaction.
Fix: compute the timestamp in Java code and substitute it for the function:
① SELECT … FROM … WHERE date <= CURRENT_TIME()
② replace CURRENT_TIME with NOW_DATE in Java code
③ SELECT … FROM … WHERE date <= NOW_DATE
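The rewrite above can be sketched as: compute the timestamp once on the client and splice it in as a literal (or, better, bind it as a statement parameter), so the query no longer evaluates CURRENT_TIME() against the cluster. A hypothetical illustration of the NOW_DATE substitution:

```python
from datetime import datetime

def bind_now_date(sql_template, now=None):
    """Replace the NOW_DATE placeholder with a client-computed timestamp
    literal, avoiding server-side CURRENT_TIME() round trips.
    (Illustrative; in JDBC you would bind a prepared-statement
    parameter rather than format a literal into the SQL.)"""
    now = now or datetime.now()
    literal = "TO_DATE('%s')" % now.strftime("%Y-%m-%d %H:%M:%S")
    return sql_template.replace("NOW_DATE", literal)
```

The trade-off is that the timestamp is frozen at statement-build time, which is acceptable for "as of now" range predicates like the one on the slide.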

24. Problems - HBase cluster balance abnormal
Problems:
1. The balancer (org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer) is abnormal.
2. balance_rsgroup 'default' is abnormal.
Resolution process:
1. Turn on RSGroup.
2. Set two RSGroups: default and my_group.
3. Move region servers into default and my_group.
4. Restart one region server in default.

25. Problems - ACL abnormal when combined with RSGroup
Process: 1. Turn on RSGroup. 2. Turn on ACL.
Problem: a non-hbase user can't create tables.
Resolution: 1. Use the hbase user to create the table. 2. Grant 'RWX' on the table to the non-hbase user.
<property>
  <name>hbase.coprocessor.master.classes</name>
  <value>org.apache.hadoop.hbase.security.access.AccessController,org.apache.hadoop.hbase.rsgroup.RSGroupAdminEndpoint</value>
</property>


27. Future Work - RPC Read/Write Isolation
total queue = hbase.ipc.server.callqueue.handler.factor * handler count
read queue = total queue * hbase.ipc.server.callqueue.read.ratio
write queue = total queue * (1 - hbase.ipc.server.callqueue.read.ratio)
scan queue = total queue * hbase.ipc.server.callqueue.read.ratio * hbase.ipc.server.callqueue.scan.ratio
[diagram: the total queues split into read, scan, and write queues]
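The queue sizing above is plain arithmetic; a small sketch following the slide's formulas, where the scan queues are carved out of the read queues (the example values are assumptions, not the talk's production settings):

```python
import math

def rpc_queues(handler_count, handler_factor, read_ratio, scan_ratio):
    """Compute HBase call-queue counts from the callqueue ratios."""
    total = math.ceil(handler_count * handler_factor)  # hbase.ipc.server.callqueue.handler.factor
    read = math.floor(total * read_ratio)              # ...callqueue.read.ratio
    write = total - read                               # remainder serves writes
    scan = math.floor(read * scan_ratio)               # ...callqueue.scan.ratio
    return {"total": total, "read": read - scan, "scan": scan, "write": write}
```

For example, 30 handlers with factor 0.2, read ratio 0.5, and scan ratio 0.5 yield 6 queues: 2 read, 1 scan, 3 write, so long scans can no longer starve short reads or writes.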

28. Future Work - Compaction Control
Open the compaction controller:
  • key: hbase.regionserver.throughput.controller
    value: org.apache.hadoop.hbase.regionserver.compactions.PressureAwareCompactionThroughputController
Set off-peak time:
  • hbase.offpeak.start.hour
  • hbase.offpeak.end.hour
  • hbase.hstore.compaction.throughput.offpeak
Set peak-time throughput:
  • hbase.hstore.compaction.throughput.higher.bound
  • hbase.hstore.compaction.throughput.lower.bound
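As a sketch only, an hbase-site.xml fragment combining these keys; the window hours are illustrative assumptions:

```xml
<!-- hbase-site.xml: throttle compactions outside the off-peak window -->
<property>
  <name>hbase.regionserver.throughput.controller</name>
  <value>org.apache.hadoop.hbase.regionserver.compactions.PressureAwareCompactionThroughputController</value>
</property>
<property>
  <name>hbase.offpeak.start.hour</name>
  <!-- illustrative off-peak window: 00:00-06:00 -->
  <value>0</value>
</property>
<property>
  <name>hbase.offpeak.end.hour</name>
  <value>6</value>
</property>
```

During the off-peak window compactions may run at the higher offpeak throughput, while during business hours they stay between the lower and higher bounds.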

29. Future Work - Join Optimization
EXPLAIN
SELECT … FROM a1
JOIN (SELECT … FROM b1 WHERE …) a2
  ON a1.… = a2.…
WHERE …