HBase at China Telecom


1. HBase at China Telecom, Chen Ze

2. Content
01 Infrastructure & Application: HBase usage at China Telecom
02 Monitor & Optimization: how we monitor and optimize HBase
03 Q&A

3. 01 Infrastructure & Application: HBase usage at China Telecom

4. Infrastructure
• Telecom open platform: cluster management, ad-hoc, ETL development, job scheduling, monitor & alert, ……
• Offline computing platform and real-time computation platform: Hive, Pig, Mahout, SQL, MLlib, Impala, Storm, Spark Streaming
• Compute and resource management: Tez, MapReduce, Spark, YARN
• Storage: HBase, Elasticsearch, Tachyon, HDFS
• Coordination and security: ZooKeeper, Kerberos
• Data exchange and data collecting: Sqoop, DataX, Flume, FTP

5. China Telecom HBase Platform
• 322 hosts in an independent HDFS cluster; each host has 32 cores, 256 GB memory, and 12 × 3.6 TB disks
• 6 HBase clusters serving different kinds of applications: persistence for streaming jobs, online writing/reading, Kylin support
• 520 TB of data, 1 TB/day
• HBase 1.2.0 on CDH 5.12.1

6. Persistence for Streaming Jobs
• Core system
• Data collecting system

7. Data collecting system
Collects different kinds of data with different collection methods.
• Sources: CRM/VSOP, DPI/signal/CDR, OSS, MSS, accounts data, …
• Data types: real-time data, quasi-real-time data, batch data
• Diagram: the collecting system sits between these sources and the core system

8. Data collecting system: OSS data collection
• Uses HBase replication to receive data from 31 provincial branches
• Diagram: HBase at the provincial branches replicates to HBase (on HDFS) at the Shanghai data center

9. Core System
• The core system uses HBase to store middle-layer and aggregation-layer data, written through Spark Streaming

10. Online writing/reading applications
• DPI data service
• Location-based data service

11. DPI data service
• What is DPI (deep packet inspection) data?

12. DPI data service
• How to use DPI data?

13. Location-based data service
• Diagram components: WEB/REST API, HBase cluster, MySQL, Spark cluster, FTP, Kafka, HDFS/Hive
• Inputs: DPI/signal/CDR data, data from base stations

14. Location-based data service

15. 02 Monitor & Optimization: how we monitor and optimize HBase

16. HBase Monitor
• Tools: Ganglia 3.6.0, Zabbix 3.2.4
• Ganglia for basic HBase metrics
• Zabbix for alerting on important items

17. HBase Ganglia configuration
• Edit the HBase config file hadoop-metrics2-hbase.properties
• Then many metric items become visible in Ganglia
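As a rough illustration of the edit this slide describes, the Python sketch below renders the kind of lines a Ganglia sink section in hadoop-metrics2-hbase.properties typically contains. The sink class GangliaSink31 and the property keys are the standard Hadoop metrics2 ones. The gmond address, the 10-second period, and the helper name ganglia_sink_properties are assumptions for illustration, not values from the talk.

```python
def ganglia_sink_properties(servers, period_seconds=10):
    """Render a Ganglia sink section for hadoop-metrics2-hbase.properties.

    servers: "host:port" of the Ganglia gmond/gmetad receiver (placeholder).
    period_seconds: how often metrics are pushed (assumed value).
    """
    lines = [
        # GangliaSink31 targets the Ganglia 3.1+ wire format.
        "*.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31",
        f"*.sink.ganglia.period={period_seconds}",
        # The "hbase" prefix applies the sink to the HBase daemons.
        f"hbase.sink.ganglia.period={period_seconds}",
        f"hbase.sink.ganglia.servers={servers}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical receiver address; replace with the real Ganglia endpoint.
print(ganglia_sink_properties("gmond.example.com:8649"))
```

The printed lines would be appended to the properties file on each HBase host, followed by a restart of the HBase daemons so the sink is picked up.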

18. Important metrics (Type: Items)
• System status: ping, available memory in percent, disk usage
• HDFS status: NameNode port, DataNode port, JournalNode port, HDFS available space
• HBase status: HBase Master port, HBase RegionServer port
• ZooKeeper status: ZooKeeper port, ZKFC port
• HBase RPC: regionserver.TotalCallTime, regionserver.ProcessCallTime, regionserver.QueueCallTime, regionserver.numActiveHandler, regionserver.ipc.numCallsInGeneralQueue, regionserver.ipc.numOpenConnections, regionserver.RegionServer.numCallsInWriteQueue, regionserver.RegionServer.numCallsInReadQueue
• HBase IO: regionserver.Server.Mutate_99th_percentile, regionserver.wal.SyncTime_99th_percentile, regionserver.server.Get_99th_percentile, regionserver.server.ScanTime_99th_percentile
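Several of the status rows above come down to "is this service port reachable?". The talk does this with Zabbix; the Python sketch below only illustrates the idea of such a port check. The host names are hypothetical and the ports are common defaults that may differ from the cluster's actual configuration.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_services(services, timeout=2.0, probe=port_is_open):
    """Map each service name to True/False depending on port reachability.

    services: dict of name -> (host, port).
    probe: injectable check function, which makes the logic easy to test.
    """
    return {
        name: probe(host, port, timeout)
        for name, (host, port) in services.items()
    }


# Hypothetical hosts; ports are common defaults, not values from the talk.
EXAMPLE_SERVICES = {
    "NameNode": ("nn1.example.com", 8020),
    "ZooKeeper": ("zk1.example.com", 2181),
}
```

Calling check_services(EXAMPLE_SERVICES) returns a dictionary such as {"NameNode": True, "ZooKeeper": False}, and any False entry is a candidate for an alert.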

19. Important metrics, continued (Type: Items)
• HBase region: region number and size, BlockCache hit ratio
• JVM GC: jvm.JvmMetrics.GcTimeMillis, jvm.JvmMetrics.GcCount, GC log
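The talk does not say how these values are turned into alerts. One plausible reading, sketched below in Python, is to derive a BlockCache hit ratio from hit and miss counters and to convert the cumulative GcTimeMillis counter into the share of wall-clock time spent in GC between two samples. The function names and sample numbers are illustrative assumptions.

```python
def block_cache_hit_ratio(hit_count, miss_count):
    """Hit ratio in [0, 1]; returns 0.0 when there have been no lookups."""
    total = hit_count + miss_count
    return hit_count / total if total else 0.0


def gc_time_fraction(prev_gc_ms, curr_gc_ms, interval_seconds):
    """Share of wall-clock time spent in GC between two samples.

    GcTimeMillis is a cumulative counter, so the difference of two
    samples is divided by the sampling interval.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return (curr_gc_ms - prev_gc_ms) / (interval_seconds * 1000.0)


# Made-up sample values for illustration only.
ratio = block_cache_hit_ratio(hit_count=900, miss_count=100)
gc_share = gc_time_fraction(prev_gc_ms=1000, curr_gc_ms=4000, interval_seconds=60)
print(f"hit ratio {ratio:.2f}, GC share {gc_share:.2%}")
```

A low hit ratio or a rising GC share over consecutive intervals would be the kind of signal worth alerting on.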

20. HBase debug case
• One day, our core data system hung when trying to connect to the HBase cluster…

21. HBase Optimization
• CMS vs G1
• Read/write splitting and other optimizations

22. CMS vs G1
• CMS to G1: why?

23. Read/Write Splitting and Optimization
• Use replication
• Use Kerberos
• Optimize hbase.ipc

24. Q&A

25. Thanks