Building HBase on Kubernetes at Zhihu


1.Building Online HBase Cluster of Zhihu Based on Kubernetes

2.Agenda • HBase at Zhihu • Using Kubernetes • HBase Online Platform

3.• HBase at Zhihu • Using Kubernetes • HBase Online Platform

4.HBase at Zhihu • Offline • Physical machines, more than 200 nodes. • Runs alongside Spark/Hadoop. • Online • Based on Kubernetes, more than 300 containers.

5.Our online storage • MySQL • used by most businesses • some need scaling, some need transformation • all-SSD storage is expensive • Redis • cache and partial storage • no sharding • expensive • HBase / Cassandra / RocksDB etc.?

6.At the beginning • All businesses on one big cluster • Cluster also runs NodeManager and ImpalaServer • Mostly manual operations • Physical-node-level monitoring only

7.What we want • From the business perspective • environment isolation • SLA definitions • business-level monitoring • From the operations perspective • balanced resources (CPU, I/O, RAM) • a friendly API • controllable costs

8.In one word: HBase as a Service.

9.• HBase at Zhihu • Using Kubernetes • HBase Online Platform

10.Zhihu’s Unified Cluster Management Platform

11. Kubernetes • Cluster resource manager and scheduler • Uses containers to isolate resources • Application management • Rich API and an active community

12. Failover Design • Component Level • Cluster Level • Data Replication

13.Component Level • HMaster -> uses ZooKeeper for failover • RegionServer -> stateless by design • ThriftServer -> behind a proxy • HFile -> ???

14.Component Level - HFile • Shared HDFS cluster • Keeps the whole HBase cluster stateless

15.Cluster Level • What if a cluster goes down? • Components -> Kubernetes ReplicaSet • What if Kubernetes goes down? • Mixed deployment • A few physical nodes with high CPU and RAM

16.Data Replication • Replication within a cluster • HDFS built-in (3 replicas) • Replication between clusters • snapshot + bulk load • HBase replication • Offline cluster runs MR / Spark jobs
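The snapshot + bulk load path between clusters can be scripted. A minimal sketch of building the relevant commands — the `snapshot` shell statement and the `ExportSnapshot` MR job are standard HBase tooling, but the destination path shown in the comment is hypothetical:

```python
def snapshot_shell_stmt(table, snapshot):
    """Build the `hbase shell` statement that snapshots a table."""
    return f"snapshot '{table}', '{snapshot}'"

def export_snapshot_cmd(snapshot, dest_root, mappers=16):
    """Build the ExportSnapshot job command that copies a snapshot's
    HFiles from the online cluster's HDFS to another cluster's HDFS,
    where it can then be cloned or bulk loaded."""
    return [
        "hbase", "org.apache.hadoop.hbase.snapshot.ExportSnapshot",
        "-snapshot", snapshot,
        "-copy-to", dest_root,  # e.g. hdfs://offline-nn:8020/hbase (hypothetical)
        "-mappers", str(mappers),
    ]
```

In practice these would be run with `subprocess` from the platform's replication job; the sketch only assembles the commands.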

17.• HBase at Zhihu • Using Kubernetes • HBase Online Platform

18.Physical Node Resources • CPU: 2 * 12 cores • Memory: 128 GB • Disk: 4 TB

19.Resource Definition (1) • Minimal per-container resources • Businesses scale by number of containers • Pros • reduces wasted resources per node • simplifies debugging • Cons • the minimum resource is hard for a business to define • hard to tune RAM and GC parameters

20.Resource Definition (2) • Container resources customized per business • Businesses scale by number of containers • Pros • flexible RAM configuration and tuning (especially non-heap sizes) • this is what we use in production
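Per-business sizing like this usually means deriving the JVM heap and direct-memory budget from the container's memory limit. A minimal sketch — the tier names, values, and `heap_fraction` default are hypothetical, not the deck's actual numbers:

```python
# Hypothetical per-business tiers; in production these would come
# from the platform's API, not a hard-coded table.
TIERS = {
    "small":  {"cpu": 2, "memory_gb": 8},
    "medium": {"cpu": 4, "memory_gb": 16},
    "large":  {"cpu": 8, "memory_gb": 32},
}

def regionserver_resources(tier, heap_fraction=0.5, direct_mb=256):
    """Derive JVM sizing from the container's memory limit, leaving
    headroom for non-heap usage (direct buffers, metaspace, stacks)."""
    spec = TIERS[tier]
    limit_mb = spec["memory_gb"] * 1024
    return {
        "cpu": spec["cpu"],
        "memory_limit_mb": limit_mb,
        "heap_mb": int(limit_mb * heap_fraction),
        "max_direct_memory_mb": direct_mb,  # matches -XX:MaxDirectMemorySize
    }
```

Keeping heap well below the container limit is what makes the "flexible RAM config" point workable: the non-heap slack is explicit instead of causing OOM kills.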

21.Container Configuration • Parameters injected into the container via ENV • xml config files added to the container • start-env.sh initializes the configuration • Parameters can be modified while the cluster is running
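The ENV-to-xml step that a script like start-env.sh performs can be sketched as follows. The `HBASE_CONF_` prefix and the underscore-to-dot mapping are a common container convention assumed here, not necessarily Zhihu's exact scheme (and the mapping breaks for property names that themselves contain underscores):

```python
import os
from xml.sax.saxutils import escape

def render_hbase_site(env=None, prefix="HBASE_CONF_"):
    """Turn injected ENV vars like HBASE_CONF_hbase_zookeeper_quorum
    into hbase-site.xml <property> entries (underscores map to dots)."""
    env = os.environ if env is None else env
    props = []
    for key, value in sorted(env.items()):
        if not key.startswith(prefix):
            continue
        name = key[len(prefix):].replace("_", ".")
        props.append(
            "  <property>\n"
            f"    <name>{escape(name)}</name>\n"
            f"    <value>{escape(value)}</value>\n"
            "  </property>"
        )
    return "<configuration>\n" + "\n".join(props) + "\n</configuration>"
```

Because the config is regenerated from ENV at startup, changing a parameter is just updating the pod spec and restarting the container — which is what makes live parameter changes manageable.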

22.RegionServer G1GC settings (thanks to Xiaomi): -XX:+UnlockExperimentalVMOptions -XX:MaxGCPauseMillis=50 -XX:G1NewSizePercent=5 -XX:InitiatingHeapOccupancyPercent=45 -XX:+ParallelRefProcEnabled -XX:ConcGCThreads=2 -XX:ParallelGCThreads=8 -XX:MaxTenuringThreshold=15 -XX:G1OldCSetRegionThresholdPercent=10 -XX:G1MixedGCCountTarget=16 -XX:MaxDirectMemorySize=256M

23.Network • A dedicated IP per container • DNS registration/deregistration is automatic • /etc/hosts modified for each pod

24.Managing the Cluster • The platform controls the cluster • Kubernetes schedules resources • Shared HDFS and ZooKeeper clusters • Cons: • full scans still impact the whole cluster • no data locality and no short-circuit reads

25.Client Design • For Java/Scala • native HBase client • only the ZooKeeper address is given to the business • For Python • happybase • client proxy • service discovery
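For the Python path, the proxy + service discovery step boils down to resolving a ThriftServer endpoint before opening a happybase connection. A minimal sketch — the endpoint list and its source are hypothetical stand-ins for the DNS-based discovery described above:

```python
import random

def discover_thrift_proxy(endpoints):
    """Pick one ThriftServer proxy endpoint for the cluster.
    In production `endpoints` would come from DNS / service
    discovery; here it is passed in directly (values hypothetical)."""
    if not endpoints:
        raise RuntimeError("no thrift proxy registered for this cluster")
    host, port = random.choice(endpoints)
    return host, port

# Business code would then connect (requires the happybase package):
#   host, port = discover_thrift_proxy(endpoints)
#   conn = happybase.Connection(host, port)
```

Random choice gives crude load spreading across proxies; a real client would also retry against another endpoint on connection failure.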

26.API Server • Bridge between Kubernetes and business users • Encapsulates the components of an HBase cluster • RESTful API • Friendly interface

27. Monitoring the Cluster • Physical node level • node CPU load and usage (via IT systems) • Cluster level • pod CPU load (via Kubernetes) • read/write rate, P95 latency, cache hit ratio (via JMX) • Table level • client write speed and read latency (via tracing) • ThriftServer metrics (via JMX) • proxy concurrency (via DNS/HAProxy monitoring)
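The JMX-based metrics can be scraped from the JSON `/jmx` endpoint that HBase daemons expose on their info port. A minimal parser sketch — the bean and attribute names in the sample are plausible RegionServer metrics but should be checked against your HBase version:

```python
import json

def jmx_metrics(jmx_json, bean_name, attrs):
    """Extract selected attributes of one MBean from a /jmx JSON
    payload (a {"beans": [...]} document)."""
    doc = json.loads(jmx_json)
    for bean in doc.get("beans", []):
        if bean.get("name") == bean_name:
            return {a: bean.get(a) for a in attrs}
    return {}

# Trimmed-down example payload; real /jmx output has many more beans.
SAMPLE = json.dumps({"beans": [{
    "name": "Hadoop:service=HBase,name=RegionServer,sub=Server",
    "readRequestCount": 1200,
    "writeRequestCount": 300,
}]})
```

A collector would fetch `http://<pod-ip>:<info-port>/jmx` per RegionServer pod and feed the extracted values into the cluster-level dashboards.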

28.Current Situation • 10 online businesses on the platform • More than 300 containers • 100% SLA

29. Benefits • Easy • Isolated • Flexible

To give HBase practitioners and enthusiasts a community for freely exchanging HBase-related technology, HBase engineers from Alibaba, Xiaomi, Huawei, NetEase, JD.com, Didi, Zhihu and other companies jointly founded the China HBase Technology Community.