Building HBase on Kubernetes: Practice at Zhihu
China HBase Technical Community

1 .Building Zhihu’s Online HBase Cluster on Kubernetes

2 .Agenda • HBase at Zhihu • Using Kubernetes • HBase Online Platform

3 .• HBase at Zhihu • Using Kubernetes • HBase Online Platform

4 .HBase at Zhihu • Offline • Physical machines, more than 200 nodes. • Works with Spark/Hadoop. • Online • Based on Kubernetes, more than 300 containers.

5 .Our online storage • MySQL • used by most businesses • some need to scale, some need to transform • all-SSD, expensive • Redis • cache and partial storage • no sharding • expensive • HBase / Cassandra / RocksDB etc.?

6 .At the beginning • All businesses on one big cluster • Also runs NodeManager and ImpalaServer • Basic operations only • Monitoring at the physical node level

7 .What we want • From the business side • environment isolation • SLA definition • business-level monitoring • From the operations side • balanced resources (CPU, I/O, RAM) • friendly API • controllable costs

8 .In a word: offer HBase as a Service.

9 .• HBase at Zhihu • Using Kubernetes • HBase Online Platform

10 .Zhihu’s Unified Cluster Management Platform

11 .Kubernetes • Cluster resource manager and scheduler • Uses containers to isolate resources • Application management • Well-developed API and active community

12 .Failover Design • Component Level • Cluster Level • Data Replication

13 .Component Level • HMaster -> uses ZooKeeper • RegionServer -> stateless by design • ThriftServer -> uses a proxy • HFile -> ???

14 .Component Level - HFile • Shared HDFS Cluster • Keep the whole cluster stateless

15 .Cluster Level • What if the cluster is down? • Component -> Kubernetes ReplicationSet • What if Kubernetes is down? • Mixed deployment • A few physical nodes with high CPU && RAM

16 .Data Replication • Replication within a cluster • HDFS built-in (3 replicas) • Replication between clusters • snapshot + bulk load • HBase replication • Offline cluster runs MR / Spark

17 .• HBase at Zhihu • Using Kubernetes • HBase Online Platform

18 .Physical Node Resource • CPU: 2 * 12 core • Memory: 128 G • Disk: 4 T

19 .Resource Definition (1) • Minimize the resource per container • Business scaled by number of containers • Pros • less resource wasted per node • simpler debugging • Cons • minimum resource is hard to define per business • hard to tune RAM and GC params

20 .Resource Definition (2) • Customize container resource by business • Business scaled by number of containers • Pros • flexible RAM config and tuning ( especially non-heap size ) • used in production
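
As a rough illustration of how the node size on slide 18 bounds the number of containers per node, the sketch below divides a node's CPU and memory by a per-container request. The two container sizes and the reserved headroom are hypothetical values chosen for the example, not figures from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeCapacity:
    cores: int       # slide 18: 2 * 12 cores
    memory_gb: int   # slide 18: 128 G


@dataclass(frozen=True)
class ContainerRequest:
    cores: int       # hypothetical per-business CPU request
    memory_gb: int   # hypothetical memory request (heap + non-heap)


def containers_per_node(node, request, reserved_cores=2, reserved_memory_gb=8):
    """Return how many containers of one size fit on a single node.

    The reserved_* defaults are assumed headroom for the OS and node agents.
    """
    usable_cores = node.cores - reserved_cores
    usable_memory = node.memory_gb - reserved_memory_gb
    if usable_cores <= 0 or usable_memory <= 0:
        return 0
    # The scarcer of the two resources decides the count.
    return min(usable_cores // request.cores, usable_memory // request.memory_gb)


node = NodeCapacity(cores=24, memory_gb=128)
small = ContainerRequest(cores=2, memory_gb=8)    # "minimized" style container
large = ContainerRequest(cores=8, memory_gb=32)   # "customized" style container
```

With these assumed sizes the small container is CPU-bound on the node and the large one leaves memory unused, which is the kind of per-node waste the two resource definitions trade off.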

21 .Container Configuration • Params injected into the container via ENV • XML config added to the container • start-env.sh initializes the configuration • Modifying params while the cluster is running is permitted
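
One way to picture the ENV-to-XML step is a small start-up helper that turns environment variables into a Hadoop-style configuration file. This is a minimal sketch, not the talk's start-env.sh: the `HBASE_CONF_` prefix and the rule that underscores become dots are hypothetical conventions invented for the example.

```python
import os
import xml.etree.ElementTree as ET

ENV_PREFIX = "HBASE_CONF_"  # hypothetical naming convention, not from the talk


def properties_from_env(environ):
    """Map HBASE_CONF_hbase_zookeeper_quorum=... to hbase.zookeeper.quorum=..."""
    props = {}
    for key, value in environ.items():
        if key.startswith(ENV_PREFIX):
            # Assumed rule: underscores after the prefix stand for dots.
            name = key[len(ENV_PREFIX):].replace("_", ".")
            props[name] = value
    return props


def render_site_xml(props):
    """Render properties in the <configuration>/<property> layout of hbase-site.xml."""
    root = ET.Element("configuration")
    for name in sorted(props):
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = props[name]
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # At container start, the result would be written to the HBase conf directory.
    print(render_site_xml(properties_from_env(os.environ)))
```

Because the file is regenerated from the environment on every start, changing a parameter amounts to updating the container's ENV and restarting it.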

22 .RegionServer G1GC ( thanks Xiaomi ) -XX:+UnlockExperimentalVMOptions -XX:MaxGCPauseMillis=50 -XX:G1NewSizePercent=5 -XX:InitiatingHeapOccupancyPercent=45 -XX:+ParallelRefProcEnabled -XX:ConcGCThreads=2 -XX:ParallelGCThreads=8 -XX:MaxTenuringThreshold=15 -XX:G1OldCSetRegionThresholdPercent=10 -XX:G1MixedGCCountTarget=16 -XX:MaxDirectMemorySize=256M
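
The flags above could be assembled per container roughly as follows. This is a sketch under stated assumptions: passing `-XX:+UseG1GC` explicitly, deriving the heap as container memory minus direct memory minus a fixed overhead, and setting `-Xms` equal to `-Xmx` are all choices made for the example and are not stated in the talk.

```python
# G1 tuning flags copied from slide 22 (MaxDirectMemorySize is added separately).
SLIDE_FLAGS = [
    "-XX:+UnlockExperimentalVMOptions",
    "-XX:MaxGCPauseMillis=50",
    "-XX:G1NewSizePercent=5",
    "-XX:InitiatingHeapOccupancyPercent=45",
    "-XX:+ParallelRefProcEnabled",
    "-XX:ConcGCThreads=2",
    "-XX:ParallelGCThreads=8",
    "-XX:MaxTenuringThreshold=15",
    "-XX:G1OldCSetRegionThresholdPercent=10",
    "-XX:G1MixedGCCountTarget=16",
]


def regionserver_opts(container_memory_mb, direct_memory_mb=256, overhead_mb=512):
    """Build a JVM option string for one RegionServer container.

    overhead_mb is an assumed allowance for metaspace, thread stacks, etc.
    """
    heap_mb = container_memory_mb - direct_memory_mb - overhead_mb
    if heap_mb <= 0:
        raise ValueError("container is too small for the requested off-heap sizes")
    flags = ["-XX:+UseG1GC", f"-Xms{heap_mb}m", f"-Xmx{heap_mb}m"]
    # UnlockExperimentalVMOptions stays ahead of the experimental G1 flags.
    flags += SLIDE_FLAGS
    flags.append(f"-XX:MaxDirectMemorySize={direct_memory_mb}m")
    return " ".join(flags)
```

For an 8 GiB container, `regionserver_opts(8192)` yields a 7424 MB heap alongside the 256 MB direct-memory cap from the slide.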

23 .Network • Dedicated IP per container • Automatic DNS register/deregister • Modified /etc/hosts for each pod

24 .Manage Cluster • Platform controls the cluster • Kubernetes schedules resources • Shared HDFS and ZK cluster • Cons: • full scans still impact the whole cluster • no data locality && no short-circuit reads at all

25 .Client Design • For Java/Scala • native HBase client • only the ZK address is offered to the business • For Python • happybase • client proxy • service discovery
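
For the Python path, a client talking to a Thrift server through the proxy might look like the sketch below. The host, port, table, and column family passed in are hypothetical placeholders; only the happybase calls themselves (`Connection`, `table`, `put`, `row`, `close`) are the library's real API.

```python
def encode_cells(family, values):
    """Turn {"name": "x"} into {b"cf:name": b"x"}, the byte-keyed form happybase expects."""
    return {
        f"{family}:{qualifier}".encode(): str(value).encode()
        for qualifier, value in values.items()
    }


def write_and_read(host, port, table_name, row_key, family, values):
    """Write one row through the Thrift proxy and read it back."""
    # Imported lazily so encode_cells can be used without happybase installed.
    import happybase

    # host is assumed to be the proxy address found via service discovery.
    connection = happybase.Connection(host=host, port=port)
    try:
        table = connection.table(table_name)
        table.put(row_key.encode(), encode_cells(family, values))
        return table.row(row_key.encode())
    finally:
        connection.close()
```

A call such as `write_and_read("hbase-proxy.example.internal", 9090, "demo", "row1", "cf", {"name": "zhihu"})` would need a reachable Thrift server; the address here is made up.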

26 .API Server • Bridge between Kubernetes and business users • Encapsulates the components of an HBase cluster • RESTful API • Friendly interface
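
To make the "bridge" idea concrete, the sketch below expands one business-facing cluster description into per-component Kubernetes manifests, which is roughly what such an API server would do behind its RESTful interface. Everything specific here is an assumption for illustration: the `ClusterSpec` fields, the image name, the fixed replica counts for masters and Thrift servers, and the choice of an `apps/v1` ReplicaSet as the workload kind.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterSpec:
    """Hypothetical business-facing description of one HBase cluster."""
    name: str
    regionservers: int
    cpu: str
    memory: str
    image: str = "example/hbase:latest"  # placeholder image name


def component_manifest(spec, component, replicas):
    """Build a ReplicaSet manifest (as a dict) for one HBase component."""
    labels = {"hbase-cluster": spec.name, "component": component}
    container = {
        "name": component,
        "image": spec.image,
        "env": [{"name": "HBASE_COMPONENT", "value": component}],
        "resources": {"limits": {"cpu": spec.cpu, "memory": spec.memory}},
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "ReplicaSet",
        "metadata": {"name": f"{spec.name}-{component}", "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [container]},
            },
        },
    }


def cluster_manifests(spec):
    """Expand one cluster spec into manifests for its three components."""
    return [
        component_manifest(spec, "master", 2),            # assumed count
        component_manifest(spec, "regionserver", spec.regionservers),
        component_manifest(spec, "thriftserver", 2),      # assumed count
    ]
```

Keeping the business input to a handful of fields is what lets the platform hide Kubernetes details while still scaling a cluster by container count.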

27 .Monitor Cluster • Physical node level • node CPU load && usage (via IT) • Cluster level • pod CPU load (via Kubernetes) • read && write rate, P95, cacheHit (via JMX) • Table level • client write speed && read latency (via tracing) • thrift server (via JMX) • proxy concurrency (via DNS/haproxy monitor)
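
The cluster-level JMX metrics can be collected by parsing the JSON that a RegionServer's JMX servlet returns. The sketch below works on an already-fetched payload; the bean name and attribute names it looks for are assumptions that can differ between HBase versions, so treat them as placeholders to verify against a real server.

```python
import json

# Assumed bean name; check it against the target HBase version.
SERVER_BEAN = "Hadoop:service=HBase,name=RegionServer,sub=Server"


def summarize_regionserver(jmx_json):
    """Extract request counters and the block cache hit ratio from a JMX JSON dump."""
    payload = json.loads(jmx_json)
    bean = next(
        (b for b in payload.get("beans", []) if b.get("name") == SERVER_BEAN),
        None,
    )
    if bean is None:
        raise ValueError("RegionServer server bean not found in payload")
    hits = bean.get("blockCacheHitCount", 0)
    misses = bean.get("blockCacheMissCount", 0)
    total = hits + misses
    return {
        "read_requests": bean.get("readRequestCount", 0),
        "write_requests": bean.get("writeRequestCount", 0),
        # None when the cache has not been touched yet.
        "cache_hit_ratio": hits / total if total else None,
    }
```

The request counters are cumulative, so a read or write rate would come from the difference between two samples divided by the sampling interval.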

28 .Current Situation • 10 online businesses on the platform • More than 300 containers • 100% SLA

29 .Benefits • Easy • Isolated • Flexible
