20_01 Apache Cassandra Sidecar
Introduces the Sidecar design pattern for Cassandra, which makes the database easier to operate.
1 .Apache C* Sidecar: let’s make C* attractive and easy to operate. Vinay Chella, Dinesh Joshi
2 .Agenda ● Operating C* ● Operating C* with a sidecar ● State of community sidecars ● Lessons learned from operating with sidecars ● Goals of C* management sidecar
3 .Operating C* ● Bootstrap and data movement ● Configuration (files, jmx) ● Maintenance ● Monitoring/Metrics ● Backup/Restore ● Repair
4 .Operating C*: Bootstrapping Create a New Cluster Add/Remove/Replace ● Seeds ● Serial or parallel? ● Token assignment ● Streaming?
5 .Operating C*: Configuration ● Probably Have to Tune a. cassandra.yaml b. topology props c. JVM options ● May Have to Tune a. Logging b. Incremental Backup c. More JVM options
6 . Operating C*: Lifecycle Rolling Restarts (Upgrades) ● Semi-complex single node procedure ● One at a time is too slow ● Token range aware restarts? What happens when Cassandra dies? Ring source @ https://v2.overleaf.com/read/zchtrzskkyjb
7 .Operating C*: Maintenance ● All the Power of JMX ● … So many possibilities a. Many work with jmxterm/jmxsh b. Many only work with Java code ● What if you want to do it on all nodes?
8 .Operating C*: Monitoring ● Many Metrics (good!) ● How to Collect Them? ○ JMX … no ○ Agent! ● Which agent ...
9 .Operating C*: Ring Health ● Cassandra ring health depends on replication ● Strategies ○ Monitor replication of keyspaces ○ Topology Aware ○ Maintenance Aware
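The "Topology Aware" and "Maintenance Aware" strategies above can be sketched as a check that runs before a node is taken down. This is a minimal sketch, assuming that the availability target is a quorum of replicas per token range (the slide does not name a consistency level); `safe_to_stop` and the ring layout are hypothetical names for illustration, not part of any sidecar's API.

```python
from typing import Dict, List, Set


def quorum(replication_factor: int) -> int:
    """Smallest majority of replicas, e.g. 2 of 3."""
    return replication_factor // 2 + 1


def safe_to_stop(candidate: str,
                 replicas_by_range: Dict[str, List[str]],
                 down: Set[str]) -> bool:
    """Return True if stopping `candidate` keeps a quorum for every range it owns.

    `replicas_by_range` maps a token-range label to the nodes holding a replica
    of that range. `down` holds nodes that are already unavailable, for example
    because they are under maintenance.
    """
    for replicas in replicas_by_range.values():
        if candidate not in replicas:
            continue  # stopping the candidate does not affect this range
        unavailable = (down | {candidate}) & set(replicas)
        alive = len(replicas) - len(unavailable)
        if alive < quorum(len(replicas)):
            return False
    return True


# Hypothetical six-node ring with replication factor 3.
ring = {
    "r1": ["n1", "n2", "n3"],
    "r2": ["n2", "n3", "n4"],
    "r3": ["n3", "n4", "n5"],
    "r4": ["n4", "n5", "n6"],
    "r5": ["n5", "n6", "n1"],
    "r6": ["n6", "n1", "n2"],
}
```

With this ring, stopping `n1` while `n4` is down is allowed because the two nodes share no token range, whereas stopping `n1` while `n2` is down is refused because range `r1` would be left with a single replica.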
10 . Operating C*: Backup/Restore The Cloud ● What even do I need to backup!? ● Restore is legitimately tricky, do you practice?
11 . Operating C*: Repair “Eventually” Consistent

                      Datacenter 1   Datacenter 2
Step                  N1  N2  N3     N4  N5  N6
1. Partial Write       0   1   0      0   0   0
2. Read Repair         0   1   1      0   0   0
3. Hints play          0   1   1      0   1   0
… Nope not enough      0   1   1      0   1   0
4. Repair              1   1   1      1   1   1
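The sequence on this slide can be replayed as a toy model. This is a simplified sketch, assuming each replica holds a single version number (0 = old value, 1 = new value); real Cassandra reconciles by write timestamp and full repair compares Merkle trees, neither of which is modeled here. The function names are hypothetical.

```python
from typing import Dict, Iterable

NODES = ["N1", "N2", "N3", "N4", "N5", "N6"]


def read_repair(state: Dict[str, int], contacted: Iterable[str]) -> None:
    """Bring only the replicas touched by one read up to the newest version seen."""
    contacted = list(contacted)
    newest = max(state[n] for n in contacted)
    for n in contacted:
        state[n] = newest


def full_repair(state: Dict[str, int]) -> None:
    """Bring every replica up to the newest version held anywhere."""
    newest = max(state.values())
    for n in state:
        state[n] = newest


def row(state: Dict[str, int]) -> list:
    """Replica versions in N1..N6 order, matching one row of the slide."""
    return [state[n] for n in NODES]


state = {n: 0 for n in NODES}

state["N2"] = 1                    # 1. partial write: only N2 receives the new value
after_write = row(state)

read_repair(state, ["N2", "N3"])   # 2. a read that happens to contact N2 and N3
after_read_repair = row(state)

state["N5"] = 1                    # 3. a stored hint is replayed to N5
after_hints = row(state)

full_repair(state)                 # 4. repair converges all six replicas
after_repair = row(state)
```

Read repair and hints each fix only the replicas they happen to reach, which is why N1, N4 and N6 stay stale until the explicit repair step.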
12 . Sidecar: Bootstrapping Automatic Seed Management using ASGs/db Automatic Instance Replacement Equation+Graph from “Cassandra Availability with Virtual Nodes” by Joey Lynch and Josh Snyder
13 .Operating C* In General
14 .Operating C* In General
15 .What is needed to Operate C*? Separate solutions for ... ● Bootstrap and data movement ● Maintenance ● Configuration (files, jmx) ● Monitoring/Metrics ● Backup/Restore ● Repair
16 .We need better tools!
17 .Community needs
18 .Current state of the art?
19 .CockroachDB
20 .CockroachDB
21 .Operating C* with Sidecar(s) Sidecar
22 .What’s a Sidecar? Sidecars Live Outside Main Daemon Scope ● Often built for a specific purpose ● Typically a different OS ... process [Diagram labels: sidecar, Cassandra, metrics-agent]
23 .Sidecar: Configuration ● Hierarchy: Environment -> Cluster -> Node ● Flat namespace that is merged to provide Priam config
24 .Sidecar: Configuration ● Hierarchy (diagram example: prod -> cass_nflx -> i-08da5d...) ● Flat namespace that is merged to provide Priam config ● Functions for defaults (e.g. based on cpu)
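The merge described on these two configuration slides can be sketched as layered dictionaries. This is a minimal sketch, assuming that more specific layers override more general ones (Environment, then Cluster, then Node) and that default functions apply only when no layer sets a key; the slides do not spell out the precedence. The property names below are hypothetical and are not real Priam properties.

```python
import os
from typing import Any, Callable, Dict, Optional

Config = Dict[str, Any]


def merge_config(environment: Config,
                 cluster: Config,
                 node: Config,
                 defaults: Optional[Dict[str, Callable[[], Any]]] = None) -> Config:
    """Merge flat key/value layers into one flat namespace.

    Computed defaults are applied first, then each layer overwrites the
    previous one, so the node layer wins over cluster, which wins over
    environment.
    """
    merged: Config = {}
    for name, compute in (defaults or {}).items():
        merged[name] = compute()
    for layer in (environment, cluster, node):
        merged.update(layer)
    return merged


# Hypothetical default derived from the host, in the spirit of "based on cpu".
computed_defaults = {
    "example.compaction.threads": lambda: max(1, (os.cpu_count() or 1) // 2),
}

env_layer = {"example.backup.hour": 2, "example.log.level": "INFO"}
cluster_layer = {"example.backup.hour": 4}
node_layer = {"example.log.level": "DEBUG"}

effective = merge_config(env_layer, cluster_layer, node_layer, computed_defaults)
```

Here `effective` takes the backup hour from the cluster layer and the log level from the node layer, while the compaction thread count comes from the computed default because no layer sets it.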
25 .Sidecar: Lifecycle ● Execute Stop Script (systemd) ● Fail Healthcheck ● Drain with timeout
26 . Sidecar: Lifecycle ● Execute Start Script (systemd) ● Ensure Health ● Pass Healthcheck Rolling Restarts (Upgrades) ● Cluster automation is now much easier What happens when Cassandra dies? ● Continuous health monitoring and supervision (OOM) ● Priam + systemd + jvmkill [1] == pretty good [1] https://github.com/airlift/jvmkill
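Once each node exposes a restart action and a healthcheck, a rolling restart reduces to a short loop. The following is a minimal sketch under that assumption; `restart` and `is_healthy` are hypothetical callables supplied by the caller (for example, wrappers around a sidecar endpoint), and nothing here reflects Priam's actual interface.

```python
import time
from typing import Callable, Iterable, List


def rolling_restart(nodes: Iterable[str],
                    restart: Callable[[str], None],
                    is_healthy: Callable[[str], bool],
                    timeout_s: float = 300.0,
                    poll_s: float = 5.0,
                    sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """Restart nodes one at a time, waiting for each to pass its healthcheck.

    Stops at the first node that does not become healthy within `timeout_s`,
    so a bad rollout never touches the remaining nodes.
    """
    completed: List[str] = []
    for node in nodes:
        restart(node)
        deadline = time.monotonic() + timeout_s
        while not is_healthy(node):
            if time.monotonic() >= deadline:
                raise RuntimeError(f"{node} did not pass its healthcheck in time")
            sleep(poll_s)
        completed.append(node)
    return completed
```

Restarting strictly one node at a time is the conservative choice; the earlier lifecycle slide notes that this can be too slow, and a token-range-aware variant would restart nodes that share no replicas in parallel.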
27 .Sidecar: Maintenance ● JMX methods on cron ● Can add arbitrary tasks like compactions, flushes, etc
28 .Sidecar: Maintenance ● Sidecar provides JMX over HTTP ○ Cleanup ○ Invoke complex JMX methods using curl ○ Many of these are better done scheduled (e.g. repair, compaction, flushes, etc)
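To illustrate the idea of exposing maintenance operations over HTTP, here is a minimal sketch built on Python's standard `http.server`. It is not the sidecar's real API: the `/ops/<operation>` path layout, the allow-list, and the port are all assumptions made for illustration, and the sketch shells out to `nodetool` (itself a JMX client) instead of speaking JMX directly.

```python
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import List, Optional

# Hypothetical allow-list; each entry is an existing nodetool subcommand.
ALLOWED_OPS = {"cleanup", "flush", "compact"}


def command_for(path: str) -> Optional[List[str]]:
    """Map '/ops/<operation>[/<keyspace>]' to a nodetool command, or None."""
    parts = [p for p in path.split("/") if p]
    if len(parts) not in (2, 3) or parts[0] != "ops" or parts[1] not in ALLOWED_OPS:
        return None
    command = ["nodetool", parts[1]]
    if len(parts) == 3:
        keyspace = parts[2]
        # Accept only plain identifiers so nothing unexpected reaches the command line.
        if not keyspace.replace("_", "").isalnum():
            return None
        command.append(keyspace)
    return command


class OpsHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        command = command_for(self.path)
        if command is None:
            self.send_error(404, "unknown operation")
            return
        # Assumes nodetool is on PATH; a missing binary raises FileNotFoundError.
        result = subprocess.run(command, capture_output=True, text=True)
        body = (result.stdout + result.stderr).encode()
        self.send_response(200 if result.returncode == 0 else 500)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Serve on localhost only; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), OpsHandler).serve_forever()
```

After calling `serve()`, a request such as `curl -X POST http://127.0.0.1:8080/ops/flush` would run `nodetool flush` on that node. Binding to localhost and using an allow-list keeps the sketch from exposing arbitrary commands.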
29 .Sidecar: Monitoring