19/06 - Safer restarts, faster streaming, and better repair, jus

Safer Restarts, Faster Streaming, and Better Repair, Just a Glimpse of Cassandra 4.0

1.Safer Restarts, Faster Streaming, and Better Repair, Just a Glimpse of Cassandra 4.0 Vinay Chella

2.What do you want your databases to have?
● Data Integrity
● Quick responses
● Ease of use
Confidential © DataStax, All Rights Reserved.

3.With Cassandra 4.0 You don’t have to pick just one... You can get it all !!

4.Who Am I? Vinay Chella Apache Cassandra Committer Cloud Data Architect Cloud Database Engineering @ Netflix

5.Disclaimer There are a lot of exciting features coming in 4.0, but this talk covers only some of the features that we at Netflix are particularly excited about and looking forward to. There are thousands of improvements shipping soon in 4.0, and some in later 4.x. This is just a sample of the goodness.

6.Agenda
● C* Usage @ Netflix
● Why 4.0?
  ○ Reliability
  ○ Performance
    ■ Internode messaging
    ■ Zstd compression
  ○ Data Integrity and compliance
    ■ Repair improvements
    ■ Audit Logging

7.C* Usage @ Netflix
● 10’s of thousands of Apache C* nodes
● 100’s of clusters
● Spanned across AWS regions
● 10’s of millions of operations / sec
● Source of truth for 99%+ streaming persistence data

8.Why 4.0? C* 4.0 for Netflix
● Reliability
● Performance
● Data Integrity & Compliance

9.Reliability Your database should be available (aka fast)

10.Internode Networking (async n/w) CASSANDRA-8457 & CASSANDRA-15066
● No more thread per peer, fully async server-server communication
● Streaming 20% faster (12229)
● Access to critical OS networking features

11.Client backpressure CASSANDRA-15013
● Client backpressure for the first time in Cassandra
● Clients can go crazy but C* won’t react
● No more OOM on Cassandra*
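One common way to get this kind of backpressure is to cap the total size of requests that are in flight and push back on anything that would exceed the cap. The sketch below illustrates that general idea only; it is not Cassandra's implementation, and the class name, method names, and limits are hypothetical.

```python
class InflightByteLimiter:
    """Hypothetical helper: caps the bytes of requests being processed at once."""

    def __init__(self, max_inflight_bytes: int) -> None:
        self.max_inflight_bytes = max_inflight_bytes
        self.inflight_bytes = 0

    def try_acquire(self, request_bytes: int) -> bool:
        """Admit the request if it fits under the cap, else signal backpressure."""
        if self.inflight_bytes + request_bytes > self.max_inflight_bytes:
            return False
        self.inflight_bytes += request_bytes
        return True

    def release(self, request_bytes: int) -> None:
        """Return capacity once the request has been answered."""
        self.inflight_bytes = max(0, self.inflight_bytes - request_bytes)


limiter = InflightByteLimiter(max_inflight_bytes=100)
first_admitted = limiter.try_acquire(60)    # fits under the cap
second_admitted = limiter.try_acquire(50)   # would exceed the cap, so it is refused
limiter.release(60)                         # first request completes
retry_admitted = limiter.try_acquire(50)    # capacity is available again
```

Because memory use is bounded by the cap rather than by how fast clients send, a burst of requests leads to refusals or delays instead of an out-of-memory condition on the server.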

12.Restarting Cassandra
● Gossip slows down (8457, 12966)
● Restarted nodes coordinate before they have functional connections (13993, 14297)
● Non-restarted nodes will continue sending on dead connections for a while (14358)
● DynamicEndpointSnitch sends to latent nodes after restart (14459)

13.Restarting Cassandra in 4.x

14.Some Other Improvements
● Meet SLOs with Hybrid Speculation (14293)
  ○ MIN(99PERCENTILE,10MS) ~= “only speculate if I am slower than P99 SLO”
  ○ MAX(99PERCENTILE,100MS) ~= “stop speculating if the cluster is hosed”
● Reduce default number of vnodes (13701)
  ○ vnodes reduce availability, better to have fewer. (mailing list)
  ○ Context: https://github.com/jolynch/python_performance_toolkit/tree/master/notebooks/cassandra_availability
● Which queries are slow/huge? (13001, 14347)
● Circuit break queries of death (12106)
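To make the MIN/MAX forms of hybrid speculation concrete, here is a minimal sketch of how such a threshold could be computed from an observed P99 and a fixed bound. It only shows the arithmetic the slide implies; the function name and the sample latencies are made up for illustration and are not taken from Cassandra's code.

```python
def speculation_threshold_ms(p99_ms: float, fixed_ms: float, mode: str) -> float:
    """Return how long to wait before sending a speculative retry.

    mode "MIN" picks the smaller of the observed P99 and the fixed bound,
    mode "MAX" picks the larger of the two.
    """
    if mode == "MIN":
        return min(p99_ms, fixed_ms)
    if mode == "MAX":
        return max(p99_ms, fixed_ms)
    raise ValueError(f"unknown mode: {mode}")


# MIN(99PERCENTILE,10MS): never wait longer than 10 ms, even if P99 degrades.
healthy_min = speculation_threshold_ms(p99_ms=4.0, fixed_ms=10.0, mode="MIN")
degraded_min = speculation_threshold_ms(p99_ms=250.0, fixed_ms=10.0, mode="MIN")

# MAX(99PERCENTILE,100MS): wait at least 100 ms, and longer once P99 exceeds it.
healthy_max = speculation_threshold_ms(p99_ms=4.0, fixed_ms=100.0, mode="MAX")
degraded_max = speculation_threshold_ms(p99_ms=250.0, fixed_ms=100.0, mode="MAX")
```

With these sample numbers the MIN form caps the wait at the fixed bound when the cluster slows down, while the MAX form lets the wait grow with P99, so a struggling cluster receives fewer extra requests.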

15.Performance Your database should respond quickly

16.Internode streaming
● Framing
● Correctness
● Resilience
● Efficiency
● Resource Limits

17.Internode streaming (Write heavy benchmark)
● Regions: 2
● Nodes: 96/region
● Instance type: i3.xlarge
● Memory to disk ratio: 1:7
● ReadCL: LOCAL_ONE
● WriteCL: LOCAL_ONE

18.Internode streaming When systems are under heavy load - Multi region C* cluster

19.Internode streaming When systems are under moderate load - Multi region C* cluster

20.Internode streaming 4.0 vs 3.0.17 (Write heavy benchmark) Coordinator metrics
● Mean latencies are ~14x better
● 99th percentile latencies are ~4x better
● 95th percentile latencies are ~6x better
More detail on this benchmark - benchmark results and CASSANDRA-14747

21.Internode streaming (Read heavy benchmark)
● Regions: 2
● Nodes: 6/region
● Instance type: i3.2xlarge
● Per node data: ~180 GiB
● Partition size: 4 KiB
● ReadCL: LOCAL_ONE

22.Internode streaming 4.0 vs 3.0.17 (Read heavy benchmark) Coordinator metrics - CASSANDRA-15066
● Mean latencies are ~7x better
● 99th percentile latencies are ~6x better
● 95th percentile latencies are ~4x better
More detail on this benchmark - benchmark results and CASSANDRA-14747

23.Internode messaging is not just more reliable, it is also faster. Follow more updates on CASSANDRA-15066

24.Zero Copy Streaming - CASSANDRA-14556
● Streaming is used in
  ○ Bootstrap
  ○ Repair
  ○ Rebuild
  ○ Range movement
  ○ Cluster expansion

25.Zero copy streaming performance tests
● Data size per node: 500GB
● 6-node clusters
  ○ i3.xl
  ○ i3.2xl
  ○ i3.4xl
  ○ i3.8xl

26.Zero Copy Streaming

27.Streaming is ~5x faster on Apache Cassandra 4.0 and, better yet, availability is much higher
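For readers unfamiliar with the term, "zero copy" generally means handing a file to the kernel to send over a socket without first reading it into application buffers. The sketch below shows that general technique with Python's `socket.sendfile`, which uses the operating system's sendfile call where available and falls back to plain sends otherwise. It illustrates the concept only and is not how Cassandra implements streaming; the function names and the 4 KiB payload are assumptions for the demo.

```python
import os
import socket
import tempfile


def stream_file_zero_copy(path: str, sock: socket.socket) -> int:
    """Send a whole file over a connected socket and return the bytes sent."""
    with open(path, "rb") as source:
        # sendfile lets the kernel move the data where the platform supports it.
        return sock.sendfile(source)


def demo() -> bool:
    """Round-trip a small temporary file through a local socket pair."""
    payload = os.urandom(4096)
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        path = tmp.name
    sender, receiver = socket.socketpair()
    try:
        sent = stream_file_zero_copy(path, sender)
        sender.close()  # signals end-of-stream to the receiver
        received = b""
        while True:
            chunk = receiver.recv(65536)
            if not chunk:
                break
            received += chunk
        return sent == len(payload) and received == payload
    finally:
        sender.close()
        receiver.close()
        os.unlink(path)


round_trip_ok = demo()
```

The sending side never touches the file contents in user space, which is where the CPU and garbage-collection savings of this approach come from.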

28.Zstd compression - CASSANDRA-14482
Compressor name   Ratio   Compression   Decompression
zstd 1.4.0 -1     2.884   530 MB/s      1360 MB/s
zlib 1.2.11 -1    2.743   110 MB/s      440 MB/s
lz4 1.8.3         2.101   800 MB/s      4220 MB/s
snappy 1.1.4      2.073   580 MB/s      2020 MB/s
lzf 3.6 -1        2.077   440 MB/s      930 MB/s
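The trade-off in this table is easier to see with a worked example. The sketch below takes the ratios and throughputs exactly as listed and estimates the compressed size and the single-threaded compress and decompress times for a given amount of raw data. The function name and the 1060 MB input are assumptions chosen for illustration, and real results will depend on the data and hardware.

```python
# (ratio, compression MB/s, decompression MB/s), copied from the table above.
CODECS = {
    "zstd 1.4.0 -1": (2.884, 530, 1360),
    "zlib 1.2.11 -1": (2.743, 110, 440),
    "lz4 1.8.3": (2.101, 800, 4220),
    "snappy 1.1.4": (2.073, 580, 2020),
    "lzf 3.6 -1": (2.077, 440, 930),
}


def estimate(codec: str, raw_mb: float) -> tuple[float, float, float]:
    """Return (compressed MB, seconds to compress, seconds to decompress)."""
    ratio, compress_mb_s, decompress_mb_s = CODECS[codec]
    return raw_mb / ratio, raw_mb / compress_mb_s, raw_mb / decompress_mb_s


zstd_size, zstd_compress_s, zstd_decompress_s = estimate("zstd 1.4.0 -1", 1060)
lz4_size, lz4_compress_s, lz4_decompress_s = estimate("lz4 1.8.3", 1060)
```

For the same input, zstd produces a noticeably smaller output than lz4 at the cost of more compression time, which is the space-versus-speed choice the table summarizes.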

29.Zstd - 4.Next (with zstd dictionaries)