20_10 How Netflix Manages Petabyte Scale Apache Cassandra In The Cloud

This presentation describes how Netflix manages petabyte-scale Apache Cassandra databases in the cloud.

1.How Netflix manages petabyte scale Apache Cassandra in the cloud
Joey Lynch, Vinay Chella
Netflix’s Distributed Database Engineers

2.Who are we?
- Vinay Chella: Distributed Systems Engineer, focusing on Apache Cassandra and Data Abstractions. Cloud Data Engineering, Netflix
- Joey Lynch: Distributed Systems Engineer, distributed system addict and data wrangler. Cloud Data Engineering, Netflix

3.Agenda
- Why use Cassandra?
- Scale of Cassandra
- Life of Cassandra Cluster
  - Where does it start?
  - Provisioning
  - Keep it running
  - Migration / Retiring
- Murphy’s law applied

4.Why Apache Cassandra
- Millions of operations per sec
- Global data replication
- Failure isolation at rack level
- Chaos ready database
- Tunable consistency
- Log structured storage engine
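Note on "Tunable consistency": the slide does not spell out the arithmetic, so the following is only a minimal sketch of the usual quorum reasoning. The replication factor of 3 and both function names are assumptions chosen for illustration, not values from the talk.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that must respond for a QUORUM operation."""
    return replication_factor // 2 + 1


def reads_see_latest_write(read_replicas: int, write_replicas: int,
                           replication_factor: int) -> bool:
    """True when any read replica set must overlap any write replica set,
    i.e. when R + W > RF."""
    return read_replicas + write_replicas > replication_factor


RF = 3  # assumed replication factor, not taken from the slides
q = quorum(RF)
print(q)                                   # 2
print(reads_see_latest_write(q, q, RF))    # True: QUORUM reads + QUORUM writes
print(reads_see_latest_write(1, 1, RF))    # False: ONE reads + ONE writes
```

Lower levels such as ONE trade this overlap guarantee for lower latency and higher availability, which is the sense in which consistency is tunable per request.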

5.Scale
- Tens of thousands of instances
- Hundreds of global C* clusters
- >6 PB of data
- Millions of requests / second
- Replicating several GiB/sec of data across the globe

6.Story of Apache Cassandra and Netflix
- Inception
- Provision
- Keep it running
- Migrations

7.Inception Where does it all start?

8.Inception

9.Service philosophy
- Context not control
  - Education
  - Tooling
- SLOs are key
  - Size, rate, latency, availability
- Every party must be responsible
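Note on "SLOs are key": the slide names four dimensions (size, rate, latency, availability) but gives no format for them. One way such an agreement might be written down and checked is sketched below. The `Measurement` type, the `violations` helper, the units, and every number are hypothetical and are not taken from the talk.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Measurement:
    """One value per SLO dimension named on the slide (units are assumed)."""
    size_gib: float
    requests_per_sec: float
    p99_latency_ms: float
    availability: float  # fraction of successful requests, 0.0 to 1.0


def violations(slo: Measurement, observed: Measurement) -> List[str]:
    """Return the SLO dimensions that the observed values breach."""
    breached = []
    if observed.size_gib > slo.size_gib:
        breached.append("size")
    if observed.requests_per_sec > slo.requests_per_sec:
        breached.append("rate")
    if observed.p99_latency_ms > slo.p99_latency_ms:
        breached.append("latency")
    if observed.availability < slo.availability:
        breached.append("availability")
    return breached


# Illustrative numbers only.
agreed = Measurement(size_gib=500, requests_per_sec=10_000,
                     p99_latency_ms=10, availability=0.999)
seen = Measurement(size_gib=620, requests_per_sec=8_000,
                   p99_latency_ms=12.5, availability=0.9995)
print(violations(agreed, seen))  # ['size', 'latency']
```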

10.Inception

11.Invest in DevEd

12.Inception

13.Better Tooling

14.Cost insights

15.Maintenance

16.Maintenance - Repair Insights

17.

18.

19.Maintenance - Backup Insights

20.

21.

22.Maintenance - Node Insights

23.

24.SLOs are key

25.

26.Whom to page?

27.Good contracts make good partners!

28.Story of C*
- Inception
- Provision
- Keep it running
- Migrations

29.Provision Get up and running fast!