20_03 Cassandra at Instagram 2019

An overview of Cassandra use cases at Instagram, improvements made to the system, and the challenges ahead.


1.CASSANDRA @ INSTAGRAM 2019 Dikang Gu -- Instagram

2.ABOUT ME • Apache Cassandra Committer • Engineering manager @ IG • HDFS developer @ FB

3.AGENDA 1. Cassandra usage at Instagram 2. Improvements 3. Challenges

4.INSTAGRAM • Launched in 2010 • 1B+ Monthly Active Accounts • 500M+ Daily Active Accounts on Stories • IGTV • IG Shopping

5.CASSANDRA IN A NUTSHELL

6.WHAT IS CASSANDRA Apache Cassandra is a robust, high-performance distributed database.

7.CASSANDRA

8.CASSANDRA [Diagram: a client and four nodes (Node 1, Node 2, Node 3, Node 4)]

9.CASSANDRA USAGE @ IG

10.CASSANDRA HISTORY @ IG

11.INSTAGRAM DEPLOYMENT ● 1000s of Apache Cassandra instances ● 10s of millions of QPS ● 100s of production use cases ● Petabytes of data ● 5+ Datacenters ● Version 3.0 ● Custom storage engine based on RocksDB

12.USE CASES

13.USE CASE A Metadata store. Applications use Cassandra as persistent metadata storage: they store a list of metadata blobs associated with a key and issue point queries or range queries at read time.
CREATE TABLE keyspace.t1 ( key BIGINT, t_id INT, value BLOB, PRIMARY KEY (key, t_id) );
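The access pattern behind this table can be sketched in a few lines of Python. This is a toy in-memory model, not Cassandra code: the class `MetadataTable` and its methods are hypothetical names, and the half-open range `[lo, hi)` is an assumption of the sketch. It shows why a primary key of `(key, t_id)` supports both query types: rows within one partition stay sorted by `t_id`, so a point query is a binary search and a range query is a contiguous slice.

```python
import bisect


class MetadataTable:
    """Toy in-memory stand-in for keyspace.t1 (hypothetical, not Cassandra code)."""

    def __init__(self):
        # partition key -> (sorted list of t_id, parallel list of values)
        self._partitions = {}

    def insert(self, key, t_id, value):
        ids, vals = self._partitions.setdefault(key, ([], []))
        i = bisect.bisect_left(ids, t_id)
        if i < len(ids) and ids[i] == t_id:
            vals[i] = value  # same primary key: overwrite, like a CQL upsert
        else:
            ids.insert(i, t_id)
            vals.insert(i, value)

    def point_query(self, key, t_id):
        """WHERE key = ? AND t_id = ?"""
        ids, vals = self._partitions.get(key, ([], []))
        i = bisect.bisect_left(ids, t_id)
        return vals[i] if i < len(ids) and ids[i] == t_id else None

    def range_query(self, key, lo, hi):
        """WHERE key = ? AND t_id >= lo AND t_id < hi, in t_id order."""
        ids, vals = self._partitions.get(key, ([], []))
        start = bisect.bisect_left(ids, lo)
        end = bisect.bisect_left(ids, hi)
        return list(zip(ids[start:end], vals[start:end]))
```

Both queries touch a single partition, which is the property that lets a real cluster serve them from one replica set.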

14.USE CASE B Time series store. This type of application uses Cassandra as time series storage: it stores a list of activities sorted by timestamp. This class of use cases usually has high write throughput, which plays to Cassandra's strengths.
CREATE TABLE keyspace.t2 ( key BIGINT, ts TIMEUUID, value BLOB, PRIMARY KEY (key, ts) );
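To make the role of the `ts TIMEUUID` column concrete, here is a minimal Python sketch. A TIMEUUID is a version 1 UUID, which embeds a timestamp counted in 100-nanosecond ticks since 1582-10-15, so sorting by that timestamp yields activities in time order. The helper `timeuuid_from_unix` and the class `ActivityLog` are hypothetical names for illustration, and breaking timestamp ties by raw bytes is an assumption of this sketch.

```python
import uuid

# 100-ns ticks between the UUID epoch (1582-10-15) and the Unix epoch.
GREGORIAN_OFFSET = 0x01B21DD213814000


def timeuuid_from_unix(seconds, clock_seq=0, node=0):
    """Build a version 1 UUID whose embedded time is the given Unix second."""
    ts = int(seconds) * 10_000_000 + GREGORIAN_OFFSET
    time_low = ts & 0xFFFFFFFF
    time_mid = (ts >> 32) & 0xFFFF
    time_hi_version = ((ts >> 48) & 0x0FFF) | 0x1000        # version 1
    clock_seq_hi_variant = ((clock_seq >> 8) & 0x3F) | 0x80  # RFC 4122 variant
    clock_seq_low = clock_seq & 0xFF
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                             clock_seq_hi_variant, clock_seq_low, node))


class ActivityLog:
    """Toy in-memory stand-in for keyspace.t2 (hypothetical, not Cassandra code)."""

    def __init__(self):
        self._rows = {}  # key -> list of (timeuuid, value), kept in time order

    def append(self, key, ts, value):
        rows = self._rows.setdefault(key, [])
        rows.append((ts, value))
        rows.sort(key=lambda r: (r[0].time, r[0].bytes))

    def latest(self, key, limit):
        """Newest-first read of up to `limit` activities for one key."""
        rows = self._rows.get(key, [])
        return [value for _, value in rows[::-1][:limit]]
```

Writes are plain appends keyed by time, which is the shape of workload the slide describes as write-heavy.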

15.USE CASE C Counter store. This type of application stores distributed counters in Cassandra and issues bump or get requests against the storage.
CREATE TABLE keyspace.t3 ( key BIGINT, value COUNTER, PRIMARY KEY (key) );
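One common way to build a distributed counter with bump and get operations is to keep a separate shard per replica and sum the shards on read. The sketch below illustrates that general idea only; it is not taken from the slides and makes no claim about how Cassandra or Instagram implement counters. The class `ShardedCounter`, its method names, and the replica ids are all hypothetical.

```python
class ShardedCounter:
    """Toy sharded counter: each replica owns one (clock, count) shard."""

    def __init__(self):
        self._shards = {}  # replica_id -> (logical clock, running count)

    def bump(self, replica_id, delta=1):
        """Apply a delta to the shard owned by replica_id."""
        clock, count = self._shards.get(replica_id, (0, 0))
        self._shards[replica_id] = (clock + 1, count + delta)

    def merge(self, other):
        """Adopt any shard from `other` that has a newer logical clock."""
        for replica_id, (clock, count) in other._shards.items():
            if clock > self._shards.get(replica_id, (0, 0))[0]:
                self._shards[replica_id] = (clock, count)

    def get(self):
        """The counter value is the sum over all known shards."""
        return sum(count for _, count in self._shards.values())
```

Because a merge only replaces a shard with a strictly newer version of the same shard, repeating a merge does not double count.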

16.ABSTRACTIONS

17.CQL SELECT * FROM keyspace.t1 WHERE key = 1 AND t_id = 2; INSERT INTO keyspace.t1 (key, t_id, value) VALUES (1, 2, textAsBlob('metadata'));

18.STORAGE API • Put • MultiGet • Get • BatchMutate • GetRange • Delete
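The slide lists only the operation names, so the following Python sketch is a guess at what such a key-value style abstraction could look like. Every signature, parameter, and the mutation tuple format is an assumption made for illustration; the class `InMemoryStorage` is hypothetical and backed by a plain dictionary rather than Cassandra.

```python
class InMemoryStorage:
    """Hypothetical storage API with the six operations named on the slide."""

    def __init__(self):
        self._data = {}  # key -> {column: value}

    def put(self, key, column, value):
        self._data.setdefault(key, {})[column] = value

    def get(self, key, column):
        return self._data.get(key, {}).get(column)

    def multi_get(self, keys, column):
        """Fetch the same column for several keys in one call."""
        return {key: self.get(key, column) for key in keys}

    def get_range(self, key, start, end):
        """Columns of one key with start <= column < end, in column order."""
        row = self._data.get(key, {})
        return [(c, row[c]) for c in sorted(row) if start <= c < end]

    def delete(self, key, column):
        self._data.get(key, {}).pop(column, None)

    def batch_mutate(self, mutations):
        """Apply a list of ("put" | "delete", key, column, value) tuples in order."""
        for op, key, column, value in mutations:
            if op == "put":
                self.put(key, column, value)
            elif op == "delete":
                self.delete(key, column)
            else:
                raise ValueError(f"unknown mutation: {op}")
```

An interface of this shape hides CQL from callers, which is one plausible reason to offer it alongside raw queries.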

19.CLIENTS

20.RECAP Cassandra usage at Instagram • Cassandra in a nutshell • C* history and deployments within IG • Typical use cases • Abstractions

21.IMPROVEMENTS

22.IMPROVEMENTS • Pluggable Storage Engine • Global Data Partition • Large Scale C* Cluster • Gateway • Manageability

23.PLUGGABLE STORAGE ENGINE

24.CASSANDRA

25.CASSANDRA SINGLE NODE

26.CASSANDRA STORAGE ENGINE LAYER

27.CASSANDRA-13474

28.NEW HIGH PERFORMANCE ENGINE Rocksandra https://github.com/Instagram/cassandra/tree/rocks_3.0

29.ROCKSANDRA https://instagram-engineering.com/open-sourcing-a-10x-reduction-in-apache-cassandra-tail-latency-d64f86b43589