Cassandra cql以及业务场景介绍

Cassandra cql以及业务场景介绍

展开查看详情

1.Cassandra Query Language (CQL) && 业务使⽤用场景介绍 郭超(⽞玄陵) 2019/10/10

2.Content • Cassandra简单使⽤用介绍 • 概况简介 • 组成 • CQL的简单case • Cassandra query language(CQL) • Data Type • DDL && DML • Secondary Index && MV • Security && Trigger • 应⽤用:使⽤用案列列介绍 • Q &A

3.Cassandra 使⽤用简单介绍

4.Cassandra 概况简介 • Cassandra是什什么? • Cassandra :分布式、⽀支持类SQL的CQL、⾼高可⽤用⾼高性能、share-nothing 、NOSQL • Apache 顶级开源项⽬目 • Cassandra核⼼心能⼒力力 • CQL能⼒力力 • 丰富数据结构,native type,collection,UDT • ⼆二级索引 && 物化视图 • 算术运算、JSON、安全认证 && UDA • 可调⼀一致性 && 轻量量级事务 (兼顾可⽤用性以及⼀一致性) • DHT && gossip(easy deploy) • ⾼高性能、⾼高可⽤用

5.Cassandra 简单组成 NativeTransportServer/MessagingService A CQL Parser Gossip D B Replication Meta management management Commitlog/Memtable/ C DHT Sstable/Index ext4/xfs/…

6.Thrift -> CQL Thrift CQL get SELECT - 从常⻅见的kv 操作转向丰富的类SQL操作, get_slice 亲和性更更⾼高,使⽤用简洁 - 提供丰富的数据结构:text,varchar,udt等 INSERT get_range_slice - 使⽤用起来更更像数据库 - 4.0 不不在support thrift,CQL是趋势 BATCH insert Others batch_mutate others

7.Cql 与kv 接⼝口分别写⼀一条数据 INSERT INTO testtable (name, age, address) VALUES (‘xl’, 18, ‘hz’); CQL VS HBase kv api HBase: Put put = new Put(“xl”.getBytes()); put.addColumn(“testtable”.getBytes(), “address”.getBytes(), “hz".getBytes()); put.addColumn(“testtable”.getBytes(), “age”.getBytes(), “18".getBytes()); table.put(put);

8.Cassandra Query Language(CQL)

9.CQL 组成 Data Type DDL DML Secondary Index Materialized View CQL Security Functions(scalar/aggregate) Arithmetic Operators(算术运算) JSON Triggers Others

10. CQL Data Type cql_type = native_type | collections | udt | tuple |custom_type 数据结构丰富,便便于业务直接使⽤用,⽆无需封装,eg:inet,⽤用户使⽤用ip直接使⽤用⽆无需转换 Native Type

11.CQL Data Type Native Type 基本的string类型(text,varchar…)、数值类型(int,bigint,smallint,…)、ip相关类型 (inet)、UUID、时间⽇日期(time,date,timestamp…)、counter example: CREATE TABLE tests ( name varchar PRIMARY KEY , age int , ipaddress inet , ts timestamp , uuid uuid ); Collections map<cql_type,cql_type>, set<cql_type>,list<cql_type> example: CREATE TABLE test.collection( name text PRIMARY KEY , buyThingNameToPrice map <text, double>, visitPlaces set <text>, eatHistoryList list <text>);

12.CQL Data Type UDT - UDT必须在keyspace范围内创建; - ⽀支持:CREATE /ALTER /DROP; example: CREATE TYPE singleExpenseRecord ( shopName text , buyThingName text , price text ); CREATE TABLE personExpenseRecords( name text PRIMARY KEY , expenseTime timestamp , expenseInfo singleexpenserecord ); DROP TYPE singleexpenserecord ; Tuple - 不不同类型的固定⼤大⼩小的set集合类型,主要特点是固定⼤大⼩小 example: CREATE TABLE tmptable (name text PRIMARY KEY , tupleCollection tuple<int, text>); INSERT INTO tmptable (name , tuplecollection ) VALUES ( 'gc', (18, ‘hz'));

13.DDL CREATE USE ALTER DROP DESCRIBE KEYSPACE KEYSPACE KEYSPACE KEYSPACE KEYSPACE CREATE ALTER DROP DESCRIBE TRUNCATE TABLE TABLE TABLE TABLE Keyspace: - replcation: - class: SimpleStrategy/NetworkTopologyStrategy(适合多dc数据复制、灾备等情况) - replication_factor;// 3副本quorum常⻅见在线请求对⼀一致性有要求;2副本场景one 级别,节省成本,低⼀一致性要求,⽐比如物联⽹网时序数据 example: CREATE KEYSPACE ks1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'} AND durable_writes = true; CREATE KEYSPACE ks2 WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': '1', 'dc2': '2'} AND durable_writes = true;

14.DDL Table: - Primary key: - 标识数据存储地址以及顺序⼀一,旦创建⽆无法改变; - Partition key + cluster key(如果存在) - 设置primary key 需要考虑: - partition的⼤大⼩小; - partition内的数据顺序; - 集群partition分布; example: CREATE TABLE testtable1 (name text PRIMARY KEY , age int , address text );//姓名,年年纪,对应年年龄住址 CREATE TABLE testtable2 (orderId uuid PRIMARY KEY , name text, orderPriceTodate map <int, timestamp> ); //购买订单,订单名字,订单的价格以及产⽣生的时间戳 CREATE TABLE testtable3 (name text , age int , address text , PRIMARY KEY (name, age));

15.DML DML:INSERT/ SELECT/UPDATE/DELETE/BATCH Select : - ⽀支持常规获取指定列列数据; - ⽀支持LIMIT /ORDEY BY/ASC、DESC; - ⽀支持返回结果执⾏行行函数(count,max,min,sum,avg) - ⽀支持cast操作 - ⽀支持JSON example: CREATE TABLE testkeyspace.testtable ( name text, age int, address text, PRIMARY KEY (name, age) ) SELECT count(*) FROM testkeyspace.testtable; SELECT sum(age) FROM testkeyspace.testtable; SELECT name , address ,age FROM testtable LIMIT 10; SELECT * FROM testtable WHERE age > 10 AND age < 100 AND name = 'cj' ORDER BY age ASC ; SELECT JSON name, age ,address FROM testtable ;

16.DML INSERT : - ⽀支持常规写数据;primary key必须写⼊入数据; - ⽀支持JSON;⽅方便便业务直接使⽤用; - ⽀支持 IF NOT EXIST (CAS) - ⽀支持TTL(秒粒度);过期数据⾃自动删除,⽐比如推荐系统业务 example: INSERT INTO testtable (name, age, address) VALUES ( 'xl', 18 , ‘hz'); INSERT INTO testtable JSON '{"name":"gdb", "age":11, “address":"lj"}'; INSERT INTO testtable (name, age, address) VALUES ( 'gc', 10, 'zg') USING TTL 86400;

17.DML UPDATE : - ⽀支持常规写数据;primary key 需要在where语句句设定;其他列列 使⽤用set 设置值; - 不不检查之前存在的key(除了了⽤用if,将会检查条件是否满⾜足) - ⽀支持TTL、带上TIMESTAMP; example: UPDATE testtable USING TTL 9000 SET address = 'zj' WHERE name = 'gc' AND age = 10; DELETE : - ⽀支持删除⾏行行、某列列 - ⽀支持WHERE⾥里里⾯面包含 IN; example: DELETE from testtable WHERE name IN ( 'gc');

18.DML BATCH : - 以BEGIN BATCH 开始,APPLY BATCH 结束,中间封装多条 modification - ⼀一定程度节省⽹网络交互开销; - 默认LOGGED模式,保证最终BATCH操作完全完成; example: BEGIN BATCH INSERT INTO testtable (name, age, address) VALUES ( 'gcc', 19, 'hz'); INSERT INTO testtable (name, age, address) VALUES ( 'gdcd', 20, 'sh'); UPDATE testtable SET address = 'sh' WHERE age = 19 AND name = 'gcc'; DELETE FROM testtable WHERE age = 19 AND name = 'gcc'; APPLY BATCH ;

19.Secondary index Native secondary index: - 在某列列上建索引;列列的value基数⾼高不不推荐使⽤用,基数适中较好 - 已有数据会异步构建索引,新⼊入数据会⾃自动原⼦子化插⼊入; - counter 列列⽆无法使⽤用 - 某列列频繁删除以及update不不推荐使⽤用 - ⽀支持多重索引查询,⽀支持在collection上建索引; example: CREATE TABLE testtable (name text, age int , address text ,PRIMARY KEY(name, age)) CREATE INDEX testindex ON testtable ( address) ; CREATE INDEX testindexage ON testtable ( age) ; SELECT * FROM testtable WHERE address = ‘bj’ AND age = 28 ALLOW FILTERING ;

20.Secondary index (SASI)Sstable Attached Secondary Index:experimental - ⽀支持模式较多:prefix,contains,sparse等 - 在string上⾯面⽀支持prefix,contains等功能,类似SQL的like ‘f%’ 和 like ‘%f%’ - sparse适⽤用于⾼高密度⼤大量量的数值型数据的query,列列如时序数据 - 使⽤用analyzer可以对某列列的⽂文本数据做词⼲干分析,and or关键词过滤等 - 不不⽀支持collection example: CREATE CUSTOM INDEX fn_prefix ON testtable ( address ) USING 'org.apache.cassandra.index.sasi.SASIIndex'; SELECT * FROM testtable WHERE address like 'b%'; CREATE CUSTOM INDEX test_contains on testtable (address) USING ‘org.apache.cassandra.index.sasi.SASIIndex’ WITH OPTIONS = { 'mode': 'CONTAINS'}; SELECT * FROM testtable WHERE address like '%b%'; SELECT * FROM testtable WHERE address > 'a' AND address < 'c' ALLOW FILTERING ; CREATE CUSTOM INDEX test_contains on testtable (age) USING 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = { 'mode': 'SPARSE'}; SELECT * FROM testtable WHERE age < 100 ; CREATE CUSTOM INDEX test_analyzer ON testtable (address) USING 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = { 'mode': 'CONTAINS', 'analyzer_class': 'org.apache.cassandra.index.sasi.analyzer.StandardAnalyzer', 'analyzed': 'true', 'tokenization_skip_stop_words': 'and, the, or', 'tokenization_enable_stemming': 'true', 'tokenization_normalize_lowercase': 'true', 'tokenization_locale': 'en' }; INSERT INTO testtable (name, age, address) VALUES ( 'gxx', 11, 'hz china'); INSERT INTO testtable (name, age, address) VALUES ( ’gdd', 11, 'hz china'); SELECT * FROM testtable WHERE address like 'china';

21.Materialized View (MV) Materialized View :experimental - ⼀一个新的索引表,数据来源于源数据表; - 适合对于那些基数较⾼高的数据列列; - 创建mv需要内部制定的select模块,新的primary key ; - 新的primary key 需要包含⽼老老数据表的primary key列列; example: CREATE MATERIALIZED VIEW testkeyspace.test_view AS SELECT * FROM testkeyspace.testtable WHERE address IS NOT NULL AND name IS NOT NULL AND age IS NOT NULL PRIMARY KEY (address, name, age); SELECT * FROM test_view ;

22.Security - ⽀支持账户密码; - 各个级别的鉴权; example: CREATE USER ll WITH PASSWORD ‘ll' NOSUPERUSER ll@cqlsh> DROP KEYSPACE cycling ; Unauthorized: Error from server: code=2100 [Unauthorized] message="User ll has no DROP permission on <keyspace cycling> or any of its parents” cassandra@cqlsh> GRANT drop ON keyspace cycling TO ll; ll@cqlsh> DROP KEYSPACE cycling ; ll@cqlsh>

23.Triggers - ⽀支持triggers的能⼒力力,需要⽤用户⾃自⼰己添加相关trigger代码; example: CREATE TRIGGER test_trigger ON testtable USING ’trigger'; DROP TRIGGER test_trigger on testtable

24.使⽤用案列列介绍

25. feed流 元信息 Instagram 计数 在线 360 处理理 消息 账户信息 apple 其他 饿了了么 ⽤用户画像 Netflix 订单系统 Others 视频、⽂文本

26.欢迎加⼊入我们的技术社区,⼀一起交流! 微信公众号 钉钉社区群 阿⾥里里云Cassandra正在公测,欢迎申请试⽤用:https://www.aliyun.com/product/cds

27.Q&A