21_01 kafka-connector

Introduces several patterns for combining Cassandra and Kafka.

1. Better Together: Apache Cassandra and Apache Kafka

2. Agenda: (1) Apache Cassandra and Apache Kafka; (2) Better Together – Common Patterns; (3) DataStax Kafka Connector; (4) Demonstration; (5) Resources. © DataStax, All Rights Reserved. Confidential

3. Your Presenters

4. Agenda: (1) Apache Cassandra and Apache Kafka; (2) Better Together – Common Patterns; (3) DataStax Kafka Connector; (4) Demonstration; (5) Resources

5. Apache Cassandra Overview
• First developed by Facebook
• Top-level Apache project since 2010
• Partitioned row store
• Distributed, decentralized
• Elastic scalability / high performance
• High availability / fault tolerant
• Tuneable consistency
• Cassandra Query Language (CQL)
Apache Cassandra ® Apache Software Foundation

6. Apache Kafka Overview
• First developed by LinkedIn
• Top-level Apache project since 2012
• Distributed streaming platform
• Used for real-time data pipelines and streaming applications
• Horizontal scalability / high performance
• High availability / fault tolerance
• Stream persistence and querying (KSQL)
• Connect framework
Apache Kafka ® Apache Software Foundation

7. Kafka Concepts
• Topics
  – Collection of key/value pairs
  – Append-only
  – Can be partitioned
• Producers
• Consumers
  – Separate offsets
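The topic, partition, and offset ideas on this slide can be sketched with a toy in-memory model. This is not the Kafka client API: the `Topic` and `Consumer` classes are hypothetical, and `zlib.crc32` stands in for Kafka's real key hashing. It only illustrates that a topic is an append-only collection of key/value records and that each consumer keeps its own offsets.

```python
import zlib
from collections import defaultdict


class Topic:
    """Toy topic: a list of append-only partitions holding (key, value) records."""

    def __init__(self, name, num_partitions=3):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, key, value):
        # Records with the same key always land in the same partition.
        # crc32 is a stand-in hash chosen for illustration only.
        partition = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        self.partitions[partition].append((key, value))
        # Return where the record was stored: (partition, offset).
        return partition, len(self.partitions[partition]) - 1


class Consumer:
    """Toy consumer: tracks its own read position per partition."""

    def __init__(self, topic):
        self.topic = topic
        self.offsets = defaultdict(int)

    def poll(self, partition, max_records=10):
        log = self.topic.partitions[partition]
        start = self.offsets[partition]
        batch = log[start:start + max_records]
        # Reading never removes records; it only advances this consumer's offset.
        self.offsets[partition] = start + len(batch)
        return batch


if __name__ == "__main__":
    topic = Topic("stocks-ticks", num_partitions=1)
    topic.append("AAPL", 101.5)
    topic.append("AAPL", 101.7)
    fast, slow = Consumer(topic), Consumer(topic)
    print(fast.poll(0))                 # fast reads everything available
    print(slow.poll(0, max_records=1))  # slow is still one record behind
```

Because offsets live with the consumer rather than the topic, two consumers can read the same partition at different speeds without affecting each other.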

8. Kafka Concepts
• Streams applications
  – Combined Producer/Consumer
• KSQL
  – Query language used by stream applications

9. Kafka Concepts
• Brokers
• Clusters
• Connect Framework
  – Sources
  – Sinks

10. Cassandra + Kafka – Similarities and Distinctives
• Concepts in common
  – Distributed Systems
  – Partitioning / Hashing
  – Replication
• Slight differences in implementation
  – Multi-DC
  – Log-structured
  – TTL / retention
• Cassandra excels at…
  – High volume, write intensive data storage workloads at scale
  – Suitable as a system of record
  – High performance searching via DSE
• Kafka excels at…
  – Streaming data to/from services and legacy data sources
  – Acting upon changes in data from multiple sources (aka pipelines)
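Partitioning by hash and replication appear on this slide for both systems. The sketch below shows the two usual shapes of the idea: hash modulo partition count (the approach Kafka's default partitioner takes for keyed records) and a token ring walked clockwise (the approach Cassandra takes). It is a simplification under stated assumptions: `stable_hash` uses MD5 as a stand-in for the Murmur variants the real systems use, each node owns a single token (no virtual nodes), and replica placement ignores racks and datacenters. All function and node names are hypothetical.

```python
import bisect
import hashlib


def stable_hash(key):
    # Stand-in hash for illustration; the real systems use Murmur variants.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def modulo_partition(key, num_partitions):
    """Hash-modulo placement: same key, same partition."""
    return stable_hash(key) % num_partitions


def ring_replicas(key, ring, replication_factor):
    """Token-ring placement.

    ring is a list of (token, node) pairs sorted by token. The first node whose
    token is >= the key's token owns the key; the next nodes clockwise hold the
    additional replicas.
    """
    tokens = [token for token, _ in ring]
    start = bisect.bisect_left(tokens, stable_hash(key)) % len(ring)
    count = min(replication_factor, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


if __name__ == "__main__":
    nodes = ["node-a", "node-b", "node-c"]  # hypothetical node names
    ring = sorted((stable_hash(name), name) for name in nodes)
    print(modulo_partition("AAPL", 6))
    print(ring_replicas("AAPL", ring, replication_factor=2))
```

In both cases the key alone determines placement, which is why records for one key stay ordered within a Kafka partition and why any Cassandra node can locate the replicas for a row.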

11. Better Together – using the best of both

12. Agenda: (1) Apache Cassandra and Apache Kafka; (2) Better Together – Common Patterns; (3) DataStax Kafka Connector; (4) Demonstration; (5) Resources

13. Pattern 1: Cassandra + Kafka in Microservices
[Diagram: Some Producer → topic(s) → My microservice → topic(s) → Other consumers]
• Consume topic(s)
• Publish to topic(s)
• Read / write data (DataStax Enterprise)
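The consume, read/write, publish loop of this pattern can be sketched without any broker or database. Everything here is a stand-in: the two `deque` objects play the role of the consumed and published topics, the `dict` plays the role of a Cassandra table, and the `UserStored` event name and the `handle` / `run_once` functions are hypothetical.

```python
from collections import deque


def handle(event, table, outgoing):
    """Process one consumed event: persist it, then announce the result."""
    # Write data: keyed by user id, as a table with a partition key would be.
    table[event["user_id"]] = event
    # Publish: downstream consumers learn about the change from the topic,
    # not by calling this service directly.
    outgoing.append({"type": "UserStored", "user_id": event["user_id"]})


def run_once(incoming, table, outgoing):
    """Drain whatever is currently waiting on the incoming topic."""
    while incoming:
        handle(incoming.popleft(), table, outgoing)


if __name__ == "__main__":
    incoming = deque([{"type": "UserCreated", "user_id": "u1"}])
    outgoing = deque()
    table = {}
    run_once(incoming, table, outgoing)
    print(table)
    print(list(outgoing))
```

The point of the shape is that the microservice owns its data store and communicates with other services only through topics.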

14. Cassandra + Kafka – KillrVideo Example
[Diagram: KillrVideo Services (User Management, Video Catalog, Ratings) → events → Suggested Videos Service]
• Events: UserCreated, YouTubeVideoAdded, UserRatedVideo
• KillrVideo Services: read and write data (DataStax Enterprise)
• Suggested Videos Service: populate graph, graph recommender traversal (DSE Graph)

15. Pattern 2: Kafka into Cassandra

16. Agenda: (1) Apache Cassandra and Apache Kafka; (2) Better Together – Common Patterns; (3) DataStax Kafka Connector; (4) Demonstration; (5) Resources

17. Why a Kafka Connector?
[Diagram: event sources (Geolocation, System of records, Account & Product Usage, Mobile Device, ClickStream) feeding Kafka topics, e.g. Topic: stocks-ticks …]

18. Why a Kafka Connector?
• Spark Streaming = PULL
  ➢ Enables advanced transformations and computations
  ➢ Pull mode with a dedicated runtime (poll)
• Kafka Connector = PUSH
  ➢ No extra runtime

19. What is the Kafka Connector?
[Diagram: event sources (Geolocation, System of records, Account & Product Usage, Mobile Device, ClickStream) feeding Kafka topics, e.g. Topic: stocks-ticks …; Kafka Connect with Sources and Sinks, the connector marked "HERE" on the Sinks side]

20. What is the Kafka Connector?
• Automatically ingest from Kafka to DSE
  – Simple, Fast, Flexible, Secure
• Deployed in the Kafka Connect framework
  – Managed through the built-in REST API
    • Visibility into running connectors and tasks
    • Endpoints for operator tasks
  – Automatic rebalancing
    • Useful for availability and scaling
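The built-in REST API mentioned on this slide is the standard Kafka Connect one. The sketch below builds requests for a few of its endpoints (`GET /connectors`, `GET /connectors/{name}/status`, `POST /connectors`, `POST /connectors/{name}/restart`) using only the Python standard library. The base URL assumes a worker listening on `localhost:8083`, and the connector name `dse-sink` in the usage is hypothetical. Requests are built separately from sending so they can be inspected without a running worker.

```python
import json
import urllib.request

# Assumption: a Kafka Connect worker reachable on its default REST port.
CONNECT_URL = "http://localhost:8083"


def list_connectors_request(base_url=CONNECT_URL):
    """Visibility: names of the connectors registered on the worker."""
    return urllib.request.Request(f"{base_url}/connectors", method="GET")


def connector_status_request(name, base_url=CONNECT_URL):
    """Visibility: state of one connector and its tasks."""
    return urllib.request.Request(
        f"{base_url}/connectors/{name}/status", method="GET")


def create_connector_request(name, config, base_url=CONNECT_URL):
    """Operator task: register a connector from a name and a config map."""
    body = json.dumps({"name": name, "config": config}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/connectors",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def restart_connector_request(name, base_url=CONNECT_URL):
    """Operator task: restart a connector."""
    return urllib.request.Request(
        f"{base_url}/connectors/{name}/restart", method="POST")


def send(request):
    """Send a prepared request and decode the JSON response, if any."""
    with urllib.request.urlopen(request) as response:
        raw = response.read()
        return json.loads(raw) if raw else None
```

With a worker running, `send(list_connectors_request())` returns the list of connector names, and `send(connector_status_request("dse-sink"))` returns the connector and task states.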

21. What is the Kafka Connector?
• Built by DataStax drivers team
  – Best practices for writing to DSE
  – Resiliency of DS drivers

22. What is the Kafka Connector?
[Diagram: on start, Kafka Connect workers read <standalone-worker>.properties or <distributed-worker>.properties and instantiate the connectors list (Data Sinks). The DataStax Connector reads its config (mapping) from <connector>.properties or <connector>.json.]
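The connector config that this slide refers to is the same set of key/value settings whether it is written as a `.properties` file (standalone worker) or as a `.json` document (distributed worker). The sketch below renders one config both ways. The setting names (`connector.class`, `contactPoints`, `loadBalancing.localDc`, `topic.<topic>.<keyspace>.<table>.mapping`) and the class name are recalled from the DataStax connector documentation and should be checked against the version being deployed. The keyspace, table, column names, connector name, and datacenter in the usage are hypothetical; only the `stocks-ticks` topic comes from the slides.

```python
import json


def connector_config(topic, keyspace, table, mapping, contact_points, local_dc):
    """Build the settings map for one topic-to-table mapping.

    mapping maps table column names to Kafka record fields,
    e.g. {"symbol": "value.symbol"}.
    """
    mapping_text = ", ".join(
        f"{column}={field}" for column, field in mapping.items())
    return {
        # Assumption: class name as documented for early connector releases.
        "connector.class": "com.datastax.kafkaconnector.DseSinkConnector",
        "tasks.max": "1",
        "topics": topic,
        "contactPoints": contact_points,
        "loadBalancing.localDc": local_dc,
        f"topic.{topic}.{keyspace}.{table}.mapping": mapping_text,
    }


def as_properties(name, config):
    """Render for a standalone worker: <connector>.properties."""
    lines = [f"name={name}"] + [f"{key}={value}" for key, value in config.items()]
    return "\n".join(lines) + "\n"


def as_json(name, config):
    """Render for a distributed worker: <connector>.json."""
    return json.dumps({"name": name, "config": config}, indent=2)


if __name__ == "__main__":
    config = connector_config(
        topic="stocks-ticks",
        keyspace="demo_ks",            # hypothetical
        table="ticks",                 # hypothetical
        mapping={"symbol": "value.symbol", "price": "value.price"},
        contact_points="127.0.0.1",
        local_dc="dc1",                # hypothetical
    )
    print(as_properties("dse-sink", config))
    print(as_json("dse-sink", config))
```

The JSON form is also the body that the Kafka Connect REST API accepts when a connector is registered on a distributed worker.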

23. Agenda: (1) Apache Cassandra and Apache Kafka; (2) Better Together – Common Patterns; (3) DataStax Kafka Connector; (4) Demonstration; (5) Resources

24. Academy.datastax.com/downloads

25. What versions does this work with?
Supported versions
- DSE 5.0+
- Confluent 3.2+
- Apache Kafka 0.10.2+
Supported offerings
- DS Enterprise
- DS Basic
- DDAC
DSE 5.0+
Confluent    Apache Kafka
3.2.x+       0.10.2.x+
3.3.x+       0.11.0.x+
4.0.x+       1.0.x+
4.1.x+       1.1.x+
5.0.x+       2.0.x+

26. Docker and DataStax
• Where
  – https://hub.docker.com/u/datastax/
  – https://github.com/datastax/docker-images/tree/master/datastax-docker-image-examples
• We provide
  – Docker images for DSE, Studio, OpsCenter
  – Docker Compose configuration files
  – Sample deployments
• We support
  – Installation on dev before 6.7
  – Installation on prod from 6.7 (December 2018)

27. https://github.com/clun/kafka-dse/tree/driver2

28. Demonstration Overview
[Diagram. Components shown: Alpha Vantage; timer; TickGenerator; StockTickProducer; Kafka-dse-producer; KAFKA-CONNECT (Sources, Sinks); DSE; Java-dse-driver; Spring Web Flux; Kafka-dse-webui. Ports shown: 8083 (KAFKA-CONNECT), 9092, 2181.]

29. Agenda: (1) Apache Cassandra and Apache Kafka; (2) Better Together – Common Patterns; (3) DataStax Kafka Connector; (4) Demonstration; (5) Resources