2019-04-02 Flink Forward - Pulsar + Flink

1. Elastic Data Processing with Apache Flink and Apache Pulsar • Sijie Guo (sijieg) • 2019-04-02

2. Who am I • Apache Pulsar PMC Member • Apache BookKeeper PMC Member • Interested in technologies around Event Streaming

3. Agenda • What is Apache Pulsar? • A Pulsar View on Data - Segmented Stream • Pulsar - Access Pattern & Tiered Storage • Pulsar - Schema • When Flink meets Pulsar

4. What is Apache Pulsar?

5. Pub/Sub Messaging (timeline figure with year labels 2003, 2010, 2012, 2006, 2011)

6. “Flexible Pub/Sub messaging backed by durable log/stream storage”

7. Pulsar - Pub/Sub

8. Pulsar - Multi Tenancy

9. Pulsar - Queue + Streaming

10. Pulsar - Cloud Native Layered Architecture • Independent Scalability • Instant Failure Recovery • No rebalancing on cluster expansion

11. A Pulsar View on Data

12. Batch - HDFS

13. Stream - Pub/Sub

14. A Flink View on Computing • “Batch processing is a special case of Stream processing”

15. Pulsar = Segmented Stream

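16. Topic (figure: producers write to a topic and consumers read from it, along a time axis)

17. Partitions (figure: producers and consumers attached to partitions P0, P1, P2, P3, along a time axis)

18. Segments (figure: each partition split into segments along a time axis, between producers and consumers) • P0: Segment 1, Segment 2, Segment 3 • P1: Segment 1, Segment 2, Segment 3, Segment 4 • P2: Segment 1, Segment 2, Segment 3 • P3: Segment 1, Segment 2, Segment 3

19. Stream (figure: the same partitions and segments along a time axis, between producers and consumers) • P0: Segment 1, Segment 2, Segment 3 • P1: Segment 1, Segment 2, Segment 3, Segment 4 • P2: Segment 1, Segment 2, Segment 3 • P3: Segment 1, Segment 2, Segment 3

20. Stream (figure: producers and consumers on a single stream made of Segment 1, Segment 2, Segment 3, Segment 4, along a time axis)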

21. Segmented Stream • Segmented Stream Systems: Apache Pulsar, Twitter EventBus, EMC Pravega • All are based on Apache BookKeeper (BK) • Each uses BK in a different way • Pulsar, EventBus - Use BK as the segment store • Pravega - Uses BK as the journal only
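The segmented-stream idea on slides 15 to 21 can be illustrated with a small sketch. This is a toy in-memory model, not Pulsar or BookKeeper code: the names `Segment`, `SegmentedStream`, and `max_entries_per_segment` are hypothetical, and rolling over after a fixed entry count is an assumption made only to keep the example short.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Segment:
    """A bounded, append-only chunk of a stream (hypothetical model)."""
    segment_id: int
    entries: List[bytes] = field(default_factory=list)
    sealed: bool = False


class SegmentedStream:
    """One partition modeled as an ordered list of segments."""

    def __init__(self, max_entries_per_segment: int = 3) -> None:
        # The rollover threshold is an illustrative assumption.
        self.max_entries = max_entries_per_segment
        self.segments: List[Segment] = [Segment(segment_id=1)]

    def append(self, payload: bytes) -> Tuple[int, int]:
        """Append to the open segment; seal it and roll over when full.

        Returns (segment_id, index of the entry inside that segment).
        """
        current = self.segments[-1]
        if len(current.entries) >= self.max_entries:
            current.sealed = True
            current = Segment(segment_id=current.segment_id + 1)
            self.segments.append(current)
        current.entries.append(payload)
        return (current.segment_id, len(current.entries) - 1)

    def read_all(self) -> List[bytes]:
        """Read the whole stream by walking the segments in order."""
        return [entry for seg in self.segments for entry in seg.entries]


stream = SegmentedStream(max_entries_per_segment=3)
positions = [stream.append(f"event-{i}".encode()) for i in range(7)]
```

In this sketch only the last segment accepts writes; earlier segments are sealed and never change, which is what lets a reader treat the same data either as a stream or as a set of finished chunks.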

22. Access Patterns • Write • Tailing Read • Catchup Read (figure: a stream of Segment 1, Segment 2, Segment 3, Segment 4 along a time axis)

23. Access Patterns • Write • Tailing Read • Catchup Read (figure: the same stream, with Write marked)

24. Access Patterns • Write • Tailing Read • Catchup Read (figure: the same stream, with Write and Tailing Read marked)

25. Access Patterns • Write • Tailing Read • Catchup Read (figure: the same stream, with Write, Tailing Read, and Catchup Read marked)
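The three access patterns can be told apart by where they touch the stream. The sketch below is a toy model under stated assumptions: `Stream`, `classify_read`, and the `tail_window` threshold are hypothetical names chosen for illustration, not Pulsar APIs or settings. It assumes a write always appends at the tail, a tailing reader sits at or next to the tail, and a catchup reader starts from an older position.

```python
from typing import List


class Stream:
    """A toy append-only stream (hypothetical model, not a Pulsar API)."""

    def __init__(self) -> None:
        self.entries: List[str] = []

    def write(self, entry: str) -> int:
        """Write: always append at the tail and return the entry's offset."""
        self.entries.append(entry)
        return len(self.entries) - 1

    def read_from(self, offset: int) -> List[str]:
        """Return every entry from the given offset up to the tail."""
        return self.entries[offset:]


def classify_read(stream: Stream, cursor: int, tail_window: int = 1) -> str:
    """Label a reader by how far its cursor lags behind the tail.

    tail_window is an illustrative threshold, not a Pulsar setting.
    """
    lag = len(stream.entries) - cursor
    return "tailing" if lag <= tail_window else "catchup"


s = Stream()
for i in range(10):
    s.write(f"e{i}")

near_tail = classify_read(s, cursor=9)   # one entry behind the tail
far_behind = classify_read(s, cursor=2)  # eight entries behind the tail
```

A reader that starts as a catchup reader becomes a tailing reader once its cursor reaches the tail, so the label describes the reader's current position rather than a fixed type.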

26. Write (figure: Kafka vs. Pulsar) • Kafka: three brokers, each holding the partition as Leader, Follower, Follower • Pulsar: three brokers

27. Tailing Read (figure: Kafka vs. Pulsar) • Kafka: three brokers, each holding the partition as Leader, Follower, Follower • Pulsar: three brokers

28. Catchup Read (figure: Kafka vs. Pulsar) • Kafka: three brokers, each holding the partition as Leader, Follower, Follower • Pulsar: three brokers

29. IO Isolation (figure: Kafka vs. Pulsar) • Kafka: three brokers, each holding the partition as Leader, Follower, Follower • Pulsar: three brokers