Transaction Preview of Apache Pulsar
1 .Transaction Support in Pulsar • Penghui Li, Apache Pulsar PMC Member • Yong Zhang, Apache Pulsar Contributor
2 .What is Apache Pulsar?
3 .Pub/Sub Messaging
4 . “Flexible Pub/Sub messaging backed by durable log/stream storage”
5 .
6 .• 2012: Pulsar idea started at Yahoo! • 5 years in production, 100+ applications, 10+ data centers • 2016/09: Yahoo open sourced Pulsar • 2017/06: Yahoo donated Pulsar to ASF • 2018/09: Pulsar graduated as a Top-Level project • 2018/09: InfoWorld Best Open Source Project
7 .Pulsar Community
8 .Pulsar Community
9 .Messaging Semantics • At-most once • At-least once • Exactly once
10 .Messaging Semantics • At-most once Before 1.20.0-incubating • At-least once • Exactly once
11 .Messaging Semantics • At-most once • At-least once • Exactly once PIP-6: Guaranteed Message Deduplication
12 .Revisit Existing Semantics
13 .Pulsar’s Existing Semantics send(m1) Log Producer Broker
14 .Pulsar’s Existing Semantics append(m1) Log Producer Broker
15 .Pulsar’s Existing Semantics m1 Log Producer Broker
16 .Pulsar’s Existing Semantics m1 ack(m1) Log Producer Broker
17 .Pulsar’s Existing Semantics m1 ack(m1) Log Producer Broker
18 .Pulsar’s Existing Semantics m1 send(m2) Log Producer Broker
19 .Pulsar’s Existing Semantics m1 m2 append(m2) Log Producer Broker
20 .Pulsar’s Existing Semantics m1 m2 ack(m2) Log Producer Broker
21 .Pulsar’s Existing Semantics What do we do now? m1 m2 ack(m2) Log Producer Broker
22 .At Least Once m1 m2 send(m2) Log Producer Broker
23 .At Least Once m1 m2 m2 append(m2) Log Producer Broker
24 .At Least Once Duplicates !! m2 m2 m1 append(m2) Log Producer Broker
25 .Why are duplicates introduced? • Broker can fail • The request from Producer to Broker can fail • Producer or Consumer can fail
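The duplicate in the preceding slides can be reproduced with a small simulation. This is only a sketch under stated assumptions: `Broker`, `send_with_retry`, and the `drop_next_ack` fault-injection switch are hypothetical names for illustration, not part of the Pulsar client or broker API. The broker appends m2, the ack is lost on the way back, and the producer resends because it cannot tell whether the append happened.

```python
class Broker:
    """Toy broker that appends every message it receives to a log."""

    def __init__(self):
        self.log = []
        self.drop_next_ack = False  # hypothetical fault-injection switch

    def send(self, message):
        # The append always happens before the ack is sent back.
        self.log.append(message)
        if self.drop_next_ack:
            self.drop_next_ack = False
            return False  # the ack never reaches the producer
        return True


def send_with_retry(broker, message, max_attempts=3):
    """Resend until an ack arrives: this is at-least-once delivery."""
    for _ in range(max_attempts):
        if broker.send(message):
            return True
    return False


broker = Broker()
send_with_retry(broker, "m1")
broker.drop_next_ack = True  # ack(m2) is lost on the first attempt
send_with_retry(broker, "m2")
print(broker.log)  # ['m1', 'm2', 'm2']
```

The retry is what guarantees delivery, and it is also what introduces the duplicate: without extra information, the broker cannot distinguish a resend from a new message.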
26 .I want exactly-once
27 .Message Deduplication • Producer: Idempotent Producer • Broker: Guaranteed Message Deduplication (PIP-6) • Consumer: Reader + Checkpoints (Flink / Spark)
28 .Idempotent Producer • Producer Name - Identify who is producing the messages • Sequence ID - Identify the message • Producer Name + Sequence ID: The unique identifier for a message
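A minimal sketch of the identifier scheme described above, assuming sequence IDs start at 0 and increase by one per message. `Message`, `IdempotentProducer`, and `message_key` are hypothetical names chosen for illustration, not the Pulsar client API. The point it shows is that a resend carries the same (producer name, sequence ID) pair as the original attempt.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Message:
    producer_name: str  # identifies who is producing the message
    sequence_id: int    # identifies the message within that producer
    payload: str

    @property
    def message_key(self):
        # Producer name + sequence ID: the unique identifier for a message.
        return (self.producer_name, self.sequence_id)


class IdempotentProducer:
    def __init__(self, name):
        self.name = name
        self._next_sequence_id = count(0)  # assumption: IDs start at 0

    def new_message(self, payload):
        return Message(self.name, next(self._next_sequence_id), payload)


producer = IdempotentProducer("producer-a")
m1 = producer.new_message("hello")
m2 = producer.new_message("world")
retry_of_m2 = m2  # a resend reuses the same message, hence the same key

print(m1.message_key, m2.message_key, retry_of_m2.message_key == m2.message_key)
```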
29 .Guaranteed Message Deduplication • Broker maintains a map between Producer Name and Last-Produced-Sequence-ID • Broker accepts a message if its sequence ID is larger than the last produced sequence ID for that producer • Broker treats messages whose sequence ID is not larger as duplicates • Broker keeps the map in a de-duplication cursor (stored in BookKeeper)
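The broker-side check can be sketched as follows. This is an illustration under assumptions, not the broker's actual code: `DedupBroker` is a hypothetical class, and a plain in-memory dict stands in for the de-duplication cursor that the slide says is stored in BookKeeper.

```python
class DedupBroker:
    """Toy broker that drops messages it has already appended."""

    def __init__(self):
        self.log = []
        # Producer name -> last produced sequence ID.
        # Assumption: an in-memory dict replaces the durable cursor.
        self.last_sequence_id = {}

    def send(self, producer_name, sequence_id, payload):
        """Return True if the message was appended, False if it was a duplicate."""
        last = self.last_sequence_id.get(producer_name, -1)
        if sequence_id <= last:
            return False  # not larger than the last produced ID: duplicate
        self.log.append(payload)
        self.last_sequence_id[producer_name] = sequence_id
        return True


dedup_broker = DedupBroker()
first = dedup_broker.send("producer-a", 0, "m1")
second = dedup_broker.send("producer-a", 1, "m2")
resend = dedup_broker.send("producer-a", 1, "m2")  # retry after a lost ack
print(dedup_broker.log)  # ['m1', 'm2']
```

Because the map is keyed by producer name, a resend only deduplicates correctly if the producer keeps the same name and sequence IDs across retries and restarts.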