Transaction support in Pulsar
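Pulsar currently supports exactly-once semantics within a single partition through the Idempotent Producer. The Idempotent Producer guarantees that a message sent by a producer is persisted exactly once, with no loss. However, when a producer sends messages to multiple partitions, the sends are not guaranteed to be atomic. Likewise, when Pulsar Functions processes multiple events or writes a set of results to different topic partitions, the computation is not guaranteed to be atomic. PIP-31 addresses these scenarios by adding support for transactions.
In this talk, Sijie Guo and Yong Zhang explain in detail the transaction support in Pulsar 2.5.0.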
2 .Transaction Support in Pulsar Sijie Guo Apache Pulsar / BookKeeper PMC Member Yong Zhang Apache Pulsar Contributor
3 .Messaging Semantics • At-most once • At-least once • Exactly once
4 .Messaging Semantics • At-most once Before 1.20.0-incubating • At-least once • Exactly once
5 .Messaging Semantics • At-most once • At-least once • Exactly once PIP-6: Guaranteed Message Deduplication
6 .Revisit Existing Semantics
7 .Pulsar’s Existing Semantics send(m1) Log Producer Broker
8 .Pulsar’s Existing Semantics append(m1) Log Producer Broker
9 .Pulsar’s Existing Semantics m1 Log Producer Broker
10 .Pulsar’s Existing Semantics m1 ack(m1) Log Producer Broker
11 .Pulsar’s Existing Semantics m1 ack(m1) Log Producer Broker
12 .Pulsar’s Existing Semantics m1 send(m2) Log Producer Broker
13 .Pulsar’s Existing Semantics m1 m2 append(m2) Log Producer Broker
14 .Pulsar’s Existing Semantics m1 m2 ack(m2) Log Producer Broker
15 .Pulsar’s Existing Semantics What do we do now? m1 m2 ack(m2) Log Producer Broker
16 .At Least Once m1 m2 send(m2) Log Producer Broker
17 .At Least Once m1 m2 m2 append(m2) Log Producer Broker
18 .At Least Once Duplicates !! m2 m2 m1 append(m2) Log Producer Broker
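The retry sequence on the preceding slides can be reproduced with a small in-memory simulation. This is only a sketch: `Broker`, `send_with_retry`, and the `drop_ack_for` fault-injection hook are hypothetical names chosen for illustration, not Pulsar APIs, and the "log" is a plain Python list.

```python
class Broker:
    """Toy broker that appends every received message to its log (no deduplication)."""

    def __init__(self, drop_ack_for=None):
        self.log = []
        # Hypothetical fault-injection hook: payloads whose first ack gets lost.
        self._drop_ack_for = set(drop_ack_for or [])

    def send(self, payload):
        self.log.append(payload)  # the append itself always succeeds
        if payload in self._drop_ack_for:
            self._drop_ack_for.discard(payload)
            return False  # the ack is lost on its way back to the producer
        return True  # the ack reaches the producer


def send_with_retry(broker, payload, max_attempts=3):
    """At-least-once producer: resend the message until an ack arrives."""
    for _ in range(max_attempts):
        if broker.send(payload):
            return
    raise RuntimeError("no ack received")


broker = Broker(drop_ack_for=["m2"])
send_with_retry(broker, "m1")
send_with_retry(broker, "m2")  # first ack is lost, so m2 is sent twice
print(broker.log)  # ['m1', 'm2', 'm2']
```

The producer cannot tell a lost ack from a lost message, so resending is the only safe choice, and the log ends up with m2 twice.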
19 .Why are duplicates introduced? • Broker can fail • The request from Producer to Broker can fail • Producer or Consumer can fail
20 .I want exactly-once
21 .Message Deduplication • Producer: Idempotent Producer • Broker: Guaranteed Message Deduplication (PIP-6) • Consumer: Reader + Checkpoints (Flink / Spark)
22 .Idempotent Producer • Producer Name - Identify who is producing the messages • Sequence ID - Identify the message • Producer Name + Sequence ID: The unique identifier for a message
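A minimal sketch of the message identity described on this slide. `Message` and `IdempotentProducer` are hypothetical classes, not the Pulsar client API, and starting the sequence at 1 is an assumption made to match the numbering used on the later slides.

```python
import itertools
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    producer_name: str  # identifies who is producing the message
    sequence_id: int    # identifies the message within that producer
    payload: str

    @property
    def identity(self):
        """Producer name + sequence ID: the unique identifier for a message."""
        return (self.producer_name, self.sequence_id)


class IdempotentProducer:
    """Assigns a monotonically increasing sequence ID to each new message."""

    def __init__(self, name):
        self.name = name
        self._next_id = itertools.count(1)  # assumption: IDs start at 1

    def new_message(self, payload):
        return Message(self.name, next(self._next_id), payload)


producer = IdempotentProducer("producer-a")
m1 = producer.new_message("m1")
m2 = producer.new_message("m2")
retry_of_m2 = m2  # a retry resends the same message, so its identity is unchanged
```

Because a retry reuses the original sequence ID instead of drawing a new one, the receiving side has enough information to recognise the resend.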
23 .Guaranteed Message Deduplication • Broker maintains a map between Producer Name and Last-Produced-Sequence-ID • Broker accepts a message if its sequence ID is larger than the last produced sequence ID for that producer • Broker treats messages whose sequence ID is smaller than or equal to the last produced sequence ID as duplicates • Broker keeps the map in a de-duplication cursor (stored in BookKeeper)
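The broker-side rule on this slide can be sketched as follows. `DedupBroker` is a hypothetical class, not Pulsar code. An in-memory dict stands in for the de-duplication cursor, so persistence to BookKeeper is not modelled, and the boolean return value is an assumption added only to make the outcome observable.

```python
class DedupBroker:
    """Toy broker that drops messages it has already persisted."""

    def __init__(self):
        self.log = []
        # Producer name -> last produced sequence ID.
        # Assumption: a plain dict replaces the persistent de-duplication cursor.
        self.last_sequence_id = {}

    def send(self, producer_name, sequence_id, payload):
        """Append the message if it is new; return True if it was appended."""
        last = self.last_sequence_id.get(producer_name)
        if last is not None and sequence_id <= last:
            return False  # duplicate: already in the log, do not append again
        self.log.append((sequence_id, payload))
        self.last_sequence_id[producer_name] = sequence_id
        return True


broker = DedupBroker()
first = broker.send("producer-a", 1, "m1")
second = broker.send("producer-a", 2, "m2")  # suppose the ack for this send is lost
retry = broker.send("producer-a", 2, "m2")   # the producer resends with the same ID
print(broker.log)  # [(1, 'm1'), (2, 'm2')]
```

The resend of `(2, m2)` is recognised because its sequence ID is not larger than the last one recorded for `producer-a`, so the log holds each message once.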
24 .Exactly Once send(1, m1) Log Producer Broker
25 .Exactly Once 1, m1 append(1, m1) Log Producer Broker
26 .Exactly Once 2, m2 1, m1 append(2, m2) Log Producer Broker
27 .Exactly Once What do we do now? 2, m2 1, m1 ack(2, m2) Log Producer Broker
28 .Exactly Once 2, m2 1, m1 send(2, m2) Log Producer Broker
29 .Exactly Once 2, m2 1, m1 append(2, m2) Log Producer Broker