Zhaopin in Pulsar community

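It has been a year since Zhaopin (智联招聘) began evaluating Apache Pulsar. Over that year, Apache Pulsar has provided Zhaopin with a stable messaging service, carrying a daily message volume on the order of tens of billions.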

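Zhaopin actively participates in Pulsar community discussion and development and has contributed several new features, including the Key_Shared subscription mode, the HDFS Offloader, and many improvements around Pulsar Schema and Presto SQL.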

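In this talk, Penghui Li shares Zhaopin's experience and takeaways from participating in the Pulsar community, as well as his impressions of taking part in the Pulsar 2.4.0 release. In addition, Penghui Li and Bo Cong together present the features Zhaopin contributed to Pulsar, along with their usage and implementation principles.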


1.

2. Zhaopin in Pulsar community. Penghui Li (李鹏辉), messaging platform leader at zhaopin.com, Apache Pulsar committer

3. Our team: Penghui Li, Bo Cong

4. Apache Pulsar in zhaopin.com. Timeline: 2018/08 first service for online applications; 2018/10 1 billion / day; 2019/02 6 billion / day; 2019/08 20 billion / day. 50+ namespaces, 5000+ topics.

5. Outline
    1. Features zhaopin contributed to the community
    2. Details of Key_Shared subscription
    3. Releasing Pulsar
    4. Details of Pulsar multiple schema versions
    5. Details of HDFS Offloader

6. Dead letter topic. [Diagram: Topic, Consumer, Topic-DLQ. Message 1 fails processing too many times and is routed to Topic-DLQ.]
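The dead letter flow on this slide can be sketched roughly as follows. This is a minimal, self-contained illustration and not Pulsar's implementation: the class `DeadLetterSketch`, its method names, and the exact threshold rule (a message is dead-lettered once its failure count exceeds the limit) are assumptions made for the example. In the Pulsar Java client the corresponding knob is a dead letter policy with a maximum redeliver count.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for consumer-side dead letter handling; not a Pulsar class.
class DeadLetterSketch {
    private final int maxRedeliverCount;
    // Failure count per message id.
    private final Map<String, Integer> failures = new HashMap<>();
    // Stand-in for the "Topic-DLQ" topic on the slide.
    final List<String> deadLetterTopic = new ArrayList<>();

    DeadLetterSketch(int maxRedeliverCount) {
        this.maxRedeliverCount = maxRedeliverCount;
    }

    // Records one failed processing attempt. Returns true if the message should be
    // redelivered, false if it was moved to the dead letter topic instead.
    boolean onProcessFailed(String messageId) {
        int count = failures.merge(messageId, 1, Integer::sum);
        if (count > maxRedeliverCount) {
            failures.remove(messageId);
            deadLetterTopic.add(messageId);
            return false;
        }
        return true;
    }
}
```

With a limit of 2, the first two failures of a message ask for redelivery and the third one moves it to the dead letter list.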

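7. Client interceptors. [Diagram: Producer, Topic, Consumer, with interceptor hooks labeled Send and Send Ack on the producer side and Receive and Acknowledge on the consumer side.]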

8. Time partitioned un-ack message tracker. [Diagram: time partitions p-0 to p-4. Messages are added to the tracker in the current partition; when a partition times out, a redelivery request is sent.]
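One way to picture the time-partitioned tracker is a queue of buckets that rotates on every tick. The sketch below is an assumption-laden illustration, not the client's actual code: the class `TimePartitionedTrackerSketch`, the `tick()` method, and the use of plain string ids are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of an un-acked message tracker split into time partitions.
class TimePartitionedTrackerSketch {
    // Oldest partition at the head, current partition at the tail.
    private final ArrayDeque<Set<String>> partitions = new ArrayDeque<>();
    // Message id -> partition holding it, so an ack does not scan every partition.
    private final Map<String, Set<String>> index = new HashMap<>();

    TimePartitionedTrackerSketch(int partitionCount) {
        if (partitionCount < 1) {
            throw new IllegalArgumentException("need at least one partition");
        }
        for (int i = 0; i < partitionCount; i++) {
            partitions.addLast(new HashSet<>());
        }
    }

    // A newly received, not yet acknowledged message goes into the current partition.
    void add(String messageId) {
        Set<String> current = partitions.peekLast();
        if (index.putIfAbsent(messageId, current) == null) {
            current.add(messageId);
        }
    }

    // An acknowledged message is dropped from whichever partition holds it.
    boolean ack(String messageId) {
        Set<String> partition = index.remove(messageId);
        return partition != null && partition.remove(messageId);
    }

    // Called once per time slice: the oldest partition times out, its remaining
    // messages are returned for a redelivery request, and a fresh current
    // partition is appended.
    Set<String> tick() {
        Set<String> expired = partitions.pollFirst();
        for (String id : expired) {
            index.remove(id);
        }
        partitions.addLast(new HashSet<>());
        return expired;
    }
}
```

The point of the partitioning is that a timeout check handles one whole bucket per tick instead of inspecting a timestamp on every tracked message.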

9. Message redelivery optimization. [Diagram: Broker holding messages 0 to 8; Consumer with its internal queue shown as 0 1 2 3 0 4 5 6 7.]

10. Key_Shared subscription: a new subscription mode in 2.4.0. [Diagram: Producer 1 and Producer 2 publish <k1,v0>, <k2,v1>, <k3,v2>, <k2,v3>, <k1,v4> to a Pulsar topic; the subscription dispatches them to Consumer D-1, Consumer D-2, and Consumer D-3 so that messages with the same key go to the same consumer.]

11. Start with Key_Shared subscription
    Consumer<byte[]> consumer = client.newConsumer()
        .topic("my-topic")
        .subscriptionName("my-sub")
        .subscriptionType(SubscriptionType.Key_Shared)
        .subscribe();

12. How Key_Shared subscription works. Sticky key dispatcher (auto split hash range). [Diagram: messages 0 to 9; hash range 0 to 65536 owned by Consumer-1.]

13. How Key_Shared subscription works. Sticky key dispatcher (auto split hash range). [Diagram: messages 0 to 9; hash range 0 to 65536 split between Consumer-2 and Consumer-1.]

14. How Key_Shared subscription works. Sticky key dispatcher (auto split hash range). [Diagram: messages 0 to 9; hash range 0 to 65536 split among Consumer-2, Consumer-3, and Consumer-1.]

15. How Key_Shared subscription works. Sticky key dispatcher (auto split hash range). [Diagram: messages 0 to 9; hash range 0 to 65536 split between Consumer-4 and Consumer-1.]

16. How Key_Shared subscription works. Sticky key dispatcher (auto split hash range). [Diagram: messages 0 to 9; hash range 0 to 65536 split among Consumer-4, Consumer-3, and Consumer-1.]
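The range splitting shown on these slides can be approximated with a sorted map from range end to consumer. The sketch below is illustrative only and rests on several assumptions: the class `HashRangeSplitSketch` is hypothetical, a joining consumer is assumed to take the lower half of the largest existing range, `String.hashCode()` stands in for whatever hash function the broker uses, and consumer removal is not covered.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of a sticky key selector that splits a fixed hash range.
class HashRangeSplitSketch {
    static final int RANGE_SIZE = 65536;
    // Exclusive range end -> owning consumer. Each range starts at the previous end (or 0).
    private final TreeMap<Integer, String> rangeEnds = new TreeMap<>();

    void addConsumer(String name) {
        if (rangeEnds.isEmpty()) {
            rangeEnds.put(RANGE_SIZE, name); // first consumer owns [0, 65536)
            return;
        }
        // Find the largest existing range.
        int start = 0;
        int bestStart = 0;
        int bestEnd = 0;
        for (int end : rangeEnds.keySet()) {
            if (end - start > bestEnd - bestStart) {
                bestStart = start;
                bestEnd = end;
            }
            start = end;
        }
        if (bestEnd - bestStart < 2) {
            throw new IllegalStateException("no range left to split");
        }
        // The new consumer takes the lower half; the previous owner keeps the upper half.
        rangeEnds.put(bestStart + (bestEnd - bestStart) / 2, name);
    }

    // Owner of a hash slot in [0, RANGE_SIZE).
    String ownerOfSlot(int slot) {
        Map.Entry<Integer, String> entry = rangeEnds.higherEntry(slot);
        return entry.getValue();
    }

    // The same key always hashes to the same slot, hence to the same consumer
    // for as long as the ranges do not change.
    String select(String key) {
        return ownerOfSlot(Math.floorMod(key.hashCode(), RANGE_SIZE));
    }
}
```

Because only one range is split when a consumer joins, keys owned by the other consumers keep their assignment, which is what makes the dispatcher "sticky".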

17. Key-based message batcher. [Diagram: keyed messages <k2,v1> <k4,v0> <k6,v1> <k2,v0> <k3,v1> <k6,v0> <k1,v1> <k3,v0> <k5,v0> across partitions p-0, p-1, p-2.]

18. Key-based message batcher. [Diagram: keyed messages <k2,v1> <k4,v0> <k6,v1> <k2,v0> <k6,v0> <k3,v1> <k1,v1> <k3,v0> <k5,v0> across partitions p-0, p-1, p-2.]

19. Use Key-based message batcher (the batcher is set on the producer builder)
    Producer<byte[]> producer = client.newProducer()
        .topic("my-topic")
        .batcherBuilder(BatcherBuilder.KEY_BASED)
        .create();
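The idea behind a key-based batcher is that every batch carries messages of a single key, so a Key_Shared dispatcher can route a whole batch to one consumer. A minimal sketch of that grouping step, assuming a hypothetical `KeyBasedBatcherSketch` class and messages modeled as key/value string pairs (this is not the client's batching code):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: group pending messages into one batch per key.
class KeyBasedBatcherSketch {
    // Returns key -> values in send order; keys appear in order of first occurrence.
    static Map<String, List<String>> batchByKey(List<Map.Entry<String, String>> pending) {
        Map<String, List<String>> batches = new LinkedHashMap<>();
        for (Map.Entry<String, String> message : pending) {
            batches.computeIfAbsent(message.getKey(), k -> new ArrayList<>())
                   .add(message.getValue());
        }
        return batches;
    }
}
```

For example, pending messages <k1,v0>, <k2,v1>, <k1,v4> yield two batches: one for k1 holding v0 and v4, and one for k2 holding v1. Per-key ordering is preserved inside each batch.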

20. Pulsar SQL improvements
    ✓ Namespace delimiter rewriter
    ✓ Partition as internal column
    ✓ Primitive schema handle
    ➡ Multiple version schemas handle

21. Some other improvements
    ✓ Service URL provider
    ✓ Consumer reconnect limiter
    ➡ Messages batch receive

22. Next
    ★ Topic level policy
    ★ Sticky consumer

23. 2.4.0 Release
    1. New branch and tag
    2. Stage release (check -> sign -> stage)
    3. Move master to new version and write release notes
    4. Start vote
    5. Promote release and publish
    6. Update site and announce the release

24. Schema versioning & HDFS offloader. Bo Cong (丛搏), messaging platform engineer at zhaopin.com, Apache Pulsar contributor

25. The meaning of multi-version schema. A message's schema is not immutable. [Diagram: messages 1 to 5, written with schema versions 0 to 4.]

26. Problems caused by version changes
    Version 0:
        class Person {
            int id;
        }
    Version 1 (can read Version 0, because name has a default):
        class Person {
            int id;
            @AvroDefault("\"Zhang San\"")
            String name;
        }
    Version 2 (can read Version 1, can't read Version 0, because name has no default):
        class Person {
            int id;
            String name;
        }

27. Change in compatibility policy. Backward: version 2 can read version 1, and version 1 can read version 0. Backward Transitive: version 2 can read version 1 and version 0, and version 1 can read version 0.
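The difference between the two policies can be shown with a deliberately simplified check. In the sketch below a schema is reduced to a map from field name to "has a default value"; field types are ignored, and the class `SchemaCompatSketch` and its methods are hypothetical rather than part of Pulsar or Avro. The rule assumed here is that a reader schema can read data from a writer schema if every reader field either exists in the writer or has a default.

```java
import java.util.List;
import java.util.Map;

// Hypothetical, simplified compatibility check; field types are ignored.
class SchemaCompatSketch {
    // Schema model: field name -> true if the field declares a default value.
    static boolean canRead(Map<String, Boolean> reader, Map<String, Boolean> writer) {
        for (Map.Entry<String, Boolean> field : reader.entrySet()) {
            boolean missingInWriter = !writer.containsKey(field.getKey());
            boolean hasDefault = field.getValue();
            if (missingInWriter && !hasDefault) {
                return false;
            }
        }
        return true;
    }

    // Backward: the new schema only has to read the latest existing version.
    static boolean backward(Map<String, Boolean> candidate, List<Map<String, Boolean>> history) {
        return history.isEmpty() || canRead(candidate, history.get(history.size() - 1));
    }

    // Backward transitive: the new schema has to read every existing version.
    static boolean backwardTransitive(Map<String, Boolean> candidate,
                                      List<Map<String, Boolean>> history) {
        for (Map<String, Boolean> old : history) {
            if (!canRead(candidate, old)) {
                return false;
            }
        }
        return true;
    }
}
```

Using the Person example: version 0 is {id}, version 1 is {id, name with default}, version 2 is {id, name without default}. Version 2 passes the backward check against the history [v0, v1] because it can read v1, but it fails the backward transitive check because it cannot read v0.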

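28. Schema creation process. [Diagram: schema data arrives through the admin client API, the admin REST API, producer create, or consumer subscribe; SchemaRegistryService runs a compatibility check against the old schema; an incompatible schema is rejected, otherwise a new schema version is created.]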

29. Multi-version use in Pulsar: Avro schema. [Diagram: messages 1 to 3 written with different schema versions (versions 0 to 3 shown), all read by one consumer.]