Building a Messaging Solution for OVHcloud with Apache Pulsar - Pierre Zemb


1.Building a Messaging Solution for OVHcloud with Apache Pulsar. Pierre Zemb, Technical Leader. Pulsar Summit 2020

2.$ whoami ● Pierre Zemb (@PierreZ) ● Technical Leader ● Working on distributed systems ● Apache contributor ○ HBase, Flink, Pulsar ● Involved in local dev communities

3.Schedule 1. What is OVHcloud? 2. The need for a messaging solution 3. The choice of Apache Pulsar 4. Overview of our infrastructure 5. Overview of our management layer 6. The quest to support Apache Kafka 7. Our ideas for the future

4.OVHcloud, a Global Cloud Provider ● 30 data centers globally ● Our own high-quality global network, committed to the highest security standards ● NSX and vRack: secure your platform with micro-segmentation of private L2 networks that span global data centers ● SSL Gateway Service: up to 10,000 concurrent connections, with optional Anycast DNS service ● Highest compliance and certification standards ● Anti-DDoS: highly resilient Layer 4-7 DDoS protection built into the network

5.Providing a platform Compute

6.Providing a platform Compute messaging?

7.Let’s build a messaging solution! Been there... ● OVHcloud started a beta called “Queue as a service” in 2015 ● Based on Apache Kafka ● Multi-tenant cluster ● Beta closed in 2018 ● Massively used internally

8.What we learned from Queue as a Service From users: ● Users want not only Kafka, but queuing as well: ○ RabbitMQ ○ MQTT ○ ... ● They want support for old versions of Kafka’s protocol ● Data encryption?

9.What we learned from Apache Kafka From us: ● No built-in {multi-tenancy, geo-replication} ● Creating a topic is not cost-free ● Infinite retention isn't possible ● No tiered storage ● Operations are not very convenient ○ we cannot "just" scale storage ○ a consumer reading old data can slow down the whole broker

10.What we learned from Apache Kafka Disclaimer: ● We ♥ Apache Kafka ● We have far more messages in Kafka than in Pulsar within OVHcloud ● For certain use cases, we need an alternative

11.Let’s build a messaging solution! ● Messaging solution: what we are exposing to customers (Pulsar, Kafka, RabbitMQ, ...) ● Messaging system provider: what we choose as an infrastructure

12.Let’s build a messaging solution! Requirements for the foundation of a messaging solution: ● has multi-tenancy ● can be used for queuing and streaming ● can be easily extended ● has a lower operational cost at scale


14. Apache Pulsar’s TL;DR ❏ What Pulsar Provides ✓ Multi-Tenancy ✓ Security ✓ TLS Encryption ✓ Authentication, Authorization ✓ Geo-replication ✓ Queuing and streaming semantics ✓ Tiered storage ✓ Schema ✓ Integrations with big data ecosystem (Flink / Spark / Presto)
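To make "queuing and streaming semantics" concrete, here is a toy model in plain Python (it does not use the Pulsar client). It contrasts an Exclusive subscription, where a single consumer receives every message in order, with a Shared subscription, where messages are spread round-robin over the attached consumers. The function and consumer names are illustrative assumptions, not Pulsar APIs.

```python
from itertools import cycle
from typing import Dict, List


def dispatch_exclusive(messages: List[str], consumer: str) -> Dict[str, List[str]]:
    """Streaming-style: one consumer sees the whole ordered log."""
    return {consumer: list(messages)}


def dispatch_shared(messages: List[str], consumers: List[str]) -> Dict[str, List[str]]:
    """Queuing-style: each message goes to exactly one consumer, round-robin."""
    delivered: Dict[str, List[str]] = {name: [] for name in consumers}
    for message, name in zip(messages, cycle(consumers)):
        delivered[name].append(message)
    return delivered


if __name__ == "__main__":
    msgs = ["m0", "m1", "m2", "m3", "m4"]
    print(dispatch_exclusive(msgs, "reader"))
    print(dispatch_shared(msgs, ["worker-a", "worker-b"]))
```

The same topic can serve both patterns at once, since the delivery mode belongs to the subscription rather than to the topic.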

15.Let's deploy Apache Pulsar! 🚀

16. Our deployment (diagram: haproxy above pulsar-proxy, above pulsar-broker)

17.Bookkeeper's tuning ● Enabled: ○ Z Garbage Collector, also known as ZGC ○ Prometheus exporter ● Configured: ○ multiple journalDirectory to better exploit SSD throughput ○ one ledgerDirectory per HDD
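The tuning above could translate into bookie settings along these lines. This is a minimal sketch that renders configuration lines from lists of mount points. The mount paths are hypothetical. `journalDirectories`, `ledgerDirectories` and `statsProviderClass` are existing BookKeeper settings, but the values actually used at OVHcloud are not given in the slides.

```python
from typing import List

# JVM flags to enable ZGC. The "unlock" flag is only needed on JDK versions
# where ZGC is still experimental (before JDK 15).
ZGC_JVM_FLAGS = "-XX:+UnlockExperimentalVMOptions -XX:+UseZGC"

# Stats provider that exposes bookie metrics in Prometheus format.
PROMETHEUS_PROVIDER = "org.apache.bookkeeper.stats.prometheus.PrometheusMetricsProvider"


def bookie_storage_conf(journal_dirs: List[str], ledger_dirs: List[str]) -> str:
    """Render the storage-related lines of a bookie configuration file."""
    if not journal_dirs or not ledger_dirs:
        raise ValueError("at least one journal and one ledger directory are required")
    return "\n".join([
        # Several journal directories spread sequential writes over the SSDs.
        "journalDirectories=" + ",".join(journal_dirs),
        # One ledger directory per HDD.
        "ledgerDirectories=" + ",".join(ledger_dirs),
        "statsProviderClass=" + PROMETHEUS_PROVIDER,
    ])


if __name__ == "__main__":
    # Hypothetical mount points, for illustration only.
    print(bookie_storage_conf(
        journal_dirs=["/mnt/ssd0/journal", "/mnt/ssd1/journal"],
        ledger_dirs=["/mnt/hdd0/ledgers", "/mnt/hdd1/ledgers", "/mnt/hdd2/ledgers"],
    ))
```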

18.Pulsar's configuration ● Started with ○ 3 bookies to use when creating a ledger (ensemble) ○ 3 copies to store for each message (writeQuorum) ○ 2 guaranteed copies (ackQuorum) ● Now running 4/2/2 layouts ○ Increases write striping
Lesson learned: avoid having the ensemble size equal to the number of bookies
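The effect of the two layouts can be sketched with BookKeeper's round-robin placement, where entry `n` is written to `writeQuorum` consecutive bookies of the ensemble starting at index `n % ensemble`. The model below is simplified (one ledger, no failures) and the helper names are illustrative. The last helper captures one plausible reading of the lesson learned: if every bookie is already part of the ensemble, there is no spare bookie to swap in when one fails.

```python
from typing import List


def write_set(entry_id: int, ensemble: int, write_quorum: int) -> List[int]:
    """Indexes (within the ensemble) of the bookies that store one entry."""
    return [(entry_id + i) % ensemble for i in range(write_quorum)]


def entries_per_bookie(n_entries: int, ensemble: int, write_quorum: int) -> List[int]:
    """Count how many entry copies each bookie of the ensemble receives."""
    counts = [0] * ensemble
    for entry_id in range(n_entries):
        for bookie in write_set(entry_id, ensemble, write_quorum):
            counts[bookie] += 1
    return counts


def has_spare_bookie(total_bookies: int, ensemble: int) -> bool:
    """True if a failed ensemble member can be replaced by another bookie."""
    return total_bookies > ensemble


if __name__ == "__main__":
    # 3/3/2: every bookie stores every entry, so there is no striping.
    print(entries_per_bookie(12, ensemble=3, write_quorum=3))  # [12, 12, 12]
    # 4/2/2: each bookie stores half of the entries.
    print(entries_per_bookie(12, ensemble=4, write_quorum=2))  # [6, 6, 6, 6]
    print(has_spare_bookie(total_bookies=3, ensemble=3))       # False
```

With 4/2/2, each bookie handles half of the write traffic of a ledger instead of all of it, at the cost of storing two copies per message rather than three.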

19.Some benchmark! Sending a small string as the value, as fast as we can, from 8 VMs to two partitions: 1.8 million msg/s per partition
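For scale, the figures on this slide work out as follows. The per-VM number assumes the 8 producer VMs contribute evenly, which the slide does not state.

```python
PER_PARTITION_RATE = 1_800_000  # msg/s per partition, from the slide
PARTITIONS = 2
PRODUCER_VMS = 8

total_rate = PER_PARTITION_RATE * PARTITIONS
per_vm_rate = total_rate // PRODUCER_VMS  # assumes an even spread across VMs

print(f"{total_rate:,} msg/s in total, about {per_vm_rate:,} msg/s per VM")
# prints: 3,600,000 msg/s in total, about 450,000 msg/s per VM
```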

20.Some benchmark! Bookkeeper outage

21.Lesson learned: learn Bookkeeper's CLI


23.Meet Bookkeeper's friend: the Auditor



26.Let's manage Apache Pulsar! 🚀

27. Our management layer ● create topic ● create tokens ● set retention ● ... (diagram labels: Sync, Management µservice)

28. Our management layer ● Written in Go ● Cluster-aware ● Push topic configuration to clusters ● Pull topic usage from clusters ● Generate valid JWT tokens
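The push and pull steps can be sketched against Pulsar's admin REST API. The actual service is written in Go and its code is not shown in the slides, so this is only an illustration in Python. The cluster URLs, tenant, namespace and token below are hypothetical, and the requests are built but never sent. Retention is set through the namespace-level endpoint here, which is an assumption about how the service applies it.

```python
import json
import urllib.request
from typing import Dict, List, Optional

# Hypothetical admin endpoints, one per cluster.
CLUSTERS: Dict[str, str] = {
    "cluster-a": "http://pulsar-a.example:8080",
    "cluster-b": "http://pulsar-b.example:8080",
}


def _request(method: str, url: str, token: str,
             body: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) an authenticated admin request."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    headers = {"Authorization": "Bearer " + token}
    if data is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def push_topic(base_url: str, tenant: str, namespace: str, topic: str,
               retention_minutes: int, retention_mb: int,
               token: str) -> List[urllib.request.Request]:
    """Requests that create a topic and set retention on its namespace."""
    ns = f"{tenant}/{namespace}"
    return [
        _request("PUT", f"{base_url}/admin/v2/persistent/{ns}/{topic}", token),
        _request("POST", f"{base_url}/admin/v2/namespaces/{ns}/retention", token,
                 {"retentionTimeInMinutes": retention_minutes,
                  "retentionSizeInMB": retention_mb}),
    ]


def pull_usage(base_url: str, tenant: str, namespace: str, topic: str,
               token: str) -> urllib.request.Request:
    """Request that fetches the stats of one topic."""
    url = f"{base_url}/admin/v2/persistent/{tenant}/{namespace}/{topic}/stats"
    return _request("GET", url, token)


if __name__ == "__main__":
    for name, base_url in CLUSTERS.items():
        for req in push_topic(base_url, "tenant-a", "default", "events",
                              retention_minutes=60, retention_mb=1024,
                              token="hypothetical-token"):
            # Passing req to urllib.request.urlopen would send it.
            print(name, req.get_method(), req.full_url)
```

Pulling usage this way takes one stats call per topic per cluster, so its cost grows with the number of topics.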

29. Our management layer ● Written in Go ● Cluster-aware ● Push topic configuration to clusters ● Pull topic usage from clusters ● Generate valid JWT tokens
Lessons learned: pulling topic usage is costly; usage should be reported to the management layer instead (PIP?)
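Token generation can be illustrated with a standard-library sketch of an HS256-signed JWT. Pulsar's token authentication reads the role from the `sub` claim by default. The secret, role name and helper names are assumptions for the example, and the real service (written in Go) may well use asymmetric keys and a JWT library instead.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    """Base64url without padding, as used by JWT."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def make_token(secret: bytes, role: str, ttl_seconds: Optional[int] = None) -> str:
    """Create an HS256 JWT whose subject is the Pulsar role."""
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {"sub": role}
    if ttl_seconds is not None:
        claims["exp"] = int(time.time()) + ttl_seconds
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + _b64url(signature)


def verify_token(secret: bytes, token: str) -> dict:
    """Check signature and expiry, then return the claims."""
    signing_input, _, signature = token.rpartition(".")
    expected = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _b64url_decode(signature)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(signing_input.split(".")[1]))
    if "exp" in claims and claims["exp"] < time.time():
        raise ValueError("token expired")
    return claims


if __name__ == "__main__":
    token = make_token(b"hypothetical-secret", "tenant-a-producer", ttl_seconds=3600)
    print(token)
    print(verify_token(b"hypothetical-secret", token)["sub"])
```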
