Pulsar for Kafka People - Jesse Anderson


1. Pulsar For Kafka People
Copyright © 2020 Smoking Hand LLC. All Rights Reserved. Version: 82982da9

2. Pulsar For Kafka People
- Introducing Pulsar
- Architectures
- API and Programming Differences
- Use Cases

3. Apache Pulsar
Apache Pulsar is a distributed event streaming system.
- It uses a distributed log to durably store messages
- Pulsar was originally created at Yahoo
- Open sourced in 2016
- Graduated to a top-level Apache project in 2018

4. Apache Kafka
Apache Kafka is a distributed publish/subscribe system.
- It uses a commit log to track changes
- Kafka was originally created at LinkedIn
- Open sourced in 2011
- Graduated to a top-level Apache project in 2012

5. Pulsar For Kafka People
- Introducing Pulsar
- Architectures
- API and Programming Differences
- Use Cases

6. Basic Kafka Architecture
Publisher -> Kafka -> Subscriber
The publisher sends data and doesn't know about the subscribers or their status. The subscriber receives data from the publisher and never directly interacts with it. All interactions go through Kafka, and it handles all communication.

7. Basic Pulsar Architecture
Producer -> Pulsar -> Consumer
All brokers in the Pulsar cluster are stateless and can be scaled independently. All bookies in the BookKeeper cluster are stateful and can be scaled independently. Producers and consumers never directly interact with the BookKeeper cluster.

8. Kafka Partitions
Producer 0 / Producer 1 -> Topic (Partition 0, Partition 1, Partition 2)
All data is sent and received on topics. Topics group like data together. Data is divided into partitions. Partitions are both logical and physical divisions.
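When a record has a key, the key determines which partition it lands on, so records with the same key stay in order within one partition. A toy sketch of that routing: Kafka's real default partitioner uses a murmur2 hash of the serialized key, but plain `hashCode` is used here so the sketch runs standalone.

```java
// Toy sketch of keyed partition routing. Kafka's default partitioner
// actually hashes the serialized key with murmur2; hashCode keeps
// this example self-contained.
public class KeyPartitioner {
    // Map a record key to one of numPartitions partitions.
    public static int partitionFor(String key, int numPartitions) {
        // Mask off the sign bit so the hash is non-negative, then
        // take the remainder to land in [0, numPartitions).
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        // The same key always maps to the same partition.
        System.out.println(partitionFor("user-42", 3));
        System.out.println(partitionFor("user-42", 3));
    }
}
```

Because the mapping is deterministic, adding partitions later changes where keys land, which is why partition counts are usually chosen up front.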

9. Pulsar Ledgers
Each topic has its own stream, and all data for the topic is stored in it. As more data is added to the topic, new ledgers (Ledger 1, Ledger 2, Ledger 3, ...) are allocated to store the data. The individual messages that are produced are stored as records in the ledger (Record 1, Record 2, ...).
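The ledger rollover described above can be sketched as an append-only list of segments. This is a toy model: the rollover trigger here is a fixed entry count, whereas real BookKeeper ledgers roll over based on configurable time and size thresholds.

```java
import java.util.ArrayList;
import java.util.List;

// Toy sketch of a topic's stream as a sequence of ledgers (segments).
// A new ledger is allocated once the current one fills up.
public class SegmentedStream {
    private final int maxEntriesPerLedger;
    private final List<List<String>> ledgers = new ArrayList<>();

    public SegmentedStream(int maxEntriesPerLedger) {
        this.maxEntriesPerLedger = maxEntriesPerLedger;
        ledgers.add(new ArrayList<>()); // the first ledger
    }

    public void append(String record) {
        List<String> current = ledgers.get(ledgers.size() - 1);
        if (current.size() >= maxEntriesPerLedger) {
            // Current ledger is full: allocate a new one
            current = new ArrayList<>();
            ledgers.add(current);
        }
        current.add(record);
    }

    public int ledgerCount() {
        return ledgers.size();
    }

    public static void main(String[] args) {
        SegmentedStream stream = new SegmentedStream(2);
        for (int i = 0; i < 5; i++) {
            stream.append("record-" + i);
        }
        System.out.println(stream.ledgerCount());
    }
}
```

Segmenting the stream this way is what lets Pulsar spread a single topic's data across many bookies instead of pinning a whole partition to one broker's disk.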

10. Kafka Consumers
Producer 0 / Producer 1 -> Topic (Partition 0, Partition 1, Partition 2) -> Broker 0, Broker 1, Broker 2
The consumer (P012) receives data from all topic partitions and connects to brokers 0, 1, and 2.

11. Pulsar Subscriptions
Producer 0 / Producer 1 -> Topic (Partition 0, Partition 1, Partition 2) -> Broker 0, Broker 1, Broker 2
Subscription types: Failover Sub. (P012), Shared Sub. (P012), Key Shared (P012)
In failover, all partitions are consumed by one consumer, and consumption will fail over to a hot spare on failure. In shared, messages are sent in a round-robin way to all consumers. In key shared, messages with the same key are consistently routed to the same consumer.
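The three dispatch behaviors can be sketched as routing functions over a list of consumers. This is a toy model, not Pulsar's implementation: Pulsar's key-shared mode maps a key hash onto per-consumer hash ranges, whereas this sketch hashes directly onto consumer slots.

```java
// Toy model of Pulsar's subscription dispatch modes, routing each
// message to a consumer index in [0, numConsumers).
public class SubscriptionDispatch {
    // Failover: every message goes to the single active consumer
    // (index 0 here); the other consumers are hot spares.
    public static int failover(int numConsumers) {
        return 0;
    }

    // Shared: messages are spread round-robin across all consumers.
    public static int shared(long messageIndex, int numConsumers) {
        return (int) (messageIndex % numConsumers);
    }

    // Key shared: all messages with the same key reach the same consumer.
    public static int keyShared(String key, int numConsumers) {
        return (key.hashCode() & 0x7fffffff) % numConsumers;
    }

    public static void main(String[] args) {
        // Messages 0, 1, 2 fan out round-robin over 3 consumers.
        System.out.println(shared(0, 3));
        System.out.println(shared(1, 3));
        System.out.println(shared(2, 3));
    }
}
```

Shared trades per-key ordering for throughput; key shared keeps per-key ordering while still spreading load, which is why it suits work-queue-style consumers that care about keys.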

12. Pulsar For Kafka People
- Introducing Pulsar
- Architectures
- API and Programming Differences
- Use Cases

13. Kafka Producer API

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;
import static org.apache.kafka.clients.producer.ProducerConfig.*;

Properties props = new Properties();
// Configure brokers to connect to
props.put(BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");

// Create a producer with the key as a string and value as a string
KafkaProducer<String, String> producer = new KafkaProducer<>(props,
    new StringSerializer(), new StringSerializer());

// Create a ProducerRecord and send it
String key = "mykey";
String value = "myvalue";
ProducerRecord<String, String> record =
    new ProducerRecord<>("hello_topic", key, value);
producer.send(record);

producer.close();

14. Pulsar Producer API

PulsarClient client = PulsarClient.builder()
    .serviceUrl("pulsar://broker1:6650")
    .build();

// Create a producer that will send values as strings
// (the default is byte[])
Producer<String> producer = client
    .newProducer(Schema.STRING)
    .topic("hellotopic")
    .create();

// Create a new message, send it, and block until it is acknowledged
producer.newMessage()
    .key("mykey")
    .value("myvalue")
    .send();

// Create a new message and send it without blocking until it is
// acknowledged
producer.newMessage()
    .key("mykey2")
    .value("myvalue2")
    .sendAsync();

// Close the producer and client
producer.close();
client.close();

15. Kafka Consumer API (1/2)

import java.time.Duration;
import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;
import static org.apache.kafka.clients.consumer.ConsumerConfig.*;

public class HelloConsumer {
  KafkaConsumer<String, String> consumer;

  public void createConsumer() {
    String topic = "hello_topic";

    Properties props = new Properties();
    // Configure the initial bootstrap servers to connect to
    props.put(BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");
    // Configure the consumer group
    props.put(GROUP_ID_CONFIG, "group1");

    // Create the consumer with the key as a string and value as a string
    consumer = new KafkaConsumer<>(props,
        new StringDeserializer(), new StringDeserializer());

16. Kafka Consumer API (2/2)

    consumer.subscribe(Arrays.asList(topic));

    while (true) {
      // Poll for ConsumerRecords for a certain amount of time
      ConsumerRecords<String, String> records =
          consumer.poll(Duration.ofMillis(100));

      // Process the ConsumerRecords, if any, that came back
      for (ConsumerRecord<String, String> record : records) {
        String key = record.key();
        String value = record.value();
        // Do something with the message
      }
    }
  }

  public void close() {
    consumer.close();
  }

  public static void main(String[] args) {
    HelloConsumer consumer = new HelloConsumer();
    consumer.createConsumer();
    consumer.close();
  }
}

17. Pulsar Consumer API

PulsarClient client = PulsarClient.builder()
    .serviceUrl("pulsar://broker1:6650")
    .build();

String myTopic = "hellotopic";
String mySubscriptionName = "my-subscription";

// Create a consumer that will receive values as strings
// (the default is byte[])
Consumer<String> consumer = client.newConsumer(Schema.STRING)
    .topic(myTopic)
    .subscriptionName(mySubscriptionName)
    .subscribe();

while (true) {
  // Block and wait until a single message is available
  Message<String> message = consumer.receive();

  try {
    // Do something with the message
    System.out.println("Key is \"" + message.getKey()
        + "\" value is \"" + message.getValue() + "\"");

    // Acknowledge the message so that it can be
    // deleted by the message broker
    consumer.acknowledgeCumulative(message);
  } catch (Exception e) {
    // The message failed to process; redeliver it later
    consumer.negativeAcknowledge(message);
  }
}

18. Ecosystem Projects
Both projects have an ecosystem associated with them:
- Kafka Streams -> Pulsar Functions
- ksqlDB (proprietary) -> Pulsar SQL
- Kafka Connect -> Pulsar IO
- Kafka API compatibility for Pulsar
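A Pulsar Function, the Kafka Streams counterpart above, is written as a single per-message transformation: a class implementing `org.apache.pulsar.functions.api.Function` whose `process` method maps each input message to an output. The core logic is sketched here with plain Java so it runs without the Pulsar jars; the class name and behavior are illustrative.

```java
// Core logic of a minimal Pulsar Function. A real function would
// implement org.apache.pulsar.functions.api.Function<String, String>
// and process(input, context) would receive a Context argument; the
// transformation itself is the same.
public class ExclamationFunction {
    // process() is the per-message callback: one input in, one output out.
    // The output is published to the function's configured output topic.
    public static String process(String input) {
        return input + "!";
    }

    public static void main(String[] args) {
        System.out.println(process("hello"));
    }
}
```

Packaged as a jar, such a class is deployed with `pulsar-admin functions create`, pointing `--inputs` at a source topic and `--output` at a destination topic, and the broker-side functions runtime handles consuming, invoking, and producing.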

19. Pulsar For Kafka People
- Introducing Pulsar
- Architectures
- API and Programming Differences
- Use Cases

20. Kafka++
All Kafka use cases, plus more.

21. Work Queues
http://tiny.bdi.io/workqueues

22. Geo-Replication
Built-in geo-replication.
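Geo-replication in Pulsar is configured per namespace with `pulsar-admin`: a tenant is allowed to use a set of clusters, and a namespace is then replicated across them. A minimal sketch, assuming two clusters named us-west and us-east and a tenant/namespace named my-tenant/my-namespace (all names are illustrative):

```shell
# Allow the tenant to use both clusters
bin/pulsar-admin tenants create my-tenant \
  --allowed-clusters us-west,us-east

# Replicate every topic in the namespace across both clusters
bin/pulsar-admin namespaces set-clusters my-tenant/my-namespace \
  --clusters us-west,us-east
```

Once set, producers and consumers keep using their local cluster while the brokers replicate messages between regions asynchronously.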

23. Unified
Do both MQ-style and Pub/Sub-style messaging with the same cluster.

24. Lots of Topics
Supports millions of topics.

25. Thank You
bigdatainstitute.io

StreamNative is an open-source infrastructure software company building a next-generation streaming data platform around Apache Pulsar and Apache BookKeeper. Guided by two beliefs, that event streaming is the future cornerstone of big data and that open source is the future of infrastructure software, the company focuses on building the open-source ecosystem and community and is committed to advancing the state of the art.