Pulsar IO 运行原理与开发实践-guangning

展开查看详情

1.Pulsar IO 运⾏原理与开发实践 2020/01/04 SPEAKER ⼴宁

2.Who am I • Java && Python programmer • Creditease > Meituan > StreamNative • Pulsar committer ◦ Pulsar IO ◦ Pulsar Manager ◦… • https://github.com/tuteng

3.1.Function Worker 组件介绍 2.由 Function 演化到 Pulsar IO 3.Pulsar IO 介绍 4.连接 Pulsar 与 IoTDB

4.Function Worker 中的 各个组件

5.Function Worker

6.准备知识-订阅模式

7.准备知识-⽣产消费

8.Membership Leader Function Worker1 Function Worker2 Function Worker3 participant participant participant Failover Failover Failover coordinate Topic

9.Membership Manager 订阅到 coordinate Topic 使⽤ Failover 的订阅模式 选举 leader Consumer name 由 workerId, workerHostname, workerPort 组成

10.Metadata Manager Leader Function Worker1 Function Worker2 Function Worker3 reader reader reader Exclusive Exclusive Exclusive metadata Producer Topic User request

11.Metadata Manager create start Source get update list Function stop restart Sink status delete

12.Metadata Manager 初始化 Reader 设置订阅 metadata 这个 topic 设置从 earliest 开始消费 ⽤户的请求通过 Producer 被发出来

13.Scheduler Manager assignmen t Exclusive Exclusive Exclusive reader producer reader producer reader producer Function Worker1 Function Worker2 Function Worker3 Leader

14.Scheduler Manager 初始化 Producer 初始化消费者并订阅到 assignment Topic 只有 leader 可以进⾏调度 使⽤ RoundRobin 的调度算法

15.Runtime Manager Function Worker Process Golang Thread Python reader Kubernetes Java Runtime Instance assignmen t Topic

16.Function Election Worker Failover Failover coordinate Scheduler Exclusive Exclusive Function Worker1 assignmen Function Worker2 t consumer reader reader consumer Leader reader producer producer reader Exclusive Exclusive metadata Producer User request

17.Instance 接收输⼊ def process(input): 计算、处理 return "{}!".format(input) 输出

18.Function Input Compute Output Consumer Compute Producer

19.Function 使⽤ Consumer 从 Topic 中接收输⼊ 对数据进⾏计算 使⽤ Producer 将数据输出到 Topic 通过 Topic 连接多个 Function Consumer Comput Producer e Topic

20.Function

21. From Function to Source 将 Function 的 Consumer 去掉 开始接收外部系统的数据 经过计算,再将数据输出到 Pulsar 的 Topic External System Comput Topic Producer e Source

22.From Function to Sink 将 Function 的 Producer 去掉 开始接收外部系统的数据 经过计算,再将数据输出到外部系统中 Comput Topic Consumer External System e Sink

23.Source、Sink 数据的输⼊ 数据的输出

24.Pulsar IO 概念 Source、Sink、Connector PulsarSource、PulsarSink、IdentityFunction Consumer 与 Producer

25.Pulsar IO 概念 Source Sink Consumer Connector PulsarSource IdentityFunction IdentityFunction PulsarSink Connector Producer

26.Source 的初始化 组件以相反的顺序进⾏初始化 ⾃定义 Connector 组件根据类名称,通过反射加载 系统默认类 PulsarSink 是个 Producer PulsarSink IdentityFunction` Connector

27.Source 中数据的流动 从外部系统接收数据 流经 Function 到达 PulsarSink, 被发送到 Topic Connector IdentityFunction PulsarSink Queue

28.Sink 的初始化 组件以同样以相反的顺序进⾏初始化 ⾃定义 Connector 组件根据类名称,通过反射加载 系统默认类 PulsarSource 是个 Consumer Connector IdentityFunction PulsarSource

29.Sink 中数据的流动 PulsarSource 从 Topic 接收数据 数据流经 Function 到达⽤户⾃定义 Connector,输出到外部系统 PulsarSource IdentityFunction Connector Queue

StreamNative 是一家围绕 Apache Pulsar 和 Apache BookKeeper 打造下一代流数据平台的开源基础软件公司。秉承 Event Streaming 是大数据的未来基石、开源是基础软件的未来这两个理念,专注于开源生态和社区的构建,致力于前沿技术。
关注他