What turns stream processing from a tool into a platform?
1. Stream Processing: From Applications to Platforms - Stephan Ewen, Co-Founder & CTO
2. A platform makes building new applications simple by taking care of the common and repeatable parts.
3. Internal streaming data platforms built with Apache Flink
4. Observation 1: Stream processing is about building applications
5. Batch / Data Lake Architecture, a.k.a. collect now, figure out later
6. Streaming / Data-driven Applications: build applications directly on data streams
7. Observation 2: Stream processing changes the database-centric architecture
8. Recall last Flink Forward…
   • Classic tiered architecture: compute layer on top of a database layer
   • Streaming architecture: compute + application state on top of stream storage and snapshot storage (backup)
   • Diagram label: application working state + historic state
9. Changing the Two Tier Architecture
   • Classic tiered architecture: reads/writes across tier boundary
   • Streaming architecture: all modifications are local; asynchronous writes of large blobs
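The contrast on slide 9 can be illustrated with a minimal sketch. It is not Flink code: `CountingOperator` and `snapshot_store` are hypothetical names, and the snapshot is taken synchronously here for simplicity, whereas the slide describes asynchronous writes. The point is only that every update touches local state, and durability comes from occasionally writing the whole state out as one blob.

```python
import copy


class CountingOperator:
    """Hypothetical operator that keeps per-key counts in local memory."""

    def __init__(self):
        self.state = {}

    def process(self, key):
        # Local read and write: no call across a tier boundary per event.
        self.state[key] = self.state.get(key, 0) + 1
        return self.state[key]

    def snapshot(self):
        # Copy the whole state as one blob for the snapshot store.
        return copy.deepcopy(self.state)

    def restore(self, blob):
        self.state = copy.deepcopy(blob)


snapshot_store = []  # stand-in for durable snapshot storage (assumption)

op = CountingOperator()
for key in ["a", "b", "a"]:
    op.process(key)
snapshot_store.append(op.snapshot())

op.process("a")  # happens after the snapshot, so it is not in the blob

# After a failure, a fresh operator resumes from the latest snapshot.
recovered = CountingOperator()
recovered.restore(snapshot_store[-1])
```

In a real system the event processed after the snapshot would be replayed from stream storage on recovery; that replay is left out of the sketch.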
10. Application Platforms
11. Application Platforms: Logging, Metrics, Resource Manager, CI / CD
12. Kubernetes: deploying new applications, scaling applications
13. Kubernetes & Stateful Applications (diagram labels: Database, Kubernetes)
14. What about stateful containers?
   • Example: Scaling down a replicated database
   • 3 replicas, 4-node scale-down: need to move or reorganize data before container shutdown
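To make the scale-down example on slide 14 concrete, here is a small sketch of the planning step that has to happen before a container can be stopped. The function `plan_scale_down`, the node names, and the partition layout are all assumptions for illustration; the slide only states that data must be moved or reorganized first.

```python
def plan_scale_down(placement, leaving):
    """Return the (partition, target_node) copies needed before `leaving` stops.

    placement maps each partition to the set of nodes holding a replica.
    """
    all_nodes = {node for nodes in placement.values() for node in nodes}
    remaining = sorted(all_nodes - {leaving})
    moves = []
    for partition, nodes in sorted(placement.items()):
        if leaving not in nodes:
            continue  # this partition keeps all of its replicas
        candidates = [node for node in remaining if node not in nodes]
        if not candidates:
            raise ValueError(f"no node left to host a replica of {partition}")
        moves.append((partition, candidates[0]))
    return moves


# Hypothetical layout: 4 nodes, every partition replicated 3 times.
placement = {
    "p0": {"n1", "n2", "n3"},
    "p1": {"n2", "n3", "n4"},
    "p2": {"n1", "n3", "n4"},
    "p3": {"n1", "n2", "n4"},
}

moves = plan_scale_down(placement, leaving="n4")
```

Only after these copies complete is it safe to shut down `n4`, which is the kind of coordination a stateless container orchestrator does not perform by itself.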
15. Stateful Questions
   • consistent stateful upgrades: application evolution and bug fixes
   • migration of application state: cluster migration, A/B testing
   • re-processing and reinstatement: fix corrupt results, bootstrap new applications
   • state evolution (schema evolution)
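The last question on slide 15, state evolution, can be sketched as a chain of migration functions applied when an older snapshot is restored into newer code. This is only one possible approach, not how Flink handles schema evolution; the snapshot layout (`version`, `state`), the `MIGRATIONS` table, and the field names are assumptions.

```python
def migrate_v1_to_v2(state):
    # Assumed change: v1 stored a bare count per key,
    # v2 stores a record with the count and a last_seen field.
    return {key: {"count": count, "last_seen": None} for key, count in state.items()}


# Maps a schema version to the function that upgrades it by one step.
MIGRATIONS = {1: migrate_v1_to_v2}


def restore(snapshot, target_version):
    """Upgrade a snapshot step by step until it matches the target schema."""
    version, state = snapshot["version"], snapshot["state"]
    while version < target_version:
        state = MIGRATIONS[version](state)
        version += 1
    return {"version": version, "state": state}


old_snapshot = {"version": 1, "state": {"a": 2, "b": 1}}
upgraded = restore(old_snapshot, target_version=2)
```

Restoring a snapshot that is already at the target version returns its state unchanged, so the same entry point serves both plain restarts and upgrades.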
16. dA Platform: Container-based platform for stateful data-driven applications
   • Kubernetes: Container-based Resource Orchestration
   • Apache Flink: Stateful Stream Processing & Snapshots
   • Application Manager: Code, Resource, Config, and Snapshot Management
17. Web interface and CI/CD on top of the App Manager (Job Control, Snapshot Management, Resource Allocation), which runs on Kubernetes and Storage
18. Versioned Applications, not Jobs/Jars
   • Stream Processing Application: Version 1, Version 2, Version 3 (upgrade)
   • New Application: Version 2a, Version 3a (fork / duplicate, then upgrade)
   • Each version: Code and Application Snapshot
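One way to picture slide 18 is a version record that pairs code with an application snapshot and remembers its parent. The sketch below is hypothetical: `AppVersion`, `upgrade`, and `fork` are illustrative names, not the dA Platform API, and the code and snapshot identifiers are made-up strings.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AppVersion:
    app: str                 # application name
    label: str               # e.g. "Version 2"
    code: str                # identifier of the code artifact (assumed)
    snapshot: Optional[str]  # identifier of the application snapshot (assumed)
    parent: Optional["AppVersion"] = None


def upgrade(prev, label, code, snapshot):
    """New version of the same application: new code, resumed from a snapshot."""
    return AppVersion(prev.app, label, code, snapshot, parent=prev)


def fork(prev, new_app, label):
    """Duplicate a version into a new application with the same code and snapshot."""
    return AppVersion(new_app, label, prev.code, prev.snapshot, parent=prev)


def lineage(version):
    """Walk back through parents and list (app, label) pairs, oldest first."""
    chain = []
    while version is not None:
        chain.append((version.app, version.label))
        version = version.parent
    return list(reversed(chain))


v1 = AppVersion("app", "Version 1", code="code-1", snapshot=None)
v2 = upgrade(v1, "Version 2", code="code-2", snapshot="snap-1")
v3 = upgrade(v2, "Version 3", code="code-3", snapshot="snap-2")
v2a = fork(v2, "new-app", "Version 2a")
v3a = upgrade(v2a, "Version 3a", code="code-3a", snapshot="snap-2a")
```

Treating the pair of code and snapshot as the unit of versioning is what lets a fork start from existing state instead of from an empty application.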
19. Architecture
   • dA Application Manager: Application lifecycle management
   • Apache Flink: Stateful stream processing
   • Kubernetes: Container platform
   • Logging, Metrics
20. What could the future of a Streaming Data Platform look like?
21. The Usual Suspects
   • Role-based access control
   • Metadata management
   • Cross Datacenter Failover / Disaster Recovery
22. Support for Batch Processing: Everything is a stream. Finite applications as a special case.
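The idea on slide 22 can be shown with plain Python iterators, as a rough analogy rather than Flink code: the same operator runs unchanged over a finite input (the batch case) and over an input that never ends. The function name `running_sum` and the sample data are assumptions.

```python
import itertools


def running_sum(events):
    """Emit the running total after every event; works on any iterable."""
    total = 0
    for value in events:
        total += value
        yield total


# Batch as a special case: the stream happens to end, so a final result exists.
bounded = [1, 2, 3]
batch_result = list(running_sum(bounded))[-1]

# Unbounded stream: there is no final result, only results so far.
unbounded = itertools.count(1)
first_three = list(itertools.islice(running_sum(unbounded), 3))
```

The operator itself never needs to know whether its input is finite; only the caller decides whether to wait for the end.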
23. Periodic Bursty Stream Processing: bursty event stream over time (events only at end-of-day), backed by a Checkpoint / Savepoint Store
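Slide 23 shows only a bursty stream and a checkpoint / savepoint store, so the following is one interpretation rather than the talk's stated design: the application runs only while a burst arrives, resumes from the latest savepoint, and writes a new savepoint before it stops. `run_burst`, `savepoint_store`, and the event shape are hypothetical.

```python
def run_burst(savepoint_store, events):
    """Resume from the latest savepoint, process one burst, save, and stop."""
    state = dict(savepoint_store[-1]) if savepoint_store else {}
    for key, amount in events:
        state[key] = state.get(key, 0) + amount
    # Persist the state so no resources are needed until the next burst.
    savepoint_store.append(dict(state))
    return state


savepoint_store = []  # stand-in for durable savepoint storage (assumption)

# End of day 1: a burst of events arrives and the application runs briefly.
run_burst(savepoint_store, [("a", 1), ("b", 2)])

# End of day 2: the application resumes where it left off.
day2_state = run_burst(savepoint_store, [("a", 3)])
```

Between bursts nothing is running in this sketch; the savepoint store is the only thing that carries state from one run to the next.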
24. Support a Broad Developer Audience … Streaming Data Platform
25. Use Case Vertical Libraries: Machine Learning, SQL, CEP, … on top of the Streaming Data Platform
26. dA Platform is a turnkey solution for stateful stream processing with Apache Flink.
   • dA Application Manager: Application lifecycle management
   • Apache Flink: Stateful stream processing
   • Kubernetes: Container platform
   • Logging, Metrics