Production-ready stream processing with Apache Flink

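Engineers from data Artisans introduce dA Platform, the company's production-ready stream processing platform based on Apache Flink. The platform includes centralized management and deployment modules, schedules stateful Flink applications on Kubernetes, and properly maintains those applications and their state while they run.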

1. dA Platform: Production-ready stream processing with Apache Flink®
Robert Metzger, Patrick Lucas

2. WHY WE BUILD DA PLATFORM
• The need for streaming platforms
• The right tools from day one
• Operations for stateful streaming applications
© 2018 data Artisans

3. THE NEED FOR STREAMING PLATFORMS
Integration with internal infrastructure; focus on use-cases instead of infrastructure.
• Logging
• Metrics
• State storage
• Data access
• Security
• Auditing
Integrate once, scale adoption within organization. DevOps model.

4. THE RIGHT TOOLS FROM DAY ONE

5. STATEFUL STREAMING OPERATIONS
KUBERNETES AND THE WORLD OF CONTAINERS: Kubernetes offers easy to use container orchestration. But containers are stateless.
STATEFUL STREAM PROCESSING: Stateful stream processing is disrupting analytics and application development.
Managing a stateful stream processor on stateless containers is challenging.

6. STATEFUL STREAMING OPERATIONS
KUBERNETES AND THE WORLD OF CONTAINERS: Kubernetes offers easy to use container orchestration. But containers are stateless.
STATEFUL STREAM PROCESSING: Stateful stream processing is disrupting analytics and application development.

7. WHY WE BUILD DA PLATFORM
• The need for streaming platforms: Easy integration and focus on use-cases
• The right tools from day one: Get started with stream processing in no time with a turnkey solution
• Operations for stateful streaming applications: Bringing stateful streaming into the world of containers

8. HOW DOES IT WORK?

9. DECLARATIVE CONTROL OF DEPLOYMENTS
Specify your Flink streaming deployment, dA Platform takes care of configuring, deploying and operating it.
Deployment spec:
  state: running
  job: fraud-detection.jar
  parallelism: 60
  flinkConfiguration: {…}
  cpu: 8
  …
Diagram labels: Status report / Event Log; Log messages from Flink; User Interface; System Metrics; Flink; Kubernetes

10. DECLARATIVE CONTROL IN ACTION
Changing the specification of a deployment will be reflected in the cluster.
Deployment spec:
  state: cancelled
  job: fraud-detection.jar
  parallelism: 60
  flinkConfiguration: {…}
  cpu: 8
  …
In this case, the specification of a deployment changed to “cancelled”. The Flink cluster in Kubernetes will be deallocated.

11. DECLARATIVE CONTROL IN ACTION
Changing the specification of a deployment will be reflected in the cluster.
Deployment spec:
  state: running
  job: fraud-detection.jar
  parallelism: 60 -> 80
  flinkConfiguration: {…}
  cpu: 8
  …
The Flink cluster will scale up to the specified size.
Diagram labels: Scale up cluster; Kubernetes
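The declarative model on these slides can be illustrated with a small reconciliation sketch: compare the desired deployment spec with the observed cluster and derive the actions that close the gap. This is only an illustration under stated assumptions. The field names `state`, `job`, `parallelism` and `cpu` come from the slides; `ClusterStatus`, `plan_actions` and the action strings are hypothetical and do not describe how dA Platform is implemented.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DeploymentSpec:
    """Desired state, using the field names shown on the slides."""
    state: str        # "running" or "cancelled"
    job: str          # e.g. "fraud-detection.jar"
    parallelism: int
    cpu: int


@dataclass(frozen=True)
class ClusterStatus:
    """Observed state of the Flink cluster (hypothetical shape)."""
    allocated: bool
    parallelism: int = 0


def plan_actions(spec: DeploymentSpec, status: ClusterStatus) -> List[str]:
    """Return the actions needed to move the cluster toward the spec."""
    if spec.state == "cancelled":
        # A cancelled deployment should hold no cluster resources.
        return ["deallocate cluster"] if status.allocated else []
    if not status.allocated:
        return [f"deploy {spec.job} with parallelism {spec.parallelism}"]
    if status.parallelism < spec.parallelism:
        return [f"scale up from {status.parallelism} to {spec.parallelism}"]
    if status.parallelism > spec.parallelism:
        return [f"scale down from {status.parallelism} to {spec.parallelism}"]
    return []  # Cluster already matches the spec.


if __name__ == "__main__":
    observed = ClusterStatus(allocated=True, parallelism=60)
    desired = DeploymentSpec("running", "fraud-detection.jar", 80, 8)
    print(plan_actions(desired, observed))
```

The user only ever edits the spec; which actions follow is derived from the difference between the spec and the observed cluster.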

12. MANAGING STREAMING APPS AND STATE
dA Platform allows you to perform stateful deployment changes.
Deployment changes:
• Upgrading the Flink job
• Upgrading Flink
• Changing the parallelism
• Changing resource allocations
• Changing a configuration parameter
• Suspending / resuming a deployment
• …
Diagram labels: Scale up cluster; Migrate state; Kubernetes
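One plausible way to picture a stateful deployment change is the savepoint-based sequence: take a savepoint, cancel the running job, then start the changed job from that savepoint. The sketch below simulates this against an in-memory stand-in. `FakeFlinkCluster`, `stateful_change` and the jar name `fraud-detection-v2.jar` are hypothetical illustrations, not a real Flink client and not dA Platform code.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class FakeFlinkCluster:
    """In-memory stand-in for a Flink cluster (hypothetical, not a real client)."""
    running_job: Optional[str] = None
    parallelism: int = 0
    state: Dict[str, int] = field(default_factory=dict)
    log: List[str] = field(default_factory=list)

    def trigger_savepoint(self) -> Dict[str, int]:
        self.log.append("savepoint")
        return dict(self.state)  # Copy, so the snapshot survives the cancel.

    def cancel(self) -> None:
        self.log.append(f"cancel {self.running_job}")
        self.running_job = None
        self.parallelism = 0
        self.state = {}

    def start(self, job: str, parallelism: int, savepoint: Dict[str, int]) -> None:
        self.log.append(f"start {job}")
        self.running_job = job
        self.parallelism = parallelism
        self.state = dict(savepoint)  # Restore state from the savepoint.


def stateful_change(cluster: FakeFlinkCluster, job: str, parallelism: int) -> None:
    """Apply a deployment change without losing application state."""
    savepoint = cluster.trigger_savepoint()
    cluster.cancel()
    cluster.start(job, parallelism, savepoint)


if __name__ == "__main__":
    cluster = FakeFlinkCluster("fraud-detection.jar", 60, {"events_seen": 42})
    stateful_change(cluster, "fraud-detection-v2.jar", 80)
    print(cluster.running_job, cluster.parallelism, cluster.state)
```

The same three steps cover several of the listed changes in this simplified model: a job upgrade changes `job`, a rescale changes `parallelism`, and suspend / resume splits the sequence in two.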

13. VERSIONED APPLICATIONS, NOT JOBS/JARS
STREAM PROCESSING APPLICATION: Version 1, Version 2, Version 3 (each step an upgrade)
NEW APPLICATION: Version 2a (fork / duplicate), Version 3a (upgrade)
Code and Application State

14. IMPLEMENTATION

15. ARCHITECTURE
Use cases: Real-time Analytics; Anomaly- & Fraud Detection; Real-time Data Integration; Reactive Microservices
• dA Application Manager: Application lifecycle management
• Apache Flink: Stateful stream processing
• Kubernetes: Container platform
• Logging
• Metrics

16. ARCHITECTURE
• All components of dA Platform are shipped as Docker containers.
• We provide an installer for setting up the platform on Kubernetes.
• The containers for Flink follow the same versioning scheme as the Apache Flink releases.
• The dAP distribution of Flink has minor patches and is compatible with the open source releases.
• Metrics and logging components are provided for demonstration purposes. We recommend integration with existing systems.

17. APPLICATION MANAGER
Application Manager is the central orchestration and lifecycle management component of dA Platform.
Diagram labels: CI/CD Integration; REST API; Web interface; Application Manager; Job Control; Resource Allocation; Resource Management

18. APPLICATION MANAGER
• Deployment of Flink clusters on Kubernetes: Generate configuration for metrics, logging, HA, state backups, security, …
• Control Flink jobs: Trigger savepoints, cancel jobs, start jobs with savepoints, check Flink health, …
• Keep track of Flink jobs, configuration parameters, savepoints, events: Stored on a persistent volume
• Declarative REST API: For integration with existing systems such as CI/CD pipelines
• Rich, interactive user interface: Based on the main REST API

19. HOOKS FOR CI/CD PIPELINES
Diagram flow: Push; Trigger CI (CI Service); Call Update API (dA Application Manager); Stateful Upgrade from Application Version 1 to Application Version 2 (Kubernetes Cluster)
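The "Call Update API" step of a CI hook could look roughly like the sketch below, which only builds the HTTP request and does not send it. The URL layout (`/deployments/<id>`), the `PATCH` method, the JSON payload and the host name are assumptions made for illustration; the slides do not show the actual Application Manager REST API.

```python
import json
from urllib.request import Request


def build_update_request(base_url: str, deployment_id: str, new_job: str) -> Request:
    """Build (but do not send) a request that updates a deployment's job.

    The endpoint path and payload shape are hypothetical.
    """
    body = json.dumps({"job": new_job}).encode("utf-8")
    return Request(
        url=f"{base_url}/deployments/{deployment_id}",
        data=body,
        method="PATCH",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    # Hypothetical values that a CI job would supply after a successful build.
    request = build_update_request(
        "http://appmanager.example", "fraud-detection", "fraud-detection-v2.jar"
    )
    print(request.get_method(), request.full_url)
    # Sending is left out on purpose; in a real pipeline it would be
    # urllib.request.urlopen(request) against the real endpoint.
```

Because the API is declarative, the CI job only states the new desired job artifact; the stateful upgrade itself is left to the Application Manager.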

20. DEMO

21. WRAP-UP

22. AVAILABILITY
• dA Platform was announced at Flink Forward Berlin 2017, with a closed beta program
  ‒ 190 beta signups, successful collaboration with a large number of beta users
• As of March 2018, dA Platform is generally available
• Trial versions are available for download: data-artisans.com/download
  ‒ Recommendation: Download the trial virtual machine!
• dA Platform license includes Enterprise Support for Apache Flink

23. ROADMAP
• Improved Access Control: Introduce ownership for resources (deployments, savepoints, jobs), control access based on ownership.
• Metadata management: Awareness for data sources such as Kafka topics, files in S3 or HDFS.
• SQL: SQL will be a first class citizen of dA Platform with interfaces for interactive queries and integration with existing SQL tools.
• Multi-Datacenter fail-over: Run dA Platform in redundant data centers, automatically moving the processing to a standby data center in case of a failure.

24. SUMMARY
dA Platform is a turnkey solution for stateful stream processing with Apache Flink. It solves common challenges such as job upgrades, configuration changes or state migrations.
• dA Application Manager: Application lifecycle management
• Apache Flink: Stateful stream processing
• Kubernetes: Container platform
• Logging
• Metrics

25. Q&A

26. STAY IN TOUCH
data-artisans.com
data-artisans.com/blog
@dataArtisans
data-artisans
data Artisans