Based on the current state of our platform, we compared and analyzed how each area is being used on it, and we chose to embrace Flink.


1. Flink at Ele.me (饿了么): Applications and Practice. Yi Weiping (易伟平), Architect

2. CONTENTS: 1 Current State of the Platform, 2 Application Scenarios, 3 Future Plans

3. 01 Current State of the Platform

4. Current platform architecture (diagram labels): Redis, Flume, Hangout, Producer, FileBeat, UBT, MySQL, Broker, Kafka, DRC, Producer, Binlog, Broker, MaxQ, Application, Producer, log, Broker, ES, InfluxDB

5. Current platform figures: data volume 60 TB/day; computation volume 1,000,000,000; cluster 400 nodes

6. 02 Application Scenarios

7. Application scenarios: consistency semantics. Key concepts: at-most-once (fire and forget); at-least-once (retransmission mechanism); exactly-once (checkpoint granularity control). at-least-once + idempotent = exactly-once
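The relation "at-least-once + idempotent = exactly-once" on slide 7 can be illustrated with a small simulation. This is only a sketch: the sender, the `lost_acks` set, and the two sink classes are hypothetical names invented for this example, not part of any engine discussed in the deck.

```python
from typing import Callable, List, Set, Tuple

Message = Tuple[str, int]  # (unique key, value); the key is an assumption of this sketch


def deliver_at_least_once(messages: List[Message],
                          sink: Callable[[Message], None],
                          lost_acks: Set[int]) -> int:
    """Send every message; if its ack is 'lost', send it again. Returns the delivery count."""
    deliveries = 0
    for index, msg in enumerate(messages):
        sink(msg)
        deliveries += 1
        if index in lost_acks:  # the sender never saw the ack, so it retries
            sink(msg)
            deliveries += 1
    return deliveries


class AppendSink:
    """Non-idempotent sink: every delivery adds to the total."""

    def __init__(self) -> None:
        self.total = 0

    def __call__(self, msg: Message) -> None:
        self.total += msg[1]


class UpsertSink:
    """Idempotent sink: writing the same key twice leaves the same state."""

    def __init__(self) -> None:
        self.rows = {}

    def __call__(self, msg: Message) -> None:
        self.rows[msg[0]] = msg[1]

    @property
    def total(self) -> int:
        return sum(self.rows.values())


messages = [("order-1", 10), ("order-2", 20), ("order-3", 30)]
append_sink, upsert_sink = AppendSink(), UpsertSink()
deliver_at_least_once(messages, append_sink, lost_acks={1})
deliver_at_least_once(messages, upsert_sink, lost_acks={1})
print(append_sink.total)  # 80: the redelivered message is counted twice
print(upsert_sink.total)  # 60: the redelivery overwrites the same key
```

With the same duplicated delivery, only the idempotent sink ends in the state that a single delivery of each message would have produced.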

8. Application scenarios: STORM. 1. Tuple-based 2. Millisecond-level latency 3. Java 4. Typhon + Flux

9. Application scenarios: STORM Cons. 1. Ease of use (SQL & DSL) 2. StateBackend 3. Resource allocation 4. Throughput

10. Application scenarios: SPARK STREAMING. 1. Micro-batch 2. Second-level latency 3. Java/Scala 4. Streaming SQL

11. Application scenarios: SPARK STREAMING Pros. 1. Spark ecosystem & Spark SQL 2. Checkpoint on HDFS 3. On YARN 4. High throughput

12. Application scenarios: SPARK STREAMING

13. Application scenarios: SPARK STREAMING

14. Application scenarios: MULTI-STREAM JOIN. Spark 1.5.x https://github.com/Intel-bigdata/spark-streamingsql

15. Application scenarios: MULTI-STREAM JOIN, Tricky Way (diagram labels): Table A, Topic A, Topic B, Split By Conditions, Join, RDD, DF, Table B

16. Application scenarios: MULTI-STREAM JOIN

17. Application scenarios: EXACTLY-ONCE. 1. Read the latest offsets for each Kafka partition from zk 2. Create the Kafka stream 3. Consume data 4. Upsert By Unique Key 5. Commit Offsets In Transaction (diagram labels: Driver, Executor)

18. Application scenarios: SPARK STREAMING Cons. 1. Stateful Processing SQL (<2.x: mapWithState, updateStateByKey) 2. Real Multi-Stream Join 3. End-To-End Exactly-Once Semantics

19. Application scenarios: STRUCTURED STREAMING

20. Application scenarios: STRUCTURED STREAMING

21. Application scenarios: STRUCTURED STREAMING Pros. 1. Stateful Processing SQL & DSL 2. Real Multi-Stream Join 3. Easy to Ensure End-To-End Exactly-Once Semantics

22. Application scenarios: STRUCTURED STREAMING
    SQL:
        SELECT action, WINDOW(time, "10 minutes"), COUNT(*)
        FROM events
        GROUP BY action, WINDOW(time, "10 minutes")
    Scala:
        val windowedCounts = actions
          .withWatermark("time", "10 minutes")
          .groupBy(
            $"action",
            window($"time", "10 minutes", "5 minutes")
          )
          .count()
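To make the window and watermark parameters on slide 22 concrete, here is a simplified plain-Python model of a 10-minute window sliding every 5 minutes with a 10-minute watermark. It is a sketch under stated assumptions, not Spark's implementation: the watermark here advances on every event and an event is dropped as soon as its timestamp falls behind it, whereas Spark's own watermark handling differs in detail. All names (`windows_for`, `windowed_counts`, the sample events) are hypothetical.

```python
from collections import Counter

WINDOW = 10 * 60  # window length in seconds
SLIDE = 5 * 60    # slide interval in seconds
DELAY = 10 * 60   # watermark delay in seconds


def windows_for(ts):
    """Start times of all sliding windows [start, start + WINDOW) that contain ts."""
    starts = []
    start = ts - ts % SLIDE  # latest window start at or before ts
    while start > ts - WINDOW:
        starts.append(start)
        start -= SLIDE
    return sorted(starts)


def windowed_counts(events):
    """Count events per (action, window start), dropping events behind the watermark."""
    counts = Counter()
    dropped = []
    max_seen = None
    for ts, action in events:  # events are processed in arrival order
        max_seen = ts if max_seen is None else max(max_seen, ts)
        watermark = max_seen - DELAY
        if ts < watermark:  # too late: simplified drop rule (assumption)
            dropped.append((ts, action))
            continue
        for start in windows_for(ts):
            counts[(action, start)] += 1
    return counts, dropped


# (event time in seconds, action); the fifth event arrives out of order and too late
events = [
    (3600, "click"),
    (3660, "click"),
    (3900, "view"),
    (4500, "click"),
    (3700, "click"),
    (4200, "view"),
]
counts, dropped = windowed_counts(events)
for (action, start), n in sorted(counts.items()):
    print(action, start, start + WINDOW, n)
```

Each event lands in two overlapping windows because the window length is twice the slide interval. The event at 3700 arrives after the watermark has reached 3900 and is dropped, while the out-of-order event at 4200 is still within the allowed delay and is counted.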

23. Application scenarios: STRUCTURED STREAMING Cons. 1. Trigger (Processing Time, Continuous) 2. Continuous Processing (Only Map-Like Operations) 3. Low End-To-End Latency With Exactly-Once Guarantees 4. CEP (Drools)

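24. Application scenarios: FLINK

25. Application scenarios: FLINK

26. Application scenarios: FLINK. Binary Data Operator

27. Application scenarios: FLINK. Task & Operator Chain

28. Application scenarios: FLINK. Parallelism Scale-out

29. Application scenarios: FLINK. State & Checkpoint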

Apache Flink China community: dedicated to promoting and spreading Flink technology in China.
