Flink at Ele.me
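1 .Flink at Ele.me: Applications and Practice. Yi Weiping, Architect
2 .CONTENTS 1. Platform Status 2. Use Cases 3. Future Plans
3 .01 Platform Status
4 .Platform Status: Architecture. Diagram components: Redis, Flume, Hangout, Producer, FileBeat, UBT, MySQL, Broker, Kafka, DRC, Producer, Binlog, Broker, MaxQ, Application, Producer, log, Broker, ES, InfluxDB
5 .Platform Status: Data. Data volume: 60 TB/day. Computation: 1,000,000,000. Cluster: 400 nodes
6 .02 Use Cases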
7 .Use Cases: Consistency Semantics • Key concepts: at-most-once (fire and forget); at-least-once (resend mechanism); exactly-once (checkpoint granularity control). at-least-once + idempotent = exactly-once
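The equation on this slide (at-least-once + idempotent = exactly-once) can be illustrated with a small simulation. This is a minimal sketch, not code from the talk: the sink classes, the `deliver_at_least_once` helper, and the 50% lost-acknowledgement rate are all hypothetical names and values chosen for illustration.

```python
import random


class AppendSink:
    """Non-idempotent sink (hypothetical): every write appends a new row."""

    def __init__(self):
        self.rows = []

    def write(self, record_id, value):
        self.rows.append((record_id, value))


class IdempotentSink:
    """Idempotent sink (hypothetical): writes are keyed by a unique record ID,
    so a replayed record overwrites itself instead of creating a duplicate."""

    def __init__(self):
        self.rows = {}

    def write(self, record_id, value):
        self.rows[record_id] = value


def deliver_at_least_once(records, sink, rng):
    """Resend each record until it is acknowledged.

    The write always succeeds here, but the acknowledgement is lost with
    probability 0.5 (an assumed value), which forces a resend.
    """
    for record_id, value in records:
        while True:
            sink.write(record_id, value)
            ack_lost = rng.random() < 0.5
            if not ack_lost:
                break


records = [(i, f"order-{i}") for i in range(100)]

append_sink = AppendSink()
idempotent_sink = IdempotentSink()

# Same seed for both runs, so both sinks see the same resend pattern.
deliver_at_least_once(records, append_sink, random.Random(42))
deliver_at_least_once(records, idempotent_sink, random.Random(42))

print("append sink rows:    ", len(append_sink.rows))
print("idempotent sink rows:", len(idempotent_sink.rows))
```

With resends enabled, the append-only sink accumulates duplicates, while the keyed sink ends with exactly one row per record, which is the effect the slide summarizes.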
8 .Use Cases: STORM 1. Tuple-Based 2. Millisecond-level latency 3. Java 4. Typhon + Flux
9 .Use Cases: STORM Cons 1. Usability (SQL & DSL) 2. StateBackend 3. Resource allocation 4. Throughput
10 .Use Cases: SPARK STREAMING 1. Micro-batch 2. Second-level latency 3. Java/Scala 4. Streaming SQL
11 .Use Cases: SPARK STREAMING Pros 1. Spark Ecosystem & Spark SQL 2. Checkpoint on HDFS 3. On YARN 4. High throughput
12 .Use Cases: SPARK STREAMING
13 .Use Cases: SPARK STREAMING
14 .Use Cases: MULTI-STREAM JOIN Spark 1.5.x https://github.com/Intel-bigdata/spark-streamingsql
15 .Use Cases: MULTI-STREAM JOIN Tricky Way: Topic A + Topic B → RDD → Split By Conditions → DF → Table A, Table B → Join
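The "tricky way" on this slide appears to consume both topics as one stream, split each micro-batch by a condition into two tables, and join them. A minimal sketch of that idea in plain Python follows, assuming the split condition is the source topic and the join is an inner join on a shared key. The field names (`topic`, `order_id`, `user`, `amount`) and both helper functions are hypothetical, and lists of dicts stand in for the RDD and DataFrames.

```python
def split_by_condition(batch):
    """Split one mixed micro-batch into two 'tables' by source topic."""
    table_a = [m["value"] for m in batch if m["topic"] == "topic_a"]
    table_b = [m["value"] for m in batch if m["topic"] == "topic_b"]
    return table_a, table_b


def inner_join(table_a, table_b, key):
    """Hash join: index table_b by key, then probe it with rows of table_a."""
    index = {}
    for row in table_b:
        index.setdefault(row[key], []).append(row)
    return [{**a, **b} for a in table_a for b in index.get(a[key], [])]


# One micro-batch containing messages from both topics (made-up data).
batch = [
    {"topic": "topic_a", "value": {"order_id": 1, "user": "u1"}},
    {"topic": "topic_b", "value": {"order_id": 1, "amount": 30}},
    {"topic": "topic_a", "value": {"order_id": 2, "user": "u2"}},
    {"topic": "topic_b", "value": {"order_id": 3, "amount": 12}},
]

table_a, table_b = split_by_condition(batch)
joined = inner_join(table_a, table_b, "order_id")
print(joined)
```

In this sketch, rows only match when both sides arrive in the same micro-batch: orders 2 and 3 find no partner here.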
16 .Use Cases: MULTI-STREAM JOIN
17 .Use Cases: EXACTLY-ONCE (Driver, Executor) 1. Read the latest offsets for each Kafka partition from ZooKeeper 2. Create the Kafka stream 3. Consume data 4. Upsert By Unique Key 5. Commit Offsets In Transaction
18 .Use Cases: SPARK STREAMING Cons 1. Stateful Processing SQL (<2.x: mapWithState, updateStateByKey) 2. Real Multi-Stream Join 3. End-To-End Exactly-Once Semantics
19 .Use Cases: STRUCTURED STREAMING
20 .Use Cases: STRUCTURED STREAMING
21 .Use Cases: STRUCTURED STREAMING Pros 1. Stateful Processing SQL & DSL 2. Real Multi-Stream Join 3. Easy to Ensure End-To-End Exactly-Once Semantics
22 .Use Cases: STRUCTURED STREAMING
SQL:
    SELECT action, WINDOW(time, "10 minutes"), COUNT(*)
    FROM events
    GROUP BY action, WINDOW(time, "10 minutes")
DSL:
    val windowedCounts = actions
      .withWatermark("time", "10 minutes")
      .groupBy($"action", window($"time", "10 minutes", "5 minutes"))
      .count()
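To make the window and watermark parameters on this slide concrete, here is a simplified pure-Python model of a sliding-window count with a 10-minute window, a 5-minute slide, and a 10-minute watermark delay. It is a sketch under assumptions, not Spark's actual algorithm: timestamps are plain seconds, the watermark advances after every record rather than per trigger, and the function names are hypothetical.

```python
from collections import defaultdict

WINDOW = 600  # window length: 10 minutes, in seconds
SLIDE = 300   # slide interval: 5 minutes
DELAY = 600   # watermark delay: 10 minutes


def windows_for(ts):
    """Start times of every sliding window [start, start + WINDOW) containing ts."""
    last_start = ts - ts % SLIDE
    first_start = last_start - WINDOW + SLIDE
    return list(range(first_start, last_start + 1, SLIDE))


def windowed_counts(events):
    """Count events per (action, window start), dropping data behind the watermark.

    Simplification (an assumption of this sketch): the watermark is
    max event time seen so far minus DELAY, updated after each record.
    """
    counts = defaultdict(int)
    max_event_time = 0
    for ts, action in events:
        watermark = max_event_time - DELAY
        for start in windows_for(ts):
            if start + WINDOW > watermark:  # window is still open
                counts[(action, start)] += 1
        max_event_time = max(max_event_time, ts)
    return dict(counts)


events = [
    (700, "click"),
    (1000, "click"),
    (2600, "click"),
    (950, "click"),  # arrives late: its windows closed once the watermark hit 2000
]
counts = windowed_counts(events)
for (action, start), n in sorted(counts.items(), key=lambda kv: kv[0][1]):
    print(action, f"[{start}, {start + WINDOW})", n)
```

Each event falls into two overlapping windows because the slide is half the window length, and the last event is discarded because every window it belongs to ended before the watermark.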
23 .Use Cases: STRUCTURED STREAMING Cons 1. Trigger (Processing Time, Continuous) 2. Continuous Processing (Only Map-Like Operations) 3. Low End-To-End Latency With Exactly-Once Guarantees 4. CEP (Drools)
24 .Use Cases: FLINK
25 .Use Cases: FLINK
26 .Use Cases: FLINK Binary Data Operator
27 .Use Cases: FLINK Task & Operator Chain
28 .Use Cases: FLINK Parallelism Scale-out
29 .Use Cases: FLINK State & Checkpoint