Scaling Uber’s Realtime Optimization with Apache Flink

Uber engineers leverage Apache Flink to build a platform that not only runs compute-intensive optimization models but also reacts very quickly to rapid changes in the marketplace. In this talk, I will cover the compute platform that leverages Apache Flink to (i) aggregate billions of real-time and forecasted demand- and supply-level data points across the globe, (ii) trigger on-demand optimization models to respond to changes in the marketplace, and (iii) scale both horizontally and vertically as we expand the platform to onboard new applications and experiences.

1. Scaling Uber’s Real-time Optimization with Flink. Xingzhong Xu, Engineer | Uber Marketplace | xxu@uber.com | Apr 10, 2018

2. Uber's mission is to bring transportation for everyone, everywhere. Uber 2018 | Prepare for Flink Forward 2018 | Version 1.0

3. Agenda ● Uber Marketplace ● Geo/temporal event aggregation ● Online model update ● Streaming application

4. Uber Marketplace

5. Uber marketplace: dynamic logistics network and decision engines at your fingertips.

6. Marketplace dynamics ● Supply ● Demand ● Forecast ● Trips ● Traffic

7. Marketplace decision engines ● Dispatch ● Pricing ● Driver Positioning ● Promotions

8. Uber Marketplace

9. Geo-temporal event aggregation

10. Events in the physical world drive marketplace dynamics. Every second, the marketplace ingests millions of real-time, real-world events.

11. Real-time challenges ● Event time ordering ● Time sensitive

12. Aggregate in real-time ● Aggregation in window buckets ● Results over event time ● As soon as possible ● As accurate as possible
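
The idea of bucketing by event time can be illustrated with a small plain-Python sketch. It does not use Flink and is not taken from the talk: the 60-second tumbling window, the `(event_time, cell_id)` event shape, and the function names are all assumptions made for illustration. It shows that when buckets are keyed by the event's own timestamp, late or out-of-order arrival does not change the result.

```python
from collections import defaultdict

WINDOW_SECONDS = 60  # assumed window size; the slides do not give one


def window_start(event_time: int, size: int = WINDOW_SECONDS) -> int:
    """Return the start of the tumbling window that contains event_time."""
    return event_time - (event_time % size)


def aggregate_by_event_time(events, size: int = WINDOW_SECONDS):
    """Count events per (cell, window) using each event's own timestamp.

    `events` is an iterable of (event_time_seconds, cell_id) pairs and may
    arrive in any order; arrival order does not affect the result.
    """
    counts = defaultdict(int)
    for event_time, cell_id in events:
        counts[(cell_id, window_start(event_time, size))] += 1
    return dict(counts)


# Events arrive out of order: the t=59 event shows up after the t=61 event.
arrivals = [(5, "cell_a"), (61, "cell_a"), (59, "cell_a"), (130, "cell_b")]
result = aggregate_by_event_time(arrivals)
```

A real streaming job additionally has to decide when a window is complete enough to emit, which is the tension between "as soon as possible" and "as accurate as possible" on this slide; this sketch sidesteps that by aggregating a finished list.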

13. Real-world challenges ● Event spatial mapping ● Locality sensitive

14. Aggregation in the real world ● An event influences its current location and its neighbours ● Apply a geo function on related events
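
As a rough illustration of an event influencing both its own location and its neighbours, the sketch below uses a square grid with integer `(x, y)` cells. The grid shape, the weights (1.0 for the event's own cell, 0.5 for each adjacent cell), and the function names are assumptions for this example only; the slides do not specify the spatial index or the geo function actually used.

```python
def neighbours(cell, radius=1):
    """Return the cells within `radius` steps of `cell` on a square grid,
    excluding the cell itself."""
    x, y = cell
    return [
        (x + dx, y + dy)
        for dx in range(-radius, radius + 1)
        for dy in range(-radius, radius + 1)
        if (dx, dy) != (0, 0)
    ]


def geo_contributions(cell, self_weight=1.0, neighbour_weight=0.5):
    """Hypothetical geo function: map one event location to the weight it
    contributes to its own cell and to each neighbouring cell."""
    contributions = {cell: self_weight}
    for neighbour in neighbours(cell):
        contributions[neighbour] = neighbour_weight
    return contributions


weights = geo_contributions((2, 3))
```

Any function of this shape, one event in and a set of weighted cells out, could stand in for the "geo func" named on the slide.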

15. How to aggregate geo-related events in real-time?

16. Online analytical processing (OLAP)

17. OLAP solution

18. OLAP solution: periodic crontab

19. OLAP solution: periodic crontab, batch snapshot

20. Event-driven solution

21. Event-based solution (flatmap): geo fanout first

22. Event-based solution (reduce): event-time aggregation later
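
The two steps named on the last two slides, fan out by geography first and aggregate by event time afterwards, can be sketched in plain Python without Flink. Everything specific here is an assumption for illustration: the square grid, the 3x3 fanout, the 60-second window, and the names `fan_out` and `reduce_by_key`. In a Flink job these would roughly correspond to a flatMap followed by a keyed, windowed reduce.

```python
from collections import defaultdict
from itertools import chain

WINDOW_SECONDS = 60  # assumed tumbling window size


def fan_out(event):
    """flatMap step: emit one keyed record per cell the event influences.

    `event` is (event_time_seconds, (x, y), value). The key combines the
    target cell with the event-time window the event belongs to.
    """
    event_time, (x, y), value = event
    window = event_time - event_time % WINDOW_SECONDS
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            yield ((x + dx, y + dy), window), value


def reduce_by_key(records):
    """reduce step: sum values that share the same (cell, window) key."""
    totals = defaultdict(float)
    for key, value in records:
        totals[key] += value
    return dict(totals)


events = [(10, (0, 0), 1.0), (70, (0, 0), 1.0), (20, (1, 0), 1.0)]
totals = reduce_by_key(chain.from_iterable(fan_out(e) for e in events))
```

Here cell `(0, 0)` in the first window receives contributions from both the event at `(0, 0)` and the neighbouring event at `(1, 0)`. Each input event becomes nine records, which is the fanout cost raised later in the deck.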

23. No more periodic queries ➔ Flexible windows and trigger strategy ➔ Compute triggered by events only ➔ Materialized result pushed to consumer
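
To make the contrast with periodic polling concrete, here is a minimal plain-Python sketch of an aggregator that computes only when an event arrives and pushes the updated result to a consumer callback. The class name `PushAggregator`, the count-based trigger, and the `fire_every` parameter are hypothetical choices for this example; the talk does not describe the actual trigger strategy.

```python
from collections import defaultdict


class PushAggregator:
    """Keeps a running count per key and pushes it to a consumer callback
    once every `fire_every` events for that key (a simple count trigger)."""

    def __init__(self, on_update, fire_every=1):
        self._on_update = on_update
        self._fire_every = fire_every
        self._counts = defaultdict(int)
        self._since_fire = defaultdict(int)

    def on_event(self, key):
        # Work happens only here, when an event arrives; nothing polls.
        self._counts[key] += 1
        self._since_fire[key] += 1
        if self._since_fire[key] >= self._fire_every:
            self._since_fire[key] = 0
            self._on_update(key, self._counts[key])


pushed = []
aggregator = PushAggregator(
    on_update=lambda key, count: pushed.append((key, count)),
    fire_every=2,
)
for key in ["a", "a", "b", "a", "a"]:
    aggregator.on_event(key)
```

With `fire_every=2`, the consumer receives an update for key `"a"` after its second and fourth events, and nothing for `"b"`, which has only seen one event.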

24. No more bottleneck ➔ Avoid a single bottleneck in the dataflow ➔ Better isolation and independent scaling

25. Event-driven design concerns? ● Excessive fanout vs shuffle-free ● Specific topology vs generic query

26. Event-driven design concerns in Flink ● Excessive fanout vs shuffle-free ○ Virtual key ○ Memory management ● Specific topology vs generic query

27. Event-driven design concerns in Flink ● Excessive fanout vs shuffle-free ○ Virtual key ○ Memory management ● Specific topology vs generic query ○ DataStream API and dataflow language ○ Customized job per application

28. Online model updates

29. In the marketplace, there are the models that describe the world and the decision engines that act on those models.