Alibaba’s common algorithm platform on Flink
1 .Alibaba’s Common Algorithm Platform on Flink Xu Yang yangxu@alibaba-inc.com
2 .Agenda • Background • Why based on Flink/Blink • Platform Introduction • Demos
3 .Background • Alibaba Group • Alibaba Computing Platform • PAI (Platform of AI)
4 .Why based on Flink/Blink • More requirements on stream processing • Advanced Flink architecture • User – Low learning curve – Less coding – More functions
5 .Alibaba’s Common Algorithm Platform on Flink • Code Name: Alink - The common part of related words - Alibaba, Algorithm, AI, Flink, Blink • Currently supported algorithms - Statistics, Machine Learning, Recommendation, Outlier
6 .Alink Architecture (layered diagram, top to bottom) • Alink SDK & Web UI & Client & Visualization • Processing for Structural Data – Stream Operator – Batch Operator • Libraries, with variants for streaming and for batch – Alink Stat – Alink ML – Flink ML (Machine Learning) – Table (Relational) – Gelly (Graph Processing) – CEP (Event Processing) – Common Libs • APIs – DataStream API (Stream Processing) – DataSet API (Batch Processing) • Runtime – Distributed Streaming Dataflow • Deployment – Local (Single JVM) – Cluster (Standalone, YARN) – Cloud (GCE, EC2)
7 .Alink UI • Web UI – Drag-drop, easy to build workflow • Client – Local run – Edit and run script • Console – Client without GUI
17 .Alink UI • Web UI – Drag-drop, easy to build workflow • Client – Local run – Edit and run script • Console – Client without GUI
19 .Local Run! Cluster Run!
20 .Alink UI • Web UI – Drag-drop, easy to build workflow • Client – Local run – Edit and run script • Console – Client without GUI
21 .Alink Functions (Part 1 of 3) • Statistics and Visualization - Current and History - Basic Statistics • Mean, Variance, StdVar, CV, StdErr, Moment, Central Moment, Skewness, Kurtosis • Histogram, TopK, BottomK, Frequency, Percentile, Quantile, Median, Mode • Covariance, Coefficient of Correlation, Cross Table, Ranking List - Statistical Analysis • PCA, Correspondence Analysis, Multi-collinearity • T-Test, Chi2-Test, KS-Test, AD-Test
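To make a few of the basic statistics listed above concrete, here is a minimal Python sketch. It is not Alink code: the function name `basic_stats` is hypothetical, and the conventions chosen (sample variance with n - 1, moment-based skewness and kurtosis without bias correction) are assumptions, since the slides do not say which definitions Alink uses.

```python
import math


def basic_stats(xs):
    """Summary statistics for a non-empty list of numbers (illustrative only)."""
    n = len(xs)
    mean = sum(xs) / n
    # Central moments about the mean, population form (divide by n).
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    # Sample variance (divide by n - 1); an assumed convention.
    variance = m2 * n / (n - 1) if n > 1 else 0.0
    std = math.sqrt(variance)
    nan = float("nan")
    return {
        "mean": mean,
        "variance": variance,
        "std": std,
        "cv": std / mean if mean != 0 else nan,      # coefficient of variation
        "stderr": std / math.sqrt(n),                # standard error of the mean
        "skewness": m3 / m2 ** 1.5 if m2 > 0 else nan,
        "kurtosis": m4 / m2 ** 2 if m2 > 0 else nan,  # not excess kurtosis
    }
```

For a symmetric sample such as `[1, 2, 3, 4, 5]` the skewness comes out as zero, which is a quick sanity check on the moment formulas.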
22 .Demo for Statistics and Visualization • IJCAI-17 Dataset - https://tianchi.aliyun.com/datalab/index.htm - Trading amounts and locations of Alipay users - 19.6 million users, 67 million trades
23 .Stat Demo: Current and History • AllStat for History: statistics from the start of the stream to now • WindowStat for Current: statistics over the last 3 seconds • Trading amounts • Frequency of shop_level
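The difference between the two views can be sketched in a few lines of plain Python. The class names are borrowed from the slide, but the interfaces below are hypothetical and are not Alink's API; the sketch also assumes events arrive in timestamp order and tracks only the mean.

```python
from collections import deque


class AllStat:
    """History view: running mean over every value seen since the start."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def add(self, value):
        self.count += 1
        # Incremental mean update, so no past values need to be stored.
        self.mean += (value - self.mean) / self.count
        return self.mean


class WindowStat:
    """Current view: mean over events in the last `span` seconds."""

    def __init__(self, span=3.0):
        self.span = span
        self.events = deque()  # (timestamp, value) pairs inside the window
        self.total = 0.0

    def add(self, timestamp, value):
        self.events.append((timestamp, value))
        self.total += value
        # Evict events that have fallen out of the window.
        while self.events[0][0] <= timestamp - self.span:
            _, old = self.events.popleft()
            self.total -= old
        return self.total / len(self.events)
```

The history view needs only constant state, while the windowed view has to remember every event still inside the window so that it can subtract it on eviction.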
24 .Data Frequency & Count Demo
25 .Stat Demo: Distribution • Get 2 data streams: shop_level=‘low’, shop_level=‘high’ • Consider 2 features: comment_cnt and pay • Probabilistic Distribution
26 .Stat Demo: Distribution
27 .Stat Demo: Relationship of Features • Numerical Features: pay, comment_cnt and shop_level_int - Multicollinearity, Coefficient of Correlation • Categorical Features: province and shop_level - Correspondence Analysis, Cross Table
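Two of the measures named above are simple enough to sketch directly: the Pearson coefficient of correlation for a pair of numerical features, and a cross table (contingency counts) for a pair of categorical features. This is an illustrative Python sketch with hypothetical function names, not the Alink implementation.

```python
import math
from collections import Counter


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return float("nan")  # undefined when a feature is constant
    return cov / (sx * sy)


def cross_table(rows, cols):
    """Count co-occurrences of two categorical features."""
    return Counter(zip(rows, cols))
```

In the demo's terms, `pearson` would be applied to columns such as pay and comment_cnt, and `cross_table` to province and shop_level.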
28 .Stat Demo: Relationship of Features
29 .Stat Demo: Ranking List • Provinces ranked by user count • Provinces ranked by sum of pay, shown on a map • Catalogs ranked by sum of pay
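A ranking list of this kind amounts to a group-by, an aggregate, and a sort. The Python sketch below shows that shape for a batch of records; the function name and the record layout (dictionaries with a group field and a numeric field) are assumptions for illustration, not Alink's interface.

```python
from collections import defaultdict


def ranking_list(records, key, value, top=3):
    """Sum `value` per distinct `key` and return the `top` largest groups."""
    totals = defaultdict(float)
    for record in records:
        totals[record[key]] += record[value]
    # Sort groups by their total, largest first.
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top]
```

Ranking by user count instead of sum of pay would follow the same pattern, with each record contributing 1 per distinct user rather than its pay amount.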