Zeppelin 2019: Latest Machine Learning Features and Roadmap

  1. Zeppelin's latest progress in machine learning and its plans for later releases;
  2. New features in the upcoming Zeppelin 0.9.0 release:
    • Distributed (cluster) mode
    • Docker mode
    • Redesigned Terminal
    • Hadoop Submarine interpreter

1.ZEPPELIN Machine Learning: Latest Features and Roadmap Liu Xun (刘勋) Apache Zeppelin Committer

2.Self-introduction Liu Xun (刘勋): Apache Zeppelin Committer, Apache Hadoop Submarine Project Team Member, Staff Engineer @NetEase

3.Contents What Is Apache Zeppelin? Zeppelin Machine Learning Zeppelin New Features

4.WHAT IS APACHE ZEPPELIN? Data Ingestion, Data Discovery, Data Analytics, Data Visualization, Data Collaboration

5.Multiple Language Backend • The interpreter concept allows any language or data-processing backend to be plugged into Zeppelin. • Currently Apache Zeppelin supports many interpreters such as Apache Spark, Python, JDBC, Markdown, Shell and more. • Adding a new language backend is really simple.
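The pluggable-backend idea can be illustrated with a small sketch. In a Zeppelin note, a paragraph selects its backend with a leading `%name` (for example `%md` or `%python`). The registry, the function names, and the `spark` fallback below are hypothetical and only illustrate the dispatch pattern; they are not Zeppelin's actual interpreter API.

```python
from typing import Callable, Dict, Tuple

# Hypothetical registry mapping an interpreter name to a handler function.
INTERPRETERS: Dict[str, Callable[[str], str]] = {}


def register(name: str, handler: Callable[[str], str]) -> None:
    """Plug a new language backend into the registry."""
    INTERPRETERS[name] = handler


def split_paragraph(text: str, default: str = "spark") -> Tuple[str, str]:
    """Split a paragraph into (interpreter name, body) using a leading %name.

    The default interpreter name is an assumption made for this sketch.
    """
    stripped = text.lstrip()
    if stripped.startswith("%"):
        head, _, body = stripped.partition("\n")
        # The first line may be "%md" alone or "%md text on the same line".
        name, _, rest = head[1:].partition(" ")
        body = (rest + "\n" + body) if rest else body
        return name, body.strip()
    return default, stripped.strip()


def run_paragraph(text: str) -> str:
    """Dispatch a paragraph to whichever backend its prefix names."""
    name, body = split_paragraph(text)
    if name not in INTERPRETERS:
        raise KeyError(f"no interpreter registered for %{name}")
    return INTERPRETERS[name](body)


# Two toy backends: adding one is a single register() call.
register("md", lambda body: f"<p>{body}</p>")
register("upper", lambda body: body.upper())
```

Calling `run_paragraph("%md\nhello")` routes the body to the toy Markdown handler; a new backend only needs one more `register` call.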

6. Data visualization Some basic charts are already included in Apache Zeppelin. Visualizations are not limited to SparkSQL queries; any output from any language backend can be recognized and visualized.
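As a minimal sketch of how plain output becomes a chart: Zeppelin's display system renders text that starts with `%table` as a table, with columns separated by tabs and rows by newlines. The helper name `to_zeppelin_table` and the sample data are made up for illustration.

```python
from typing import Iterable, Sequence


def to_zeppelin_table(header: Sequence[str], rows: Iterable[Sequence[object]]) -> str:
    """Format rows for Zeppelin's table display.

    Columns are tab-separated, rows are newline-separated, and the whole
    output is prefixed with %table so the front end renders it as a table.
    """
    lines = ["\t".join(header)]
    for row in rows:
        lines.append("\t".join(str(cell) for cell in row))
    return "%table " + "\n".join(lines)


# Printing this from a paragraph of any backend yields a table that can
# then be switched to one of the built-in chart types.
output = to_zeppelin_table(["name", "size"], [["sun", 100], ["moon", 10]])
print(output)
```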

7.Pivot chart • Apache Zeppelin aggregates values and displays them in a pivot chart with simple drag and drop. • You can easily create a chart with multiple aggregated values, including sum, count, average, min and max.
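The aggregation behind a pivot chart can be sketched as a group-by followed by one of the five listed functions. This is an illustration of the idea in plain Python, not Zeppelin's front-end code; the `pivot` function and the sample rows are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Mapping

# The five aggregations named on the slide.
AGGREGATORS: Dict[str, Callable[[List[float]], float]] = {
    "sum": sum,
    "count": len,
    "average": lambda xs: sum(xs) / len(xs),
    "min": min,
    "max": max,
}


def pivot(rows: Iterable[Mapping[str, object]], key: str, value: str, how: str) -> Dict[object, float]:
    """Group rows by `key` and aggregate the `value` column with `how`."""
    groups: Dict[object, List[float]] = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    aggregate = AGGREGATORS[how]
    return {group: aggregate(values) for group, values in groups.items()}


sales = [
    {"city": "A", "sales": 10},
    {"city": "B", "sales": 5},
    {"city": "A", "sales": 30},
]
print(pivot(sales, key="city", value="sales", how="sum"))
```

Dragging a column into the "keys" or "values" area of the chart corresponds to choosing `key`, `value`, and `how` here.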

8.Dynamic forms Apache Zeppelin can dynamically create some input forms in your notebook.
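A rough sketch of the mechanism: in paragraph text, Zeppelin's text-input form is written as `${formName=defaultValue}`. The regular expression and the `render_forms` function below are assumptions made for illustration and handle only this text-input form, not select or checkbox forms.

```python
import re
from typing import Dict, Mapping, Tuple

# Matches ${name} and ${name=default}; a simplification for this sketch.
FORM_PATTERN = re.compile(r"\$\{(\w+)(?:=([^}]*))?\}")


def render_forms(text: str, values: Mapping[str, str]) -> Tuple[str, Dict[str, str]]:
    """Return the paragraph with forms filled in, plus the forms found.

    `values` holds what the user typed into each input box; any form
    without a user value falls back to its default.
    """
    forms: Dict[str, str] = {}

    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        default = match.group(2) or ""
        forms[name] = default
        return values.get(name, default)

    return FORM_PATTERN.sub(substitute, text), forms


query = "SELECT * FROM people WHERE age < ${maxAge=30}"
print(render_forms(query, {}))
print(render_forms(query, {"maxAge": "40"}))
```

The second return value is what a front end would need to draw the input boxes; the first is what would be sent to the backend.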

9.Collaborate by sharing your Notebook & Paragraph Your notebook URL can be shared among collaborators. Apache Zeppelin will then broadcast any changes in real time, just like collaboration in Google Docs.

10.Contents What Is Apache Zeppelin? Zeppelin Machine Learning Zeppelin New Features

11. Zeppelin Architecture
    • Interactive Development: Zeppelin
    • Computing Engine: TensorFlow, PyTorch, Python, R / Scala, Hive, Spark, Flink
    • Resource Manager: Kubernetes, YARN, Zeppelin Cluster
    • Infrastructure: HDFS, AWS S3, Docker, CPU, GPU

12.Machine Learning in a Unified Platform

13. Machine learning workflow
    • Data Preprocessing: Data
    • Feature Engineering: Feature Selection, Feature Transform, Feature Encoding, Feature Evaluation
    • Model Training: Model Training, Model Evaluation, Model Validation, Model Staging
    • Online Service: Experiment, Model as Service, Model Database, Real-time Feature, Online Calibration

14. Data Preprocessing & Feature Engineering
    • Import data: HDFS, AWS S3, RDBMS
    • Join data
    • Data exploration
    • Data sample
    • Training / Test
    • Workflow stages covered: Data Preprocessing (Data) and Feature Engineering (Feature Selection, Feature Transform, Feature Encoding, Feature Evaluation)
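Two of the steps named on this slide, the training / test split and feature encoding, can be sketched in a few lines of plain Python. This is only an illustration of the concepts under assumed defaults (a 20% test ratio, a fixed seed, one-hot encoding as the encoding scheme); in a Zeppelin note these steps would more likely use Spark MLlib or a Python library.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def train_test_split(rows: Sequence[T], test_ratio: float = 0.2, seed: int = 42) -> Tuple[List[T], List[T]]:
    """Shuffle rows reproducibly and split them into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


def one_hot(values: Sequence[str]) -> List[List[int]]:
    """Encode a categorical column as one-hot vectors.

    Categories are ordered alphabetically so the encoding is deterministic.
    """
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    return [[1 if index[v] == i else 0 for i in range(len(categories))] for v in values]


train, test = train_test_split(list(range(10)))
print(len(train), len(test))
print(one_hot(["hdfs", "s3", "hdfs"]))
```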

15. Model Training
    • Traditional machine learning models: Logistic Regression, Gradient boosting tree, Recommendation/ALS, LDA
    • Libraries: Python Lib, Apache Spark MLlib, XGBoost
    • Deep learning models: DNN, CNN, RNN, LSTM
    • Libraries: TensorFlow, PyTorch, MXNet
    • Workflow stages covered: Model Training, Model Evaluation, Model Validation, Model Staging

16.Top Python, Scala and R Libraries in Data Science

17.Model Serving
    • Model Manager
    • Model deploy
    • Model serving: Batch, Streaming
    • Exploration: offline, online (A / B test)
    • Workflow stages covered (Online Service): Experiment, Model as Service, Model Database, Real-time Feature, Online Calibration

18. Zeppelin Integration Hadoop Submarine: Algorithm development, Job scheduling, TensorBoard, Monitor; {Submarine} CLI / REST; User


20.Submarine Integration Zeppelin


22.Model Serving (ZEPPELIN-3994)

23.Contents What Is Apache Zeppelin? Zeppelin Machine Learning Zeppelin New Features

24.Zeppelin Cluster Mode (ZEPPELIN-3471)

Distributed Zeppelin architecture diagram (components shown: NotebookRepo; Zeppelin Cluster with Zeppelin Servers 1..N connected by Raft; Cluster MetaData mapping interpreter-process1..M to thrift1..M (ip & port); interpreter processes 1..M; Nginx master and backup; switch; domain name; users)
    • Multiple Zeppelin Servers (Zepl-Server1, Zepl-Server2, ..., Zepl-ServerN) are built into a Zeppelin Cluster by the Raft algorithm. The Raft algorithm ensures that all Zeppelin Servers can access the cluster metadata (Cluster Meta).
    • The Zeppelin Servers are proxied through Nginx, and a domain name is mapped to the Nginx server, so that accessing the domain name reaches one of the Zeppelin Servers behind Nginx.
    • When users access Zeppelin Server through the Internet, Nginx logs each user in to a Zeppelin Server according to the distribution policy, so different users may use different servers (in the figure, one user uses one Zepl-Server and another uses Zepl-ServerN).
    • Nginx makes sure the user is connected to a valid Zeppelin Server by checking whether each Zeppelin Server in the Zeppelin Cluster is accessible.
    • The user creates the interpreter process they need through Zeppelin, and the Interpreter Process saves the Thrift IP & Port that it offers for Zeppelin Server connections into Cluster Meta as metadata.

Distributed Zeppelin Server fault tolerance diagram (operations shown: reconnect Intp thrift, get Intp Meta)
Description: In order to explain the process more clearly, the content that is not related to service fault tolerance is deleted.
  1. When zepl-Server1 has an exception and the Interpreter Process is still available, Nginx detects that zepl-Server1 has an exception and brings it offline.
  2. When a user (such as User1) who has been accessing zepl-Server1 all along executes the note again, Nginx redirects the user's HTTP request to another healthy Zeppelin Server, shown in the figure as Zepl-Server2.
  3. Zepl-Server2 has no Interpreter Process or Session information for User1. Zepl-Server2 first looks up the Interpreter Process metadata of User1 in the cluster metadata (Cluster Meta). If it is found, the required Interpreter Process still exists in the Zeppelin Cluster.
  4. Zepl-Server2 re-creates the RemoteInterpreterRunningProcess from the Thrift IP & Port in User1's Interpreter Process metadata and connects to it, reusing the previously created Interpreter Process.

Distributed Zeppelin Server & Interpreter Process fault tolerance diagram (operations shown: delete intp meta, new intp meta, renew intp)
Description: In order to explain the process more clearly, the content that is not related to service fault tolerance is deleted.
  1. When an exception occurs in both zepl-Server1 and the Interpreter Process, Nginx detects that zepl-Server1 has an exception and brings it offline. When the Interpreter Process program exits, it deletes its own metadata from Cluster Meta. If it exits without successfully deleting its metadata, the other Zeppelin Servers delete the unavailable metadata through periodic health checks.
  2. When a user (such as User1) who has been accessing zepl-Server1 all along executes the note again, Nginx redirects the user's HTTP request to another healthy Zeppelin Server, shown in the figure as Zepl-Server2.
  3. Zepl-Server2 has no Interpreter Process or Session information for User1. Zepl-Server2 first looks up User1's Interpreter Process metadata in the cluster metadata (Cluster Meta). If it is not found, Zepl-Server2 re-creates the RemoteInterpreterRunningProcess.
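The failover decision described on this slide (look up the user's interpreter metadata in Cluster Meta, reuse the process if it is found, otherwise create a new one) can be sketched as follows. All class and method names here are hypothetical, the port numbers are made up, and a plain dictionary stands in for the Raft-replicated metadata; this is not Zeppelin's implementation.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class InterpreterMeta:
    """Thrift endpoint that an interpreter process publishes to Cluster Meta."""
    host: str
    port: int


class ClusterMeta:
    """Stand-in for the Raft-replicated cluster metadata (here just a dict)."""

    def __init__(self) -> None:
        self._interpreters: Dict[str, InterpreterMeta] = {}

    def put(self, user: str, meta: InterpreterMeta) -> None:
        self._interpreters[user] = meta

    def get(self, user: str) -> Optional[InterpreterMeta]:
        return self._interpreters.get(user)

    def delete(self, user: str) -> None:
        self._interpreters.pop(user, None)


class ZeppelinServer:
    def __init__(self, name: str, cluster_meta: ClusterMeta) -> None:
        self.name = name
        self.cluster_meta = cluster_meta
        self._next_port = 30000  # hypothetical port range for this sketch

    def launch_interpreter(self, user: str) -> InterpreterMeta:
        """Start a new interpreter process and publish its Thrift endpoint."""
        meta = InterpreterMeta(host=self.name, port=self._next_port)
        self._next_port += 1
        self.cluster_meta.put(user, meta)
        return meta

    def attach(self, user: str) -> Tuple[InterpreterMeta, bool]:
        """Return (endpoint, reused): reuse a live process or create one."""
        meta = self.cluster_meta.get(user)
        if meta is not None:
            return meta, True  # reconnect to the existing process over Thrift
        return self.launch_interpreter(user), False


cluster_meta = ClusterMeta()
server1 = ZeppelinServer("zepl-server1", cluster_meta)
server2 = ZeppelinServer("zepl-server2", cluster_meta)
server1.launch_interpreter("user1")
print(server2.attach("user1"))  # server1 is down, process survives: reused
cluster_meta.delete("user1")    # process died and its metadata was removed
print(server2.attach("user1"))  # nothing to reuse: a new process is created
```

Keeping the endpoint in shared metadata rather than in the memory of one server is what lets any surviving server pick up a user's running interpreter.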

25.Zeppelin Cluster Mode (ZEPPELIN-3471)

26.Zeppelin Cluster Mode (ZEPPELIN-3471)

27.Zeppelin Cluster Mode (ZEPPELIN-3471)

28.Zeppelin Cluster Mode (ZEPPELIN-3471)

29. Zeppelin Cluster + Docker (ZEPPELIN-4104) Architecture diagram (components shown: NotebookRepo; Zeppelin Cluster with Zeppelin Servers 1..N connected by Raft; Cluster MetaData mapping interpreter-process1..M to thrift1..M (ip & port); interpreter processes 1..M running in Docker containers; Nginx master and backup; switch; domain name; users)