A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, and Deep Learning Pipelines
1 .A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, and Deep Learning Pipelines Brooke Wenig Jules S. Damji Spark + AI Summit, SF 6/5/2018
2 .About Us
Jules S. Damji • Apache Spark Developer & Community Advocate @ Databricks • Program Chair, Spark + AI Summit • Software engineering @ Sun Microsystems, Netscape, @Home, VeriSign, Scalix, Centrify, LoudCloud/Opsware, ProQuest • https://www.linkedin.com/in/dmatrix • @2twitme
Brooke Wenig • Machine Learning Instructor & Data Science Solution Consultant @ Databricks • Software Engineering @ Splunk & MyFitnessPal • MS Machine Learning (UCLA) • Fluent in Chinese • https://www.linkedin.com/in/brookewenig/
3 .Agenda for Today’s Talk • Impact of Big Data • Why Apache Spark? • Short Survey of 3 DL Frameworks • TensorFlow • Keras • Deep Learning Pipelines • Demo • Q&A
4 .What has Big Data Done to Us? • Permeated our lives (Source: MIT)
5 .Hardest Part of AI isn’t AI, it’s Data • Source: “Hidden Technical Debt in Machine Learning Systems,” Google, NIPS 2015 • Figure 1: Only a small fraction of real-world ML systems is composed of the ML code. The required surrounding infrastructure is vast and complex. [Figure boxes: Configuration, Data Collection, Data Verification, Feature Extraction, ML Code, Machine Resource Management, Analysis Tools, Process Management Tools, Serving Infrastructure, Monitoring]
6 .What’s Apache Spark & Why
7 .Apache Spark: The First Unified Analytics Engine • Uniquely combines Data & AI technologies • Spark Core Engine • Big Data Processing: ETL + SQL + Streaming • Machine Learning: MLlib + SparkR [Other diagram labels: Runtime, Delta]
8 .Survey of Three Deep Learning Frameworks
9 .What’s TensorFlow? • Open source from Google, 2015 • Current v1.8 API • Fast: Backend C/C++ • Data flow graphs • Nodes are functions/operators • Edges are input or data (tensors) • Lazy execution • Eager execution (1.7)
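The graph model on this slide (nodes as operators, edges as data, lazy versus eager execution) can be illustrated with a toy sketch in plain Python. This is not TensorFlow's API: `Node`, `placeholder`, `constant`, `add`, and `multiply` are hypothetical helpers invented for this illustration, and the sketch only shows the idea that building a graph and running it are separate steps.

```python
class Node:
    """A graph node: an operator applied to the values of its input nodes."""

    def __init__(self, op, *inputs):
        self.op = op          # function computed at this node
        self.inputs = inputs  # upstream nodes; the edges carry their values

    def run(self, feed):
        # Evaluate upstream nodes first, then apply this node's operator.
        args = [node.run(feed) for node in self.inputs]
        return self.op(feed, *args)


def placeholder(name):
    # The value is supplied only at run time through the feed dictionary.
    return Node(lambda feed: feed[name])


def constant(value):
    return Node(lambda feed: value)


def add(a, b):
    return Node(lambda feed, x, y: x + y, a, b)


def multiply(a, b):
    return Node(lambda feed, x, y: x * y, a, b)


# Lazy execution: building the graph computes nothing yet.
x = placeholder("x")
y = add(multiply(x, constant(3)), constant(2))  # describes y = 3x + 2

# The computation happens only when the graph is run with concrete inputs.
lazy_result = y.run({"x": 4})

# Eager execution: each operation is evaluated immediately, like ordinary Python.
eager_result = 4 * 3 + 2

print(lazy_result, eager_result)
```

In the lazy style the same graph can be re-run with different inputs, which is what lets a framework optimize or distribute it before execution; the eager style trades that for easier debugging.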
10 .TensorFlow Programming Stack [Diagram labels: Use canned estimators; Build models; Keras Models; targets: CPU, GPU, TPU, Android, iOS, …]
11 .Why TensorFlow: Community • 100K+ stars! • 11M downloads • Popular open-source code • TensorFlow Hub & Blog ○ Code Examples & Tutorials! ○ Learn + share from others
12 .Why TensorFlow: Tools • TensorBoard • Deploy + Serve Models • Visualize Tensors flow
13 .TensorFlow: We Get it … So What? • Steep learning curve, but powerful!! • Low-level APIs, but offers control!! • Expert in Machine Learning, just learn!! • Yet, high-level Estimators help, you bet!! • Better, Keras integration helps, indeed!!
14 .What’s Keras? • Open source Python Library APIs for Deep Learning • Current v2.1.6 APIs • Created by François Chollet (Google) • API spec with backends: TensorFlow, CNTK and Theano • Easy to Use High-Level Declarative APIs! • Build layers – Great for Neural Network Applications • Fast Experimentation, Modular & Extensible!
15 .Keras Programming Stack [Diagram labels: Keras API Specification; backend-specific implementations: TF-Keras, Theano-Keras, CNTK, .....; Use canned estimators; models; TensorFlow Workflow; targets: CPU, GPU, TPU, Android, iOS, …]
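16 .Why Keras? • Focuses on Developer Experience • Popular & Broader Community • Supports multiple backends • Modularity • Sequential Layers • Multi-layer input networks
    from keras.models import Sequential
    from keras.layers import Dense, Activation
    model = Sequential()
    model.add(Dense(32, input_dim=784))  # first layer declares the input size
    model.add(Activation('relu'))
    model.add(Dense(32, activation='softmax'))  # pass a layer instance, not the class
    ...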
17 . Transfer Learning & Deep Learning Pipelines
18 .What’s Transfer Learning? • Training from scratch requires ○ Enormous amounts of data ○ A lot of compute resources & time • IDEA: Intermediate representations learned for one task may be useful for other related tasks
19 .Trained Model • SoftMax output: GIANT PANDA 0.9, RACCOON 0.05, RED PANDA 0.01, …
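The class scores on this slide are the kind of output a softmax layer produces: raw scores (logits) turned into probabilities that sum to 1. A minimal sketch of that step, in plain Python, follows. The logit values are hypothetical and are not chosen to reproduce the slide's numbers.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    # Subtracting the maximum keeps exp() from overflowing on large scores.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


labels = ["giant panda", "raccoon", "red panda"]
logits = [4.0, 1.1, -0.5]  # hypothetical raw scores from the last layer

probs = softmax(logits)
for label, p in zip(labels, probs):
    print(f"{label}: {p:.3f}")

# The predicted class is the one with the highest probability.
best = labels[probs.index(max(probs))]
print("prediction:", best)
```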
20 .Transfer Learning as a Pipeline Classifier Dog/Cat?
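The pipeline idea here, a frozen pre-trained featurizer followed by a small classifier trained on its output, can be sketched in plain Python. This is a toy illustration and not the Deep Learning Pipelines API: `FROZEN_WEIGHTS` is a hypothetical stand-in for a pre-trained network with its final layer removed, and the eight labeled points are made-up illustrative data.

```python
import math

# Stand-in for a pre-trained network (hypothetical weights).
# They are frozen: nothing below ever updates them.
FROZEN_WEIGHTS = [[1.0, -1.0], [0.5, 0.5]]


def featurize(x):
    """Stage 1: map a raw input to features with one frozen ReLU layer."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)))
            for row in FROZEN_WEIGHTS]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_classifier(features, labels, epochs=1000, lr=0.1):
    """Stage 2: fit logistic regression on the features only."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for f, y in zip(features, labels):
            p = sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b)
            err = p - y  # gradient of the log loss w.r.t. the score
            w = [wi - lr * err * fi for wi, fi in zip(w, f)]
            b -= lr * err
    return w, b


def predict(x, w, b):
    """Run the whole pipeline: featurizer, then classifier (1 = dog, 0 = cat)."""
    f = featurize(x)
    score = sum(wi * fi for wi, fi in zip(w, f)) + b
    return 1 if sigmoid(score) >= 0.5 else 0


# Made-up toy inputs: label 1 = dog, label 0 = cat.
inputs = [(2.0, 0.5), (3.0, 1.0), (1.5, 0.2), (2.5, 1.0),
          (0.5, 2.0), (1.0, 3.0), (0.2, 1.5), (1.0, 2.5)]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

features = [featurize(x) for x in inputs]   # featurizer is applied, not trained
w, b = train_classifier(features, labels)   # only the classifier learns

train_predictions = [predict(x, w, b) for x in inputs]
print(train_predictions)
print(predict((2.2, 0.6), w, b), predict((0.6, 2.2), w, b))
```

Only the two classifier weights and the bias are learned, which is why this approach needs far less data and compute than training the whole network from scratch.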
21 . When to use Transfer Learning? • Dataset is small & similar • Dataset is large & similar • Dataset is small but different • Dataset is large and different Source: Andrej Karpathy’s Transfer Learning
22 .What & Why Deep Learning Pipelines (DLP)? • Open source from Databricks, 2017 • Current v1.0 APIs w/ Apache Spark 2.3 • Primarily in Python • Ease of Use & Integration • Spark MLlib Pipelines & DataFrames • TensorFlow & Keras • SQL – Deploying & Evaluating • Distributed Hyperparameter Tuning • Easy for Transfer Learning
23 . DEMO https://dbricks.co/dlf_sai_2018
24 .Takeaways: Which One & What Language?
25 .Takeaways: When to Use TF, Keras or DLP
TensorFlow • Low-level APIs & Control • Visualize with TensorBoard • Train Models or Transfer Learning • Model Serving
Keras • High-level APIs • TensorFlow Backend • Love Python • Train models or transfer learning
Deep Learning Pipelines • Integration with Spark MLlib Pipelines & DataFrames • Integrated with TF & Keras • Transfer Learning
26 .Resources
Blog posts, talks & webinars (http://databricks.com/blog) • Deep Learning Pipelines • GPU acceleration in Databricks • Deep Learning and Apache Spark • Build Scalable Deep Learning Pipelines • Deep Learning course: fast.ai • TensorFlow Tutorials • TensorFlow Dev Summit • Keras/TensorFlow Tutorials • MLFlow.org
Docs for Deep Learning on Databricks (http://docs.databricks.com) • Deep Learning Pipelines Example • Apache Spark integration
27 . Thank You! Questions? brooke@databricks.com jules@databricks.com (@2twitme)