Distributed Inference on Large Datasets Using Apache MXNet and Apache Spark

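With the abundance of data and the commoditization of compute and storage, deep learning has become ubiquitous. Pre-trained models are readily available for many use cases. Distributed inference has many applications, such as pre-computing results offline and backfilling historical data with predictions from existing models. Inference on large-scale datasets faces many of the challenges of distributed data processing.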

1. Distributed Deep Learning Inference using Apache MXNet and Apache Spark
Naveen Swamy, Amazon AI

2. Outline
• Review of Deep Learning
• Apache MXNet Framework
• Distributed Inference using MXNet and Spark

3. Deep Learning
• Originally inspired by our biological neural systems.
• A system that learns important features from experience.
• Layers of neurons learning concepts.
• Deep learning != deep understanding
Figure: layered network. Input layer (raw pixels), 1st hidden layer (edges), 2nd hidden layer (corners & contours), 3rd hidden layer (object parts), output (object identity: CAR, PERSON, DOG). Credit: Ian Goodfellow et al., Deep Learning Book

4.
• Algorithmic Advances (Faster Learning)
• Abundance of Data (Deeper Networks)
• High Performance Compute, GPUs (Faster Experiments)
Bigger and Better Models = Better AI Products

5. Why does Deep Learning matter?
• Health care
• Autonomous Vehicles
• Personal Assistants
• Solve Intelligence ???

6. Deep Learning & AI, Limitations
DL Limitations:
• Requires lots of data and compute power.
• Cannot detect inherent bias in data (transparency).
• Uninterpretable results.
Figure labels: Artificial Intelligence, Machine Learning, Deep Learning

7. Deep Learning Training
• Pass data through the network – forward pass
• Define an objective – loss function
• Send the error back – backward pass
Model: output of training a neural network
Figure: training loop (data, forward, prediction "dog?", error against the label "dog", backward) and a small network with inputs X1, X2, hidden units h1, h2, and weights w1 to w6, annotated with a forward pass and a backward pass; example values y = 1.0, y` = 0.9, loss = y – y` = 0.1
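
The three training steps on this slide can be sketched in a few lines of plain Python. This is a minimal illustration on a single linear neuron with a squared-error loss, not the network or the numbers from the slide's figure; the inputs, target, initial weights, learning rate, and all function names are assumptions chosen for the example.

```python
# Toy training loop: forward pass, loss, backward pass (gradient descent).
# All values and names here are assumptions for illustration only.

def forward(weights, inputs):
    """Forward pass: weighted sum of the inputs."""
    return sum(w * x for w, x in zip(weights, inputs))


def loss_fn(y_true, y_pred):
    """Objective: squared error between label and prediction."""
    return 0.5 * (y_true - y_pred) ** 2


def backward(y_true, y_pred, inputs):
    """Backward pass: gradient of the loss with respect to each weight."""
    # d(loss)/d(y_pred) = (y_pred - y_true) and d(y_pred)/d(w_i) = x_i
    return [(y_pred - y_true) * x for x in inputs]


def train(inputs, y_true, weights, lr=0.1, steps=50):
    """Repeat forward -> loss -> backward -> update; return weights and loss history."""
    history = []
    for _ in range(steps):
        y_pred = forward(weights, inputs)
        history.append(loss_fn(y_true, y_pred))
        grads = backward(y_true, y_pred, inputs)
        weights = [w - lr * g for w, g in zip(weights, grads)]
    return weights, history


if __name__ == "__main__":
    trained, losses = train(inputs=[1.0, 2.0], y_true=1.0, weights=[0.5, 0.5])
    print("first loss:", losses[0], "last loss:", losses[-1])
```

The trained weights are the "model" in the slide's sense: the output of training, which inference later reuses with the forward pass only.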

8. Deep Learning Inference
• Real-time inference: tasks that require an immediate result.
• Batch inference: tasks that need to run on large datasets.
  o Pre-computations are necessary - Recommender Systems.
  o Backfilling with state-of-the-art models.
  o Testing new models on historic data.
Figure: forward pass through the model producing the prediction "dog"

9. Types of Learning
• Supervised Learning – Uses labeled training data to learn to associate input data with output. Examples: image classification, speech recognition, machine translation.
• Unsupervised Learning – Learns patterns from unlabeled data. Examples: clustering, association discovery.
• Active Learning – Semi-supervised, human in the middle.
• Reinforcement Learning – Learns from the environment, using rewards and feedback.

10. Outline
• Apache MXNet Framework
• Distributed Inference using MXNet and Spark

11. Why MXNet

12. MXNet – NDArray & Symbol
• NDArray – Imperative tensor operations that work on both CPUs and GPUs.
• Symbol APIs – Similar to NDArray, but adopt declarative programming for optimization.
Figure: Symbolic Program, Computation Graph
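
The difference between the imperative and declarative styles named on this slide can be illustrated with a toy example in plain Python. This is not MXNet's API: `Node`, `var`, and `evaluate` are hypothetical names, and the sketch only shows the idea that a declarative program first builds a computation graph and runs it later, once data is bound to its placeholders.

```python
# Toy contrast between imperative and declarative (symbolic) execution.
# NOT MXNet's API: Node, var and evaluate are hypothetical names.

def imperative(a, b):
    """Imperative style: every operation runs immediately on concrete values."""
    c = a + b
    return c * 2


class Node:
    """One vertex of a computation graph; building it computes nothing."""

    def __init__(self, op, args=(), name=None, value=None):
        self.op, self.args, self.name, self.value = op, args, name, value

    def __add__(self, other):
        return Node("add", (self, _wrap(other)))

    def __mul__(self, other):
        return Node("mul", (self, _wrap(other)))


def _wrap(x):
    """Turn plain numbers into constant nodes."""
    return x if isinstance(x, Node) else Node("const", value=x)


def var(name):
    """Placeholder that receives its value only when the graph is evaluated."""
    return Node("var", name=name)


def evaluate(node, bindings):
    """Bind data to the placeholders and run the whole graph."""
    if node.op == "var":
        return bindings[node.name]
    if node.op == "const":
        return node.value
    left, right = (evaluate(arg, bindings) for arg in node.args)
    return left + right if node.op == "add" else left * right


if __name__ == "__main__":
    print(imperative(1.0, 2.0))
    graph = (var("a") + var("b")) * 2  # only describes the computation
    print(evaluate(graph, {"a": 1.0, "b": 2.0}))
```

Because the whole graph exists before any data flows through it, a framework has the chance to analyze and optimize it as a unit, which is the trade-off the slide attributes to the Symbol APIs.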

13. MXNet – Module
High-level APIs to work with Symbol:
1) Create Graph
2) Bind
3) Pass data

14. Outline
• Distributed Inference using MXNet and Spark

15. Distributed Inference Challenges
• Similar to large scale data processing systems
Challenges: High Performance DL framework, Distributed Cluster, Resource Management, Job Management, Efficient Partition of Data, Deep Learning Setup
Apache Spark:
• Multiple Cluster Managers
• Works well with MXNet.
• Integrates with Hadoop & big data tools.

16. MXNet + Spark for Inference
• ImageNet-trained ResNet-18 classifier.
• For the demo, the CIFAR-10 test dataset with 10K images.
• PySpark on Amazon EMR; MXNet is also available in Scala.
• Inference on CPUs; can be extended to use GPUs.

17. Distributed Inference Pipeline
1) download S3 keys on driver
2) create RDD and partition
3) fetch batch of images on executor
4) decode to numpy array
5) run prediction
6) collect predictions
mapPartitions: initialize model only once
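
The executor side of this pipeline can be sketched as a generator function of the kind one would pass to `mapPartitions`. This is a minimal sketch under stated assumptions, not the code from the talk: `DummyModel`, `load_model`, `fetch_and_decode`, `batches`, and `predict_partition` are hypothetical stand-ins so that the example runs without Spark, MXNet, or S3 access.

```python
# Sketch of a per-partition inference function for use with mapPartitions.
# DummyModel, load_model and fetch_and_decode are hypothetical stand-ins,
# not part of MXNet or Spark.
from itertools import islice


class DummyModel:
    """Stand-in classifier: 'predicts' the length of each decoded image."""

    def predict(self, batch):
        return [len(image) for image in batch]


def load_model():
    """Hypothetical loader; a real one would load pre-trained weights."""
    return DummyModel()


def fetch_and_decode(key):
    """Hypothetical: a real version would download the object stored under
    `key` and decode it to a numpy array."""
    return list(key.encode("utf-8"))


def batches(iterator, batch_size):
    """Yield lists of up to batch_size items from an iterator."""
    iterator = iter(iterator)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def predict_partition(keys, batch_size=2):
    """Run inference over one partition of keys."""
    model = load_model()  # initialized once per partition, not once per image
    for key_batch in batches(keys, batch_size):
        images = [fetch_and_decode(k) for k in key_batch]
        for key, label in zip(key_batch, model.predict(images)):
            yield key, label


if __name__ == "__main__":
    keys = ["img/0001.png", "img/02.png", "img/3.png"]
    # With PySpark, the driver side would look roughly like:
    #   sc.parallelize(keys, num_partitions).mapPartitions(predict_partition).collect()
    print(list(predict_partition(iter(keys))))
```

Loading the model inside the partition function rather than inside a per-record `map` is what keeps the expensive initialization to one call per partition.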

18. MXNet + Spark for Inference. On the driver

19. On the executor

20. Summary
• Overview of Deep Learning
  o How Deep Learning works and why Deep Learning is a big deal.
  o Phases of Deep Learning
  o Types of Learning
• Apache MXNet – Efficient deep learning library
  o NDArray/Symbol/Module
• Apache MXNet and Spark for distributed inference.

21. What’s Next?
• Released simplified Scala Inference APIs (v1.2.0)
  o Available on Maven: org.apache.mxnet
• Working on Java APIs for Inference.
• DataFrame support is under consideration.
• The MXNet community is evolving fast; join hands to democratize AI.

22. Resources/References
• https://github.com/apache/incubator-mxnet
• Blog – Distributed Inference using MXNet and Spark
• Distributed Inference code sample on GitHub
• Apache MXNet Gluon Tutorials
• Apache MXNet – Flexible and efficient deep learning.
• The Deep Learning Book
• MXNet – Using pre-trained models
• Amazon Elastic MapReduce

23. Thank You nswamy@apache.org