Building Deep Learning Applications on Big Data Platforms with Intel Analytics Zoo
Outline
• Motivation and introduction
• Highlighted features and examples
• Real-world applications
• Conclusion
Trends & Motivations
• Data scale is driving deep learning progress.
• Huge gap exists between deep learning and big data communities.

[Figure: The chasm between deep learning experts and average users (big data users, data scientists, analysts, etc.)]
"Machine Learning Yearning", Andrew Ng, 2016
Trends & Motivations
• Real-world ML/DL systems are complex big data analytics pipelines.
• Apache Hadoop and Spark drive big data solutions.

"Hidden Technical Debt in Machine Learning Systems", Google, NIPS 2015 paper
BigDL: Bringing Deep Learning to Big Data Platforms
• Distributed deep learning framework for Apache Spark*
• Make deep learning more accessible to big data users and data scientists:
• Write deep learning applications as standard Spark programs.
• Run on existing Spark/Hadoop clusters (no changes needed).
• Feature parity with popular deep learning frameworks:
• E.g., Caffe, Torch, TensorFlow, etc.
• High performance (on CPU):
• Powered by Intel MKL and multi-threaded programming.
• Efficient scale-out:
• Leveraging Spark for distributed training & inference.
[Figure: BigDL in the Spark stack, alongside DataFrame, ML Pipeline, SQL, SparkR, Streaming, MLlib and GraphX on top of Spark Core]
https://github.com/intel-analytics/BigDL
https://bigdl-project.github.io/
Analytics Zoo: Unified Analytics + AI Platform for Big Data
Distributed TensorFlow, Keras and BigDL on Spark
Reference Use Cases: Anomaly detection, sentiment analysis, fraud detection, image generation, chatbot, etc.
Built-in Deep Learning Models: Image classification, object detection, text classification, text matching, sequence-to-sequence, recommendations, anomaly detection, etc.
Feature Engineering: Feature transformations for image, 3D image, text, time series, speech, etc.
High-Level Pipeline APIs:
• Distributed TensorFlow and Keras on Spark.
• Native support for transfer learning, Spark DataFrame and ML Pipelines.
• Model serving API for inference pipelines.
Backends: Spark, TensorFlow, Keras, BigDL, MKL-DNN, etc.
https://github.com/intel-analytics/analytics-zoo/
https://analytics-zoo.github.io/
Keras-Style API
Use Keras-Style API to create an Analytics Zoo model and train, evaluate or tune it in a distributed fashion.

from zoo.pipeline.api.keras.models import Sequential
from zoo.pipeline.api.keras.layers import *

model = Sequential()
model.add(Reshape((1, 28, 28), input_shape=(28, 28, 1)))
model.add(Convolution2D(6, 5, 5, activation="tanh", name="conv1_5x5"))
model.add(MaxPooling2D())
model.add(Convolution2D(12, 5, 5, activation="tanh", name="conv2_5x5"))
model.add(MaxPooling2D())
model.add(Flatten())
model.add(Dense(100, activation="tanh", name="fc1"))
model.add(Dense(10, activation="softmax", name="fc2"))
model.compile(optimizer="adadelta", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.set_tensorboard(log_dir, app_name)
model.set_checkpoint(path)
model.fit(x, y=None, batch_size=batch, nb_epoch=epochs, validation_data=validation_data)
model.predict(x, batch)
model.predict_classes(x, batch)
model.evaluate(x, y=None, batch_size=batch)
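To see why the first Dense layer in the model above receives the input size it does, the layer shapes can be traced by hand. The following is a minimal plain-Python sketch, not part of Analytics Zoo. It assumes the convolutions use stride 1 with no padding and that MaxPooling2D uses a non-overlapping 2x2 window (the usual Keras 1.x defaults); the names pool1 and pool2 are hypothetical labels for the unnamed pooling layers.

```python
def conv2d_shape(shape, filters, kernel):
    """Output shape of a stride-1, unpadded 2D convolution in (C, H, W) layout."""
    _, height, width = shape
    return (filters, height - kernel + 1, width - kernel + 1)


def max_pool_shape(shape, pool=2):
    """Output shape of non-overlapping pooling with a pool x pool window."""
    channels, height, width = shape
    return (channels, height // pool, width // pool)


# Layer order mirrors the Sequential model; pool1/pool2 are hypothetical names.
layers = [
    ("conv1_5x5", lambda s: conv2d_shape(s, 6, 5)),
    ("pool1", max_pool_shape),
    ("conv2_5x5", lambda s: conv2d_shape(s, 12, 5)),
    ("pool2", max_pool_shape),
]

shape = (1, 28, 28)  # shape after Reshape((1, 28, 28))
trace = {"reshape": shape}
for name, shape_fn in layers:
    shape = shape_fn(shape)
    trace[name] = shape

# Flatten multiplies the remaining dimensions together.
flattened = shape[0] * shape[1] * shape[2]

for name, layer_shape in trace.items():
    print(name, layer_shape)
print("flatten", flattened)  # 192 inputs reach the fc1 layer
```

Under these assumptions the 28x28 input shrinks to 24, 12, 8 and then 4 pixels per side, so fc1 sees 12 * 4 * 4 = 192 features.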
Autograd API
Autograd API provides automatic differentiation for math operations to easily define custom layers or losses.

import zoo.pipeline.api.autograd as A

log = A.log(in_node + 1.0)
dot = A.batch_dot(embed1, embed2, axes=[2, 2])

from zoo.pipeline.api.autograd import *

def mean_absolute_error(y_true, y_pred):
    result = mean(abs(y_true - y_pred), axis=1)
    return result
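As a reference for what the custom loss above computes, here is a plain-Python sketch of the same arithmetic on nested lists. It is only an illustration of the formula (mean of absolute differences along axis 1); it does not use the autograd API and performs no differentiation.

```python
def mean_absolute_error(y_true, y_pred):
    """Per-sample mean of |y_true - y_pred| taken over axis 1."""
    return [
        sum(abs(t - p) for t, p in zip(true_row, pred_row)) / len(true_row)
        for true_row, pred_row in zip(y_true, y_pred)
    ]


# Two samples with three values each (made-up numbers for illustration).
y_true = [[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]]
y_pred = [[1.5, 2.0, 2.0], [1.0, -1.0, 4.0]]

losses = mean_absolute_error(y_true, y_pred)
print(losses)  # [0.5, 2.0]
```

The result has one value per sample, which is what reducing along axis=1 means for a (batch, features) input.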
Transfer Learning API
Use transfer learning APIs to easily customize pre-trained models for feature extraction or fine-tuning:

from zoo.pipeline.api.net import *
from zoo.pipeline.api.keras.layers import Dense, Input, Flatten
from zoo.pipeline.api.keras.models import Model

# Load a pre-trained inception model
full_model = Net.load_caffe(def_path, model_path)
# Remove the last few layers
model = full_model.new_graph(outputs=["pool5/drop_7x7_s1"])
# Freeze the first few layers
model.freeze_up_to(["pool4/3x3_s2"])
# Append a few layers
input = Input(shape=(3, 224, 224))
inception = model.to_keras()(input)
flatten = Flatten()(inception)
logits = Dense(2)(flatten)
new_model = Model(input, logits)
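The three steps above (truncate, freeze, append) can be pictured with a toy model in plain Python. This is a hedged sketch only: it treats the network as a linear list of layer names, whereas the real model is a graph, and the helper functions and the abbreviated layer list are hypothetical stand-ins, not the Analytics Zoo implementation.

```python
def truncate_at(layers, output_name):
    """Keep layers up to and including output_name (linear stand-in for new_graph)."""
    return layers[: layers.index(output_name) + 1]


def freeze_up_to(layers, name):
    """Return a trainable flag per layer; layers up to and including name are frozen."""
    cut = layers.index(name)
    return {layer: position > cut for position, layer in enumerate(layers)}


# Abbreviated, illustrative layer list for an Inception-style network.
full_model = [
    "conv1/7x7_s2",
    "pool4/3x3_s2",
    "inception_5b/output",
    "pool5/drop_7x7_s1",
    "loss3/classifier",
]

backbone = truncate_at(full_model, "pool5/drop_7x7_s1")  # drop the old classifier
trainable = freeze_up_to(backbone, "pool4/3x3_s2")       # freeze the early layers

# Append the new head; its layers are trainable by construction.
new_layers = ["flatten", "dense_2"]
new_model = backbone + new_layers
trainable.update({layer: True for layer in new_layers})

for layer in new_model:
    print(layer, "trainable" if trainable[layer] else "frozen")
```

Only the layers after pool4/3x3_s2 plus the new head would be updated during fine-tuning in this picture, which is what keeps training cheap when the labeled data set is small.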
Working with Image
1. Read images into local or distributed ImageSet

from zoo.common.nncontext import init_nncontext
from zoo.feature.image import *

sc = init_nncontext()
local_image_set = ImageSet.read(image_path)
distributed_image_set = ImageSet.read(image_path, sc, 2)

2. Image augmentations using built-in ImageProcessing operations

transformer = ChainedPreprocessing([ImageBytesToMat(),
                                    ImageColorJitter(),
                                    ImageExpand(max_expand_ratio=2.0),
                                    ImageResize(300, 300, -1),
                                    ImageHFlip()])
transformed_local_image_set = transformer(local_image_set)
transformed_distributed_image_set = transformer(distributed_image_set)

Image Augmentations Using Built-in Image Transformations (w/ OpenCV on Spark)
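Conceptually, a chained preprocessing is function composition: each step takes an image and returns an image, and the chain applies the steps in order. The sketch below illustrates that idea in plain Python on a tiny grayscale image stored as nested lists. The chain, hflip and resize_nearest helpers are hypothetical and are not the Analytics Zoo operations, which run on OpenCV matrices.

```python
from functools import reduce


def chain(*steps):
    """Compose steps left to right into a single callable."""
    return lambda image: reduce(lambda current, step: step(current), steps, image)


def hflip(image):
    """Mirror every row of the image horizontally."""
    return [list(reversed(row)) for row in image]


def resize_nearest(height, width):
    """Build a step that resizes to height x width by nearest-neighbor sampling."""
    def apply(image):
        src_h, src_w = len(image), len(image[0])
        return [
            [image[r * src_h // height][c * src_w // width] for c in range(width)]
            for r in range(height)
        ]
    return apply


transformer = chain(resize_nearest(2, 2), hflip)

image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
result = transformer(image)
print(result)  # [[3, 1], [11, 9]]
```

Because the transformer is just a callable, the same object can be applied to one image or mapped over every image of a collection, which mirrors how one transformer is applied to both the local and the distributed ImageSet above.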
Working with Text
1. Read text into local or distributed TextSet

from zoo.common.nncontext import init_nncontext
from zoo.feature.text import *

sc = init_nncontext()
local_text_set = TextSet.read(text_path)
distributed_text_set = TextSet.read(text_path, sc, 2)

2. Build text transformation pipeline using built-in operations

# sequence_length is the fixed number of tokens to keep per text
transformed_text_set = distributed_text_set.tokenize() \
                                           .normalize() \
                                           .word2idx() \
                                           .shape_sequence(sequence_length)
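What the four text operations do to the data can be sketched in plain Python. This is an illustration under stated assumptions, not the TextSet implementation: tokens are split on whitespace, normalization lower-cases and strips non-alphanumeric characters, word indices start at 1 in order of descending frequency (0 is kept for padding), and sequences are truncated from the front and padded at the end. The library's actual defaults may differ.

```python
from collections import Counter


def tokenize(text):
    """Split a text into whitespace-separated tokens."""
    return text.split()


def normalize(tokens):
    """Lower-case tokens and strip non-alphanumeric characters, dropping empties."""
    cleaned = ("".join(ch for ch in tok.lower() if ch.isalnum()) for tok in tokens)
    return [tok for tok in cleaned if tok]


def build_word_index(token_lists):
    """Assign indices from 1 by descending frequency (ties broken alphabetically)."""
    counts = Counter(tok for tokens in token_lists for tok in tokens)
    ordered = sorted(counts, key=lambda word: (-counts[word], word))
    return {word: position + 1 for position, word in enumerate(ordered)}


def shape_sequence(indices, length, pad=0):
    """Keep the last `length` indices, or pad at the end to reach `length`."""
    if len(indices) >= length:
        return indices[len(indices) - length:]
    return indices + [pad] * (length - len(indices))


texts = ["Spark is fast.", "Deep learning on Spark is fun!"]
tokens = [normalize(tokenize(text)) for text in texts]
word_index = build_word_index(tokens)
sequences = [shape_sequence([word_index[w] for w in toks], 4) for toks in tokens]

print(word_index)
print(sequences)  # [[2, 1, 4, 0], [7, 2, 1, 5]]
```

After this step every text is a fixed-length list of integers, which is the form an embedding layer expects.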
nnframes Native DL support in Spark DataFrames and ML Pipelines
1. Initialize NNContext and load images into DataFrames using NNImageReader

from zoo.common.nncontext import *
from zoo.pipeline.nnframes import *

sc = init_nncontext()
imageDF = NNImageReader.readImages(image_path, sc)

2. Process loaded data using DataFrame transformations

from pyspark.sql.functions import col, udf

getName = udf(lambda row: ...)
df = imageDF.withColumn("name", getName(col("image")))

3. Process images using built-in feature engineering operations

from zoo.feature.image import *
transformer = ChainedPreprocessing(
    [RowToImageFeature(), ImageChannelNormalize(123.0, 117.0, 104.0),
     ImageMatToTensor(), ImageFeatureToTensor()])
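The numeric effect of the normalization and tensor conversion steps can be illustrated in plain Python. This is a rough sketch under assumptions: the image is a nested list in height x width x channel layout, the three means are subtracted channel by channel in the same order as the pixel channels, no scaling by a standard deviation is applied, and the tensor layout is channel x height x width. The helper names are hypothetical.

```python
def channel_normalize(image_hwc, means):
    """Subtract a per-channel mean from every pixel of an H x W x C image."""
    return [
        [[value - mean for value, mean in zip(pixel, means)] for pixel in row]
        for row in image_hwc
    ]


def hwc_to_chw(image_hwc):
    """Rearrange an H x W x C image into C x H x W layout."""
    channels = len(image_hwc[0][0])
    return [
        [[pixel[c] for pixel in row] for row in image_hwc]
        for c in range(channels)
    ]


# A 1 x 2 image with three channels (made-up pixel values).
image = [[[123.0, 117.0, 104.0], [133.0, 117.0, 100.0]]]

normalized = channel_normalize(image, (123.0, 117.0, 104.0))
tensor = hwc_to_chw(normalized)
print(tensor)  # [[[0.0, 10.0]], [[0.0, 0.0]], [[0.0, -4.0]]]
```

Centering each channel around zero in this way is a common preparation step before feeding images to a convolutional network.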
4. Define model using Keras-style API

from zoo.pipeline.api.keras.layers import *
from zoo.pipeline.api.keras.models import *

model = Sequential() \
    .add(Convolution2D(32, 3, 3, activation='relu', input_shape=(1, 28, 28))) \
    .add(MaxPooling2D(pool_size=(2, 2))) \
    .add(Flatten()).add(Dense(10, activation='softmax'))

5. Train model using Spark ML Pipelines

from bigdl.nn.criterion import CrossEntropyCriterion

estimator = NNEstimator(model, CrossEntropyCriterion(), transformer) \
    .setLearningRate(0.003).setBatchSize(40).setMaxEpoch(1) \
    .setFeaturesCol("image").setCachingSample(False)
nnModel = estimator.fit(df)
Distributed TensorFlow on Spark in Analytics Zoo
1. Data wrangling and analysis using PySpark

from zoo.common.nncontext import init_nncontext
from zoo.pipeline.api.net import TFDataset

sc = init_nncontext()

# Each record in the train_rdd consists of a list of NumPy ndarrays
train_rdd = sc.parallelize(file_list) \
    .map(lambda x: read_image_and_label(x)) \
    .map(lambda image_label: decode_to_ndarrays(image_label))

# TFDataset represents a distributed set of elements,
# in which each element contains one or more TensorFlow Tensor objects.
dataset = TFDataset.from_rdd(train_rdd,
                             names=["features", "labels"],
                             shapes=[[28, 28, 1], [1]],
                             types=[tf.float32, tf.int32],
                             batch_size=…)
2. Deep learning model development using TensorFlow

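import tensorflow as tf
from nets import lenet  # LeNet definition from the TF-Slim model library; must be on the Python path
slim = tf.contrib.slim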

images, labels = dataset.tensors
labels = tf.squeeze(labels)
with slim.arg_scope(lenet.lenet_arg_scope()):
    logits, end_points = lenet.lenet(images, num_classes=10, is_training=True)

loss = tf.reduce_mean(tf.losses.sparse_softmax_cross_entropy(logits=logits, labels=labels))
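For readers unfamiliar with the loss used above, sparse softmax cross-entropy for one sample is the negative log of the softmax probability assigned to the true class, and the batch loss is the mean over samples. The plain-Python sketch below shows that arithmetic with made-up logits. It is an illustration of the formula, not TensorFlow's implementation.

```python
import math


def sparse_softmax_cross_entropy(logits, label):
    """-log(softmax(logits)[label]), using the log-sum-exp trick for stability."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_sum_exp - logits[label]


def mean_loss(batch_logits, labels):
    """Average the per-sample losses over the batch."""
    losses = [
        sparse_softmax_cross_entropy(logits, label)
        for logits, label in zip(batch_logits, labels)
    ]
    return sum(losses) / len(losses)


# Two samples, two classes; labels are integer class ids ("sparse" labels).
batch_logits = [[0.0, 0.0], [2.0, 0.0]]
labels = [0, 1]

loss = mean_loss(batch_logits, labels)
print(round(loss, 4))
```

The first sample is undecided between the classes, so its loss is log(2). The second puts most of its probability on the wrong class, so its loss is larger.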
3. Distributed training on Spark and BigDL

from bigdl.optim.optimizer import MaxIteration, Adam, MaxEpoch, TrainSummary
from zoo.pipeline.api.net import TFOptimizer

optimizer = TFOptimizer.from_loss(loss, Adam(1e-3))
optimizer.set_train_summary(TrainSummary("/tmp/az_lenet", "lenet"))
optimizer.optimize(end_trigger=MaxEpoch(5))
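The general idea behind distributed training on Spark is synchronous data-parallel SGD: each worker computes a gradient on its own data partition, the gradients are averaged, and every worker applies the same update. The toy below simulates that loop in plain Python for a one-parameter model y = w * x. It is a simplified sketch of the concept, with made-up data and hypothetical helper names; the actual parameter synchronization in BigDL is more elaborate.

```python
def gradient(w, partition):
    """Gradient of the mean squared error of y = w * x on one data partition."""
    return sum(2 * (w * x - y) * x for x, y in partition) / len(partition)


def train(partitions, lr=0.05, iterations=200):
    """Run synchronous data-parallel gradient descent over the partitions."""
    w = 0.0
    for _ in range(iterations):
        # Each worker computes a gradient on its own partition
        # (sequential here; parallel tasks on a real cluster).
        grads = [gradient(w, partition) for partition in partitions]
        # The gradients are averaged and one shared update is applied.
        w -= lr * sum(grads) / len(grads)
    return w


# Two partitions of (x, y) pairs generated from y = 3 * x.
partitions = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(3.0, 9.0), (4.0, 12.0)],
]

w = train(partitions)
print(round(w, 6))  # 3.0
```

Because every worker sees the same averaged gradient, all model replicas stay identical after each iteration, which is what makes the result independent of how the data is partitioned across the cluster.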

4. For Keras users

optimizer = TFOptimizer.from_keras(keras_model, dataset)
optimizer.optimize(end_trigger=MaxEpoch(5))

from zoo.pipeline.api.net import TFPredictor

predictor = TFPredictor.from_keras(keras_model, dataset)
predictions_rdd = predictor.predict()
Model Interoperability Support
Load existing TensorFlow, Keras, Caffe, Torch, ONNX models:
• Useful for inference and model fine-tuning.
• Allows for transition from single-node to distributed application deployment.
• Allows for model sharing between data scientists and production engineers.

from zoo.pipeline.api.net import Net

Net.load_tf(path, inputs=None, outputs=None, byte_order="little_endian", bin_file=None)
Net.load_keras(hdf5_path, json_path=None, by_name=False)
Net.load_caffe(def_path, model_path)
Net.load_torch(path)
POJO-Style Model Serving API

import com.intel.analytics.zoo.pipeline.inference.AbstractInferenceModel;

public class TextClassification extends AbstractInferenceModel {
    public TextClassification(int concurrentNum) {
        super(concurrentNum);
    }
    ...
}

public class ServingExample {
    public static void main(String[] args) throws IOException {
        TextClassification model = new TextClassification();
        model.load(modelPath, weightPath);
        texts = …
        List<JTensor> inputs = preprocess(texts);
        for (JTensor input : inputs) {
            List<Float> result = model.predict(input.getData(), input.getShape());
            ...
        }
    }
}
Built-in Deep Learning Models
• Object detection (SSD, Faster-RCNN, etc.)
• Image classification (VGG, Inception, ResNet, MobileNet, etc.)
• Text classification (Using CNN, LSTM, etc.)
• Text matching (For either ranking or classification.)
• Sequence-to-sequence (Encoder-decoder structure using RNN.)
• Recommendation (Neural Collaborative Filtering, Wide and Deep Learning, etc.)
• Anomaly detection (Unsupervised time series anomaly detection using LSTM.)
Object Detection API
1. Load pretrained model in Detection Model Zoo

from zoo.common.nncontext import *
from zoo.models.image.objectdetection import *

sc = init_nncontext()
model = ObjectDetector.load_model(model_path)

2. Off-the-shelf inference using the loaded model

image_set = ImageSet.read(img_path, sc)
output = model.predict_image_set(image_set)

3. Visualize the results using utility methods

config = model.get_config()
visualizer = Visualizer(config.label_map(), encoding="jpg")
visualized = visualizer(output).get_image(to_chw=False).collect()

Off-the-shelf Inference Using Analytics Zoo Object Detection API
https://github.com/intel-analytics/analytics-zoo/tree/master/pyzoo/zoo/examples/objectdetection
Reference Use Cases
• Anomaly Detection
• Using LSTM network to detect anomalies in time series data.
• Fraud Detection
• Using feed-forward neural network to detect fraud in credit card transaction data.
• Recommendation
• Use Analytics Zoo Recommendation API on data with explicit feedback.
• Sentiment Analysis
• Using neural network models (e.g. CNN, GRU, Bi-LSTM) to analyze user feedback.
• Variational Autoencoder
• Using VAE to generate faces and handwritten digits.
• Web Services
• Using Analytics Zoo model serving APIs for model inference in web servers.
https://github.com/intel-analytics/analytics-zoo/tree/master/apps
Object Detection and Image Feature Extraction at JD.com
[Figure: Similar image search, showing query images alongside their search results]
Source: "Bringing deep learning into big data analytics using BigDL", Xianyan Jia and Zhenhua Wang, Strata Data Conference Singapore 2017
Challenges of Productionizing Large-Scale Deep Learning Solutions
• Very complex and error-prone in managing large-scale distributed systems
• E.g., resource management and allocation, data partitioning, task balance, fault tolerance, model deployment, etc.
• Low end-to-end performance in GPU solutions
• E.g., reading images out from HBase takes about half of the total time
• Very inefficient to develop the end-to-end processing pipeline
• E.g., image
Production Deployment with Analytics Zoo for Spark and BigDL
• Reuse existing Hadoop/Spark clusters for deep learning with no changes (image search, IP protection, etc.)
• Efficiently scale out on Spark with superior performance (3.83x speed-up vs. GPU servers) as benchmarked by JD
http://mp.weixin.qq.com/s/xUCkzbHK4K06-v5qUsaNQQ
https://software.intel.com/en-us/articles/building-large-scale-image-feature-extraction-with-bigdl-at-jdcom
NLP Based Customer Service Chatbot for Microsoft Azure
https://software.intel.com/en-us/articles/use-analytics-zoo-to-inject-ai-into-customer-service-platforms-on-microsoft-azure-part-1
Real-World Applications
Industrial Inspection Platform in Midea* and KUKA*
https://software.intel.com/en-us/articles/industrial-inspection-platform-in-midea-and-kuka-using-distributed-tensorflow-on-analytics
Image Similarity Based House Recommendation for MLSlistings*
https://software.intel.com/en-us/articles/using-bigdl-to-build-image-similarity-based-house-recommendations
Building users-items propensity models for Mastercard* AI Service
https://software.intel.com/en-us/articles/deep-learning-with-analytic-zoo-optimizes-mastercard-recommender-ai-service
Job Recommendations in Jobs2Career*
https://software.intel.com/en-us/articles/talroo-uses-analytics-zoo-and-aws-to-leverage-deep-learning-for-job-recommendations
Unified Analytics + AI Platform
Distributed TensorFlow, Keras and BigDL on Apache Spark
https://github.com/intel-analytics/analytics-zoo
