Burger King Uses RayOnSpark for Fast Food Recommendation Based on Real-Time Context Features

Analytics Zoo community · Published 4 years ago · Topic: Information Technology

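In fast food recommendation, a user's real-time ordering behavior and contextual features such as time, weather, and location are all important signals for making suitable recommendations. At Burger King, we developed a new recommendation model, Transformer Cross Transformer (TxT), which uses multiple Transformer encoders to capture users' ordering behavior and complex context features, and combines the Transformer outputs through an element-wise product (latent cross) to generate recommendations. Online A/B tests show that TxT outperforms our other existing recommendation models, and the model can also be applied successfully to other recommendation scenarios.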

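In addition, using the RayOnSpark feature of Analytics Zoo, we built a complete end-to-end recommendation system with Ray, Apache Spark, and Apache MXNet. It integrates data processing (with Spark) and distributed training (with MXNet and Ray) into a unified data analytics and AI pipeline that runs directly on the same big data cluster where the data is stored. We have deployed this recommendation system at Burger King, where it has delivered strong results in production.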

1. AI on Big Data: Accelerating Data Analytics + AI Solutions At Scale
▪ BigDL: Distributed, High-Performance Deep Learning Framework for Apache Spark
  https://github.com/intel-analytics/bigdl
▪ Analytics Zoo: Unified Analytics + AI Platform; Distributed TensorFlow, Keras and PyTorch on Apache Spark/Flink & Ray
  https://github.com/intel-analytics/analytics-zoo

2. Context-aware Fast Food Recommendation with RayOnSpark at Burger King
Luyang Wang, Burger King Corporation
Kai Huang, Intel Corporation

3. Agenda
Luyang Wang
▪ Food recommendation use case
▪ TxT model in detail
Kai Huang
▪ AI on big data
▪ Distributed training pipeline with RayOnSpark

4. Food Recommendation Use Case

5. Food Recommendation Use Case
[Figure: order flow. Labels as shown on the slide: Guest arrives, ODMB, Checks Menu Board, Cashier enters order, Checks Menu Board, Guest completes order]

7. Use Case Challenges
Challenges
▪ Lack of user identifiers
▪ Food compatibility within the same session
▪ Other variables in our use case: location, weather, time, etc.
▪ Deployment challenges

8. Use Case Challenges
Solutions
▪ Session-based recommendation model
▪ Able to take complex context features into consideration
▪ Able to be deployed anywhere, on both edge and cloud

9. Transformer Cross Transformer (TxT)

10. TxT Model Overview
Model Components
▪ Sequence Transformer: takes the item order sequence as input
▪ Context Transformer: takes multiple context features as input
▪ Latent Cross Joint Training: element-wise product of both transformer outputs
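
The latent cross step can be illustrated with a minimal pure-Python sketch. Only the element-wise product of the two encoder outputs comes from the slide; the toy vectors, the linear item-scoring layer (`item_weights`), the softmax, and the function names are hypothetical stand-ins chosen for illustration, not the authors' implementation.

```python
import math


def latent_cross(seq_vec, ctx_vec):
    """Element-wise product of the two encoder outputs (must be same length)."""
    if len(seq_vec) != len(ctx_vec):
        raise ValueError("encoder outputs must have the same dimension")
    return [s * c for s, c in zip(seq_vec, ctx_vec)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(x - m) for x in scores]
    total = sum(exps)
    return [e / total for e in exps]


def recommend(seq_vec, ctx_vec, item_weights, k=3):
    """Score every item against the crossed vector; return top-k ids and probabilities.

    item_weights is an assumed linear output layer (one weight row per item).
    """
    crossed = latent_cross(seq_vec, ctx_vec)
    logits = [sum(w * x for w, x in zip(row, crossed)) for row in item_weights]
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return ranked[:k], probs


# Toy stand-ins for the Sequence Transformer and Context Transformer outputs.
seq_out = [0.5, -1.0, 2.0]
ctx_out = [2.0, 1.0, 1.5]
# Hypothetical output-layer weights for four menu items.
item_weights = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
]

top_items, probs = recommend(seq_out, ctx_out, item_weights, k=2)
print(top_items)
```

Because the context vector multiplies the sequence vector dimension by dimension, the same order history can yield different rankings under different contexts, which is the point of crossing the two encoders rather than using the sequence alone.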

11. Model Comparison
[Figure: TxT vs. RNN Latent Cross]

12. Offline Evaluation
[Figure: Offline Training Loss]

Offline Training Result
Model              Top1 Accuracy   Top3 Accuracy
RNN                29.98%          46.24%
Contextual ItemCF  32.18%          48.37%
RNN Latent Cross   33.10%          49.98%
TxT                34.52%          52.37%
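
For readers unfamiliar with the Top1 and Top3 metrics, the sketch below shows one common way to compute top-k accuracy: a prediction counts as a hit when the true item is among the k highest-scoring items. The scores and labels are made-up toy data, and `top_k_accuracy` is a hypothetical helper; the slides do not show how the reported numbers were computed.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of rows whose true label is among the k highest-scoring items."""
    if not labels:
        raise ValueError("labels must not be empty")
    hits = 0
    for row, label in zip(scores, labels):
        # Item indices sorted by descending score; keep the first k.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if label in top_k:
            hits += 1
    return hits / len(labels)


# Toy model scores for four sessions over four candidate items.
scores = [
    [0.1, 0.7, 0.2, 0.0],  # true item 1: ranked first
    [0.4, 0.1, 0.3, 0.2],  # true item 2: ranked second
    [0.5, 0.2, 0.2, 0.1],  # true item 3: ranked last
    [0.2, 0.2, 0.5, 0.1],  # true item 2: ranked first
]
labels = [1, 2, 3, 2]

print(top_k_accuracy(scores, labels, 1))
print(top_k_accuracy(scores, labels, 3))
```

Top3 accuracy is always at least as high as Top1 accuracy, which matches the pattern in the table above.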

13. Online Performance
Inference Performance
[Figure: bar chart of Inference Latency (ms) for RNN Latent Cross and TxT; data labels 20 and 18]

A/B Testing Result
Model                       Conversion Rate Gain   Add-on Sales Gain
RNN Latent Cross (control)  -                      -
TxT                         +7.5%                  +4.7%

14. Model Training Architecture
[Figure: previous vs. current training architecture]

15. AI on Big Data

16. AI on Big Data
Accelerating Data Analytics + AI Solutions At Scale
▪ BigDL: Distributed Deep Learning Framework for Apache Spark
  https://github.com/intel-analytics/BigDL
▪ Analytics Zoo: Distributed TensorFlow, Keras and PyTorch on Apache Spark/Flink & Ray
  https://github.com/intel-analytics/analytics-zoo
▪ We develop Project Orca in Analytics Zoo, based on Spark and Ray, to let users easily scale out a single-node Python notebook across large clusters by providing:
  ▪ Data-parallel preprocessing for Python AI (supporting common Python libraries such as Pandas, NumPy, PIL, TensorFlow Dataset, PyTorch DataLoader, etc.)
  ▪ Sklearn-style APIs for transparently distributed training and inference (supporting TensorFlow, PyTorch, Keras, MXNet, Horovod, etc.)
  https://github.com/intel-analytics/analytics-zoo/tree/master/pyzoo/zoo/orca

17. Ray
Ray is a fast and simple framework for building and running distributed applications.
▪ Ray Core provides an easy Python interface for parallelism by using remote functions and actors.
Ray is packaged with several high-level libraries to accelerate machine learning workloads.
▪ Tune: Scalable Experiment Execution and Hyperparameter Tuning
▪ RLlib: Scalable Reinforcement Learning
▪ RaySGD: Distributed Training Wrappers
▪ https://github.com/ray-project/ray/

18. Distributed Training Pipeline on Big Data

19. RayOnSpark
Seamlessly integrate Ray applications into Spark data processing pipelines.
▪ Runtime cluster environment preparation.
▪ Create a SparkContext on the driver node and use Spark to perform data cleaning, ETL, and preprocessing tasks.
▪ RayContext on the Spark driver launches Ray across the cluster.
▪ Similar to RaySGD, we implement a lightweight shim layer around native MXNet modules for easy deployment on a YARN cluster.
▪ Each MXNet worker takes the local data partition of the Spark RDD or DataFrame from the plasma object store used by Ray.

20. End-to-end Distributed Training Pipeline
Project Orca provides a user-friendly interface for the pipeline.
▪ Minimal code changes and learning effort are needed to scale training from a single node to big data clusters.
▪ The entire pipeline runs on a single cluster. No extra data transfer is needed.

    import mxnet as mx
    from mxnet.gluon.loss import SoftmaxCrossEntropyLoss
    from zoo.orca import init_orca_context
    from zoo.orca.learn.mxnet import Estimator

    # init_orca_context unifies SparkContext and RayContext.
    # The slide elides the node count, cores, and memory; pass them by keyword.
    sc = init_orca_context(cluster_mode="yarn", num_nodes=…, cores=…, memory=…)

    # Use sc to load data and do data preprocessing.

    # train_config, txt (the TxT model), train_rdd, and val_rdd are not defined on the slide.
    mxnet_estimator = Estimator(train_config, model=txt,
                                loss=SoftmaxCrossEntropyLoss(),
                                metrics=[mx.metric.Accuracy(),
                                         mx.metric.TopKAccuracy(3)])
    mxnet_estimator.fit(data=train_rdd, validation_data=val_rdd,
                        epochs=…, batch_size=…)

21. Conclusion
▪ Context-Aware Fast Food Recommendation at Burger King with RayOnSpark
  https://arxiv.org/abs/2010.06197
  https://medium.com/riselab/context-aware-fast-food-recommendation-at-burger-king-with-rayonspark-2e7a6009dd2d
▪ For more details of RayOnSpark: https://www.slidestalk.com/w/217
▪ More information for Analytics Zoo at:
  https://github.com/intel-analytics/analytics-zoo
  https://analytics-zoo.github.io/

22. Unified Analytics + AI Platform
Distributed TensorFlow, Keras and PyTorch on Apache Spark/Flink & Ray
https://github.com/intel-analytics/analytics-zoo
