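Training AI to Play the FIFA Soccer Game with Distributed TensorFlow on Analytics Zoo
In recent years, training AI to play games has been a hot research area because of its potential value for artificial general intelligence research. FIFA, a real-time video game, has complex scenes, requires combining several AI techniques such as image processing and reinforcement learning, and demands real-time responses from agents. This makes it a very good testbed for exploring different types of AI techniques. This talk mainly introduces some of our work on training AI to play the FIFA video game.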
2 . Shengsheng Huang, Shan Yu, Jason Dai
Collaborations with Shanghai Jiao Tong University
3 . Agenda
• Distributed TF on Apache Spark* using Analytics Zoo
• RL Platform for Playing FIFA18
• Playing FIFA18 using Imitation Learning & DRL
• Experimenting with GRF (Google Research Football*)
5 . What is Analytics Zoo
• BigDL: Distributed, High-Performance Deep Learning Framework for Apache Spark (https://github.com/intel-analytics/bigdl)
• Analytics Zoo: Unified Analytics + AI Platform for Distributed TensorFlow, Keras, PyTorch and BigDL on Apache Spark (https://github.com/intel-analytics/analytics-zoo)
Accelerating Data Analytics + AI Solutions At Scale
6 . Integrated Big Data Analytics and AI
Seamless Scaling from Laptop to Production
• Prototype on laptop using sample data
• Experiment on clusters with history data
• Production deployment w/ distributed data pipeline
Production Data Pipeline
• Easily prototype the end-to-end pipeline
• “Zero” code change from laptop to distributed cluster
• Directly access production data without data copy
• Seamlessly deployed on production big data clusters
7 . Analytics Zoo: Unified Big Data Analytics and AI Platform
• Models & Algorithms: Recommendation, Time Series, Computer Vision, NLP
• ML Workflow: AutoML for Time Series, Automatic Cluster Serving
• Integrated Analytics & AI Pipelines: Distributed TensorFlow & PyTorch on Spark, RayOnSpark, Spark Dataframes & ML Pipelines for DL, Model Serving
• Library & Framework: Distributions (Cloudera/Databricks/….), Distributed Analytics (Spark/Flink/Ray/…), DL Frameworks (TF/PyTorch/…), Python Libraries (Numpy/Pandas/…)
https://github.com/intel-analytics/analytics-zoo
8 . Distributed TensorFlow on Spark in Analytics Zoo
Write TensorFlow code inline in a PySpark program:

    # pyspark code
    train_rdd = spark.hadoopFile(…).map(…)
    dataset = TFDataset.from_rdd(train_rdd, …)

    # tensorflow code
    import tensorflow as tf
    slim = tf.contrib.slim
    images, labels = dataset.tensors
    with slim.arg_scope(lenet.lenet_arg_scope()):
        logits, end_points = lenet.lenet(images, …)
    loss = tf.reduce_mean(
        tf.losses.sparse_softmax_cross_entropy(
            logits=logits, labels=labels))

    # distributed training on Spark
    optimizer = TFOptimizer.from_loss(loss, Adam(…))
    optimizer.optimize(end_trigger=MaxEpoch(5))
9 . More Information on Analytics Zoo
• Project website
  • https://github.com/intel-analytics/analytics-zoo
  • https://github.com/intel-analytics/bigdl
• Tutorials
  • CVPR 2018: https://jason-dai.github.io/cvpr2018/
  • AAAI 2019: https://jason-dai.github.io/aaai2019/
• “BigDL: A Distributed Deep Learning Framework for Big Data”
  • In Proceedings of the ACM Symposium on Cloud Computing 2019 (SoCC ’19)
• Use cases
  • Azure, CERN, MasterCard, Office Depot, Tencent, Midea, etc.
  • https://analytics-zoo.github.io/master/#powered-by/
11 . Why FIFA18?
What is FIFA18*?
• A real-time 3D soccer simulation video game by Electronic Arts*
Why FIFA18?
• It’s fun ☺
• It’s challenging
  • Complex (esp. full-court game) and non-deterministic
  • Large action space (16 basic keys w/ combinations)
• Many modes available
  • Full-court, mini-games, skill games, etc.
12 . Shooting Bronze: our experiment environment
Shooting is one of the mini-games in FIFA18; Bronze is the easiest level.
Game mode
• Player & goalkeeper 1v1
• Goal: get a higher score in 44 s
Evaluation
• Single shot: score ≤ 200 for a miss; 200 < score < 1200 for a goal
• Accumulated score after the game
Keyboard control
• A/S/W/D: left/right/up/down
• Space: shoot
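The evaluation rule on this slide can be sketched as a small helper. This is only an illustration: the function names and the sample shot scores are hypothetical, and it assumes that "goal ratio" means goals divided by shots taken, which the slides do not state explicitly. Only the two thresholds (200 and 1200) come from the slide.

```python
from typing import List, Tuple

MISS_MAX_SCORE = 200    # per the slide: a single shot scoring <= 200 is a miss
GOAL_MAX_SCORE = 1200   # per the slide: a goal scores strictly between 200 and 1200


def is_goal(shot_score: float) -> bool:
    """Return True when a single-shot score falls in the goal range."""
    return MISS_MAX_SCORE < shot_score < GOAL_MAX_SCORE


def summarize_game(shot_scores: List[float]) -> Tuple[float, float]:
    """Return (accumulated score, goal ratio) for one game.

    Assumption: goal ratio = number of goals / number of shots.
    """
    if not shot_scores:
        return 0.0, 0.0
    total = float(sum(shot_scores))
    goals = sum(1 for s in shot_scores if is_goal(s))
    return total, goals / len(shot_scores)


# Made-up shot scores, for illustration only.
example_total, example_ratio = summarize_game([150, 800, 1100, 200])
```

With these made-up scores, two of the four shots count as goals, so the sketch reports an accumulated score of 2250 and a goal ratio of 0.5.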
13 . Sequential Decision Making
• Reinforcement Learning: Agent and Environment interact through State, Action and Reward
• Imitation Learning: Expert Demonstrations → State/Action Pairs → Supervised Learning
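The two settings on this slide can be sketched with a toy environment. Everything here is hypothetical and unrelated to the actual FIFA18 platform: `ToyEnv` is a made-up one-dimensional task, `run_episode` shows the state, action, reward loop of reinforcement learning, and `collect_demonstrations` shows how expert play is recorded as state/action pairs for supervised learning.

```python
import random


class ToyEnv:
    """Hypothetical stand-in for a game: walk right from position 0 to reach `goal`."""

    def __init__(self, goal: int = 3, max_steps: int = 20):
        self.goal = goal
        self.max_steps = max_steps
        self.position = 0
        self.steps = 0

    def reset(self) -> int:
        self.position = 0
        self.steps = 0
        return self.position

    def step(self, action: int):
        """Apply an action (0 = left, 1 = right); return (state, reward, done)."""
        self.position = max(0, self.position + (1 if action == 1 else -1))
        self.steps += 1
        reached = self.position >= self.goal
        done = reached or self.steps >= self.max_steps
        reward = 1.0 if reached else 0.0
        return self.position, reward, done


def run_episode(env, policy) -> float:
    """Reinforcement learning loop: observe state, act, receive reward."""
    state, done, total_reward = env.reset(), False, 0.0
    while not done:
        action = policy(state)
        state, reward, done = env.step(action)
        total_reward += reward
    return total_reward


def collect_demonstrations(env, expert_policy):
    """Imitation learning data: record (state, action) pairs from an expert."""
    pairs, state, done = [], env.reset(), False
    while not done:
        action = expert_policy(state)
        pairs.append((state, action))
        state, _, done = env.step(action)
    return pairs


def always_right(state: int) -> int:
    """A trivial 'expert' for the toy task."""
    return 1


def random_policy(state: int) -> int:
    return random.choice([0, 1])
```

The recorded pairs would then serve as inputs and labels for a supervised classifier, while `run_episode` is the loop an RL agent would learn from.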
14 . RL Platform for Playing FIFA18
Experiment platform for RL agents and algorithms for FIFA18
Major components
• Game info collection & interpretation
• Game environment abstraction
• Agent implementation
  • Imitation learning / supervised learning (SL)
  • Reinforcement learning (RL)
  • Hybrid (SL+RL)
15 . End-to-end Workflow
tfpark: Distributed TensorFlow on Spark
17 . Training the Agent Using Imitation Learning
[Diagram: Image → Feature Extractor → Movement Network → Classification Loss; Feature Extractor → Score Network → Regression Loss; Score Detector]
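A minimal sketch of the two losses named on this slide, in plain Python. It assumes the movement head is trained with softmax cross-entropy against the demonstrator's key press and the score head with squared error against a detected score; adding the two with a weight `score_weight` is an assumption made for illustration, since the slide does not say how the losses are combined. All function and parameter names are hypothetical.

```python
import math
from typing import Sequence


def softmax_cross_entropy(logits: Sequence[float], label: int) -> float:
    """Classification loss for the movement head (numerically stable log-sum-exp)."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[label]


def squared_error(predicted: float, target: float) -> float:
    """Regression loss for the score head."""
    return (predicted - target) ** 2


def imitation_loss(move_logits: Sequence[float], expert_move: int,
                   predicted_score: float, detected_score: float,
                   score_weight: float = 1.0) -> float:
    """Assumed combination: classification loss + weighted regression loss."""
    return (softmax_cross_entropy(move_logits, expert_move)
            + score_weight * squared_error(predicted_score, detected_score))
```

For example, four equal movement logits give a classification loss of ln 4, regardless of which key the demonstrator pressed.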
18 . Game Playing (Inference) for Imitation Learning
[Diagram: Image → Feature Extractor → Movement Network; Feature Extractor → Score Network → > threshold?]
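One way to read this inference diagram is sketched below. It assumes the agent presses Space (shoot) whenever the score network's prediction exceeds the threshold and otherwise presses the key with the highest movement probability; the slide shows the threshold check but not this exact rule. The key names follow the keyboard controls listed for Shooting Bronze, and `choose_action` with its parameters is hypothetical.

```python
from typing import Sequence

MOVE_KEYS = ("A", "S", "W", "D")  # movement keys listed on the Shooting Bronze slide
SHOOT_KEY = "SPACE"


def choose_action(move_probs: Sequence[float], predicted_score: float,
                  threshold: float) -> str:
    """Pick a key press from the two network outputs.

    Assumption: shoot when the predicted score exceeds the threshold,
    otherwise follow the movement network's most likely key.
    """
    if predicted_score > threshold:
        return SHOOT_KEY
    best = max(range(len(move_probs)), key=lambda i: move_probs[i])
    return MOVE_KEYS[best]
```

Calling `choose_action([0.1, 0.6, 0.2, 0.1], predicted_score=0.3, threshold=0.5)` selects the second movement key, while a predicted score of 0.9 with the same threshold selects the shoot key.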
19 . Hybrid Approach for Training the Agent
• Movement network: trained with Imitation Learning
• Shoot network: Double DQN
• Learner: DQN loss computed from Q(s, a; θ) (Q network) and max_{a'} Q(s', a'; θ⁻) (target Q network); gradient w.r.t. the loss
• Actor: ε-greedy
  • Exploration (probability ε): counter-based policy
  • Exploitation (probability 1 − ε): argmax_a Q(s, a; θ)
• Environment: receives (s, a), returns s' and r, with reward shaping
• Prioritized experience replay: (s, a, r, s') transitions are saved to the buffer
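The learner and actor on this slide can be sketched in a few lines of plain Python. The target below follows the standard Double DQN formulation (the online Q network selects the next action, the target Q network evaluates it), which may differ in detail from the authors' implementation. The discount factor `gamma`, the function names, and the reading of "counter-based policy" as "try the least-visited action" are assumptions for illustration.

```python
import random
from typing import Sequence


def double_dqn_target(reward: float, next_q_online: Sequence[float],
                      next_q_target: Sequence[float], gamma: float = 0.99,
                      done: bool = False) -> float:
    """Standard Double DQN target: r + gamma * Q_target(s', argmax_a Q_online(s', a))."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


def dqn_loss(q_value: float, target: float) -> float:
    """Squared TD error for a single transition."""
    return (target - q_value) ** 2


def select_action(q_values: Sequence[float], visit_counts: Sequence[int],
                  epsilon: float, rng: random.Random) -> int:
    """Epsilon-greedy actor.

    Assumption: with probability epsilon, explore by taking the least-visited
    action (one possible counter-based policy); otherwise exploit argmax Q.
    """
    if rng.random() < epsilon:
        return min(range(len(visit_counts)), key=lambda a: visit_counts[a])
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

In this sketch the online network picks the action and the target network supplies its value, which is what distinguishes Double DQN from taking the maximum over the target network's own estimates.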
20 . Demo
• https://drive.google.com/file/d/13dBsGOiGbCYOS5TgVAI95Qd-YszAHTW6/view
• https://drive.google.com/file/d/1JVZjlDSyX8YtUy6qOuGRD_VN4RSZw0U8/view
Videos: Human (demonstrator); Imitation Learning (better score than demonstrator)
For more complete information about performance and benchmark results, visit www.intel.com/benchmarks.
21 . Typical trajectory analysis (Hybrid)
[Figure panels: Typical Trajectories; Movement Policy (SL); Shoot Q-Value (RL)]
22 . Results
                              Score      Goal Ratio   Convergence speed
Human  beginner               5846.69    50%          -
Human  master                 10112.78   92%          -
Human  demonstrator           7284.98    84.96%       -
Agent  Imitation Learning     10345.18   92.54%       -
Agent  RL (Policy Gradient)   5606.31    40.25%       1069.5 epochs
Agent  Hybrid                 10514.43   95.59%       749.6 epochs
24 . Google Research Football (GRF)
An open source RL environment for playing soccer from Google Brain
• https://github.com/google-research/football
A great RL environment for playing soccer
• More state and reward info & controls
• Customizable scenarios, players, rewards and observations, etc.
• More useful features such as accelerated speed, self-play, multi-agent, etc.
• Easy to dump traces and replay
Google Research Football: A Novel Reinforcement Learning Environment (https://arxiv.org/abs/1907.11180)
Transfer between FIFA18 and GRF?
25 . Early Experiments on GRF
https://drive.google.com/file/d/1bNO5rpUhCeCZY9zPGgVCzgUlqH9QF39n/view
Trained using PPO from OpenAI* Baselines
26 . Future Work
Ray* support in Analytics Zoo
• E.g., RayOnSpark
• https://medium.com/riselab/rayonspark-running-emerging-ai-applications-on-big-data-clusters-with-ray-and-analytics-zoo-923e0136ed6a
Support for Google Research Football
• E.g., transfer between GRF and FIFA?
Additional algorithms/models and scenarios
• E.g., full-court game
27 . Analytics Zoo on Ali E-MR
• Analytics Zoo is already available out of the box on Ali EMR.*
• For more information and support, contact Wesley. Email: wesley.du@intel.com; DingTalk: [image]
* Version upgrade for Analytics Zoo is on-going.
29 . Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations, and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit intel.com/performance.
Intel does not control or audit the design or implementation of third-party benchmark data or websites referenced in this document. Intel encourages all of its customers to visit the referenced websites or others where similar performance benchmark data are reported and confirm whether the referenced benchmark data are accurate and reflect performance of systems available for purchase.
Optimization notice: Intel’s compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software, or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at intel.com/benchmarks.
Intel, the Intel logo, Intel Inside, the Intel Inside logo, Intel Atom, Intel Core, Iris, Movidius, Myriad, Intel Nervana, OpenVINO, Intel Optane, Stratix, and Xeon are trademarks of Intel Corporation or its subsidiaries in the U.S. and/or other countries.
*Other names and brands may be claimed as the property of others.
© Intel Corporation