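At a high level, autonomous vehicle development can be abstracted into three stages: sensor data collection (Data Logger), model engineering (Data Center), and autonomous driving (Control Unit). Sensors generate input at very high rates (the Udacity Lincoln MKZ, for example, produces up to 3 GB per minute), and this data is uploaded to the data center for model engineering. There, neural-network model training and re-simulation, combined with large-scale sensor data processing, put heavy pressure on the available compute resources. To improve data center compute efficiency, open machine learning platforms built around Apache Spark, TensorFlow, and ROS continue to mature.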

1.Machine Learning for Self-Driving Cars

2.High-level Development Process for Autonomous Vehicles
Agenda
1 Collect sensors data (Data Logger)
2 Model Engineering (Data Center)
3 Autonomous Driving (Control Unit)
Handoffs: Big Data (Data Logger to Data Center), Trained Model (Data Center to Control Unit)

3.High-level Development Process for Autonomous Vehicles: 1 Collect sensors data

4.Sensors: Udacity Lincoln MKZ
+ Camera: 3x Blackfly GigE Camera, 20 Hz
+ Lidar: Velodyne HDL-32E, 9.5 Hz
+ IMU: Xsens, 400 Hz
+ GPS: 2x fixed, 1 Hz
+ CAN bus: 1.1 kHz
+ Robot Operating System
+ Data: 3 GB per minute
https://github.com/udacity/self-driving-car
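To put the 3 GB per minute figure in perspective, the following back-of-the-envelope sketch scales it to longer recordings. The 8-hour drive length and the helper name `recorded_volume_gb` are assumptions made for illustration; they do not come from the slides.

```python
GB_PER_MINUTE = 3  # figure quoted on the slide for the Udacity Lincoln MKZ


def recorded_volume_gb(minutes, gb_per_minute=GB_PER_MINUTE):
    """Return the raw data volume in GB for a recording of the given length."""
    return minutes * gb_per_minute


per_hour_gb = recorded_volume_gb(60)
# Hypothetical 8-hour drive, converted to decimal terabytes (1 TB = 1000 GB).
per_day_tb = recorded_volume_gb(8 * 60) / 1000

print(f"{per_hour_gb} GB per hour")
print(f"{per_day_tb} TB per 8-hour drive")
```

Under these assumptions a single car produces 180 GB per hour, which is why the later slides focus on data center throughput rather than on a single machine.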

5.Robot Operating System
+ Popular open source robotics framework
+ Reliable distributed architecture
+ Wide use in the robotics research community
+ Huge selection of “off-the-shelf” software packages for hardware/algorithms/etc.
+ Used by Bosch, BMW, KUKA, Google, Siemens, etc.
https://roscon.ros.org/2015/presentations/ROSCon-Automated-Driving.pdf

6.Sensors Spec
Sensor | blinding, sunlight, darkness | rain, fog, snow | non-metal objects | wind/high velocity | resolution | range | data
Ultrasonic | yes | yes | yes | no | + | + | +
Lidar | yes | no | yes | yes | +++ | ++ | +
Radar | yes | yes | no | yes | ++ | +++ | +
Camera | no | no | yes | yes | +++ | +++ | +++

7.Real-time Analysis of car data
+ Data Center (Realtime Data Analytics): receive signals, analysis, and machine learning; real-time or batch analysis based on sensors data
+ Realtime publish/subscribe between Data Logger and Data Center
+ Data Logger: pre-select signals, aggregate and prepare for sending; parse traces and signals (dbc, fibex, autosar...)
+ Car Layer: car data from sensors and bus traces; CAN, Flexray, Camera, Radar, Lidar, IMU, etc.

8.High-level Development Process for Autonomous Vehicles: 2 Model Engineering

9.Machine Learning in Robotics
[Diagram: Observations -> State Estimation -> Modeling & Prediction -> Planning -> Controls; f(x): Observations -> Controls]

10.Machine Learning for Autonomous Driving
+ Sensor Fusion: clustering, segmentation, pattern recognition
+ Road: ego-motion, image processing and pattern recognition
+ Localization: simultaneous localization and mapping
+ Situation Understanding: detection and classification
+ Trajectory Planning: motion planning and control
+ Control Strategy: reinforcement and supervised learning
+ Driver Model: image processing and pattern recognition

11.Machine Learning Workflow
[Diagram: Ingest data, Preprocessing, Search / Analysis, Training data and Test data, Train Test Loop (Model Training, Model Testing), Model Deployment, Re-simulation, Results, Reports, Model Feedback Loop]

12.More Data + Bigger Models
[Chart: Accuracy vs. Scale (data size, model size) in the 1990s; curves for neural networks and other approaches]
https://www.scribd.com/document/355752799/Jeff-Dean-s-Lecture-for-YC-AI

13.More Data + Bigger Models + More Computation
[Chart: Accuracy vs. Scale (data size, model size) now, with more compute; curves for neural networks and other approaches]
https://www.scribd.com/document/355752799/Jeff-Dean-s-Lecture-for-YC-AI

14.Train and evaluate machine learning models at scale: from Single machine to Data center
+ How to run more experiments faster and in parallel?
+ How to share and reproduce research?
+ How to go from research to real products?

15.When to use Distributed Machine Learning
[Chart: Model Size vs. Data Size; Single machine region and Data center region]
+ Model parallelism: training very large models; exploring several model architectures, hyper-parameter optimization, training several independent models
+ Data parallelism: speeds up the training
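The data parallelism idea on this slide can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the platform's implementation: the toy linear model, the two shards standing in for workers, and the function names `gradient` and `data_parallel_step` are all hypothetical.

```python
def gradient(w, shard):
    """Mean-squared-error gradient of the model y = w * x on one data shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)


def data_parallel_step(w, shards, lr):
    """One synchronous step: per-shard gradients are averaged, then applied."""
    # In a real cluster each gradient would be computed on a separate worker.
    grads = [gradient(w, shard) for shard in shards]
    sizes = [len(shard) for shard in shards]
    # Weight by shard size so the result equals the full-batch gradient.
    avg = sum(g * n for g, n in zip(grads, sizes)) / sum(sizes)
    return w - lr * avg


data = [(x, 3.0 * x) for x in range(1, 9)]  # toy data, true weight is 3.0
shards = [data[0:4], data[4:8]]             # two hypothetical workers

w = 0.0
for _ in range(200):
    w = data_parallel_step(w, shards, lr=0.01)
print(round(w, 3))
```

Because the averaged gradient matches the single-machine gradient, splitting the data changes the wall-clock time per step but not the result, which is the sense in which data parallelism "speeds up the training".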

16.Compute Workload for Training and Evaluation
[Chart: Compute intensive vs. I/O intensive; Single machine and Data center]

17.I/O Workload for Simulation and Testing
[Chart: Compute intensive vs. I/O intensive; Single machine and Data center]

18.Open Machine Learning Platform
+ Mainly open source
+ No vendor lock-in
+ Scale-out architecture
+ Multi-user support
+ Resource management
+ Job scheduling
+ Speed-up training
+ Speed-up simulation
[Diagram: ML Development & Catalog & REST API (Sample, Model, Prediction, Batch, Regression, Cluster, Dataset, Correlation, Centroid, Anomaly, Test, Scores); ML-Specialists; Search / Analysis, Training / Evaluation, Re-Simulation / Testing; CaffeOnSpark; Compute + Network + Storage; Training & Test data; Deploy model]

19.ROS bag data structure https://github.com/valtech/ros_hadoop
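As a rough illustration of the record layout, the sketch below walks the top-level records of a bag buffer. It assumes the publicly documented ROS bag format 2.0 (a `#ROSBAG V2.0` magic line followed by records, each made of a little-endian 4-byte header length, `name=value` header fields, a 4-byte data length, and the data). It is not the ros_hadoop implementation, it does not decompress or descend into chunk records, and the sample buffer is synthetic rather than a real recording.

```python
import struct

MAGIC = b"#ROSBAG V2.0\n"


def parse_header(raw):
    """Split a record header into a dict of field name -> raw value bytes."""
    fields, pos = {}, 0
    while pos < len(raw):
        (field_len,) = struct.unpack_from("<I", raw, pos)
        pos += 4
        # Each field is "name=value"; the value itself may contain "=" bytes.
        name, _, value = raw[pos:pos + field_len].partition(b"=")
        fields[name.decode("ascii")] = value
        pos += field_len
    return fields


def iter_records(buf):
    """Yield (header_fields, data) for each top-level record after the magic line."""
    if not buf.startswith(MAGIC):
        raise ValueError("not a ROS bag 2.0 buffer")
    pos = len(MAGIC)
    while pos < len(buf):
        (header_len,) = struct.unpack_from("<I", buf, pos)
        header = buf[pos + 4:pos + 4 + header_len]
        pos += 4 + header_len
        (data_len,) = struct.unpack_from("<I", buf, pos)
        data = buf[pos + 4:pos + 4 + data_len]
        pos += 4 + data_len
        yield parse_header(header), data


def make_record(fields, data):
    """Build one record; used here only to create a synthetic example."""
    header = b"".join(
        struct.pack("<I", len(k) + 1 + len(v)) + k + b"=" + v
        for k, v in fields.items()
    )
    return struct.pack("<I", len(header)) + header + struct.pack("<I", len(data)) + data


# Synthetic buffer: one record with op=0x02 (message data) on connection 0.
sample = MAGIC + make_record({b"op": b"\x02", b"conn": struct.pack("<I", 0)}, b"payload")
records = list(iter_records(sample))
print(records)
```

Length-prefixed records like these can be located and split without deserializing the payload, which is the property a Hadoop InputFormat needs in order to divide a large bag file among workers.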

20.Hadoop InputFormat for ROS bags https://github.com/valtech/ros_hadoop

21.Search & Analysis
+ Hadoop InputFormat and Record Reader for Rosbag
+ Process Rosbag with Spark, Yarn, MapReduce, Hadoop Streaming API, …
+ Spark RDDs are cached and optimized for analysis
[Diagram: Ros bag on Computer / Network / Storage; Record Reader; RDD of Ros Msg in the Processing Engine; DataFrame, DataSet, SQL, Spark APIs; NumPy; Advanced Analytics]

22.Training & Evaluation
+ TensorFlow Record Reader
+ Protocol Buffers to serialize records
+ Save time because data conversion is not needed
+ Save storage because data duplication is not needed
[Diagram: Ros bag on Computer / Network / Storage; Record Reader; Ros msg; Training Engine; Machine Learning]

23.Re-Simulation & Testing
+ Use Spark for preprocessing, transformation, cleansing, aggregation, time window selection before publishing to ROS topics
+ Use the Re-Simulation framework of choice to subscribe to the ROS topics
[Diagram: Ros bag on Computer / Network / Storage; Engine publishes to Ros topic; Re-Simulation with framework of choice (core) subscribes]
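The time window selection and aggregation steps named on this slide can be illustrated with a small plain-Python stand-in for what would be a Spark job in practice. The message tuples, the `/speed` topic name, and both function names are hypothetical and chosen only for this sketch.

```python
from collections import defaultdict


def select_window(messages, start, end):
    """Keep messages whose timestamp t satisfies start <= t < end."""
    return [m for m in messages if start <= m[0] < end]


def aggregate_mean(messages, bucket_seconds):
    """Average the values per topic within fixed-size time buckets."""
    sums = defaultdict(lambda: [0.0, 0])  # (topic, bucket) -> [total, count]
    for t, topic, value in messages:
        key = (topic, int(t // bucket_seconds))
        sums[key][0] += value
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


# Hypothetical (timestamp in seconds, topic, value) tuples.
messages = [
    (0.2, "/speed", 10.0),
    (0.7, "/speed", 12.0),
    (1.1, "/speed", 14.0),
    (2.5, "/speed", 20.0),
]
window = select_window(messages, 0.0, 2.0)
aggregated = aggregate_mean(window, bucket_seconds=1.0)
print(aggregated)
```

Reducing the stream to the relevant window and resolution before publishing keeps the load on the re-simulation framework proportional to what the test actually needs.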

24.Time Travel
[Diagram: reduce/shuffle, fold(right), fold(left) along the time axis t]

25.High-level Development Process for Autonomous Vehicles: 3 Autonomous Driving

26.Architecture Building Blocks http://www.bmw-carit.com/downloads/presentations/AutonomousDrivingNeedsROSScript.pdf

27.Hadoop InputFormat for ROS
+ License: Apache License 2.0
+ Download: https://github.com/valtech/ros_hadoop
+ Contact: jan.wiegelmann@valtech.de

28.Thank you