Lessons Learned from Using Spark for Evaluating Road Detection at BMW

Getting cars to drive autonomously is one of the most exciting problems today. A key challenge is making them drive safely, which requires processing large amounts of data. In this talk we focus on a single task of a self-driving car: road detection. Road detection is a software component that must be safe in order to keep the car in its current lane. Tracking the progress of such a component requires a well-designed evaluation pipeline for key performance indicators (KPIs). We show how we use Spark in our pipeline to handle huge amounts of data and to meet strict scalability constraints when gathering the relevant KPIs. We also share several lessons learned from using Spark in this environment.

2.Lessons Learned from Using Spark for Evaluating Road Detection @ BMW Autonomous Driving. Gheorghe Pucea, BMW Group; Jennifer Reinelt, BMW Group. #UnifiedDataAnalytics #SparkAISummit

3.BMW AUTONOMOUS DRIVING

4.Outline
• Evaluation of Lane Detection
• Evaluation Pipeline
• AI Based Ground Truth
• Lessons Learned

5.BMW AUTONOMOUS DRIVING: Car Setup for Autonomous Driving

7.Evaluation of Lane Detection: How well does the car detect the lane markings? [Figure: real lane markings vs. detected lane markings, compared at 150 m, 100 m, 50 m, and 1 m]

8.Evaluation of Lane Detection: How well does the car detect the lane markings? Key Performance Indicator (KPI): lateral offset. [Figure: lateral offset at 150 m improving over functional development time, across commits 70d9c31, c271a01, 4e0bcd3, 6e3bcd3]
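The slides name lateral offset as the KPI but do not define it precisely. As a rough illustration, the sketch below compares a detected lane marking with the ground truth at a chosen distance. The point representation (longitudinal distance x and lateral position y, both in metres), the linear interpolation between samples, the use of the absolute difference, and all names are assumptions made for this example, not the authors' implementation.

```scala
object LateralOffsetSketch {
  // Hypothetical representation: a lane marking sampled as (x, y) points,
  // x = longitudinal distance from the car, y = lateral position, in metres.
  final case class Point(x: Double, y: Double)

  // Linearly interpolate the lateral position at distance x.
  // Returns None if x lies outside the sampled range.
  def lateralAt(marking: Seq[Point], x: Double): Option[Double] = {
    val sorted = marking.sortBy(_.x)
    sorted.zip(sorted.drop(1)).collectFirst {
      case (a, b) if a.x <= x && x <= b.x =>
        if (b.x == a.x) a.y
        else a.y + (b.y - a.y) * (x - a.x) / (b.x - a.x)
    }
  }

  // Assumed KPI: absolute lateral difference between detection and
  // ground truth at distance x, defined only where both are available.
  def lateralOffset(detected: Seq[Point], truth: Seq[Point], x: Double): Option[Double] =
    for {
      d <- lateralAt(detected, x)
      t <- lateralAt(truth, x)
    } yield math.abs(d - t)
}
```

Evaluating `lateralOffset` at 1 m, 50 m, 100 m, and 150 m for each software commit would give one data point per distance and commit, which is the kind of trend the figure on this slide plots.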

9.Evaluation of Lane Detection Challenges:
• Where are the real lane markings? How do we get the ground truth?
• How do we avoid making the same mistakes as the car when looking for real lane markings?
• How do we scale this ground truth generation?
[Figure: real lane markings vs. detected lane markings at 150 m, 100 m, 50 m, and 1 m]

10.Evaluation of Lane Detection: How do we get the ground truth?
• From manual labels: very accurate, but manual, slow, expensive to scale up, and bad for occlusions

11.Evaluation of Lane Detection: How do we get the ground truth?
• From additional sensors: automated, fast, and accurate, but expensive to scale up

12.Evaluation of Lane Detection: How do we get the ground truth?
• Using sophisticated algorithms in the backend: scalable, automated, fast, and cheap, but with lower accuracy

14.Evaluation Pipeline
[Pipeline diagram: Data Collection, Data Ingestion, ROS Converter, Reprocessing, and KPI Calculation, alongside Ground Truth Generation and other applications; data formats and stores shown: ROS bag, ORC, InfluxDB]
Datacenter: > 230 PB capacity and > 1,500 TB raw data/day; > 100,000 cores and > 200 GPUs

16.AI Based Ground Truth [Pipeline diagram and datacenter figures from the Evaluation Pipeline slide, repeated]

17.AI Based Ground Truth: 3D lidar point clouds → lidar intensity in 2D bird's-eye view → deep neural network → semantic segmentation (lane marking / no lane marking)
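The first step on this slide, turning a 3D lidar point cloud into a 2D bird's-eye-view intensity image, can be sketched as a simple rasterization. The slides do not say how the projection is done, so the coordinate frame, the grid extent, the cell size, the choice of keeping the maximum intensity per cell, and all names below are assumptions. The neural network and the segmentation step are not sketched.

```scala
object BirdsEyeViewSketch {
  // Hypothetical lidar return: position in metres (x forward, y left, z up)
  // plus a reflectance intensity.
  final case class LidarPoint(x: Double, y: Double, z: Double, intensity: Double)

  // Rasterize points into a rows x cols grid that covers
  // x in [0, rows * cell) and y in [-cols * cell / 2, cols * cell / 2).
  // Height (z) is dropped; each cell keeps the maximum intensity it receives.
  def toIntensityGrid(
      points: Seq[LidarPoint],
      rows: Int,
      cols: Int,
      cell: Double
  ): Array[Array[Double]] = {
    val grid = Array.fill(rows, cols)(0.0)
    val halfWidth = cols * cell / 2
    for (p <- points) {
      val r = math.floor(p.x / cell).toInt
      val c = math.floor((p.y + halfWidth) / cell).toInt
      // Points outside the grid extent are ignored.
      if (r >= 0 && r < rows && c >= 0 && c < cols)
        grid(r)(c) = math.max(grid(r)(c), p.intensity)
    }
    grid
  }
}
```

A grid like this could be fed to a segmentation network as a single-channel image, with each output pixel classified as lane marking or no lane marking.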

19.Motivation of Lessons Learned. Source: https://twitter.com/bigdataborat?lang=en

20.Motivation of Lessons Learned. Source: https://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf

21.Lessons Learned – Spark Testing [Pipeline diagram and datacenter figures from the Evaluation Pipeline slide, repeated]

22.Lessons Learned – Spark Testing: Typical integration test

23.Lessons Learned – Spark Testing: Drawbacks of static ORC files committed to the source code

24.Lessons Learned – Spark Testing: Test data generation; type class library: Cats

25.Lessons Learned – Spark Testing: Using a test data generation library for integration tests. ScalaCheck generators available with type classes; Cats FlatMap type class

26.Lessons Learned – Spark Testing: Sensor data streams as a Scala ADT
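The code shown on this slide did not survive in the transcript. As a stand-in, here is a minimal sketch of how sensor data streams might be modelled as a Scala ADT (a sealed trait with one case class per message kind). The message types and their fields are hypothetical and chosen only for illustration.

```scala
object SensorAdtSketch {
  // Hypothetical message types; the real schema is not shown in the transcript.
  sealed trait SensorMessage { def timestampNanos: Long }

  final case class CanMessage(timestampNanos: Long, id: Int, payload: Vector[Byte])
      extends SensorMessage
  final case class LidarScan(timestampNanos: Long, pointCount: Int)
      extends SensorMessage
  final case class CameraFrame(timestampNanos: Long, width: Int, height: Int)
      extends SensorMessage

  // Because the trait is sealed, the compiler checks this match for
  // exhaustiveness and warns when a newly added message type is not handled.
  def describe(m: SensorMessage): String = m match {
    case CanMessage(_, id, payload) => s"CAN id=$id bytes=${payload.length}"
    case LidarScan(_, n)            => s"lidar points=$n"
    case CameraFrame(_, w, h)       => s"camera ${w}x${h}"
  }
}
```

Modelling the streams this way is what lets the compiler flag breaking changes in test data, which a static ORC file cannot do.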

27.Lessons Learned – Spark Testing: Example type class for generating CAN messages

28.Lessons Learned – Spark Testing: Implementing the cats.FlatMap type class
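The code on the last two slides is also missing from the transcript. The sketch below illustrates the idea using only the Scala standard library, so that it runs without dependencies: `Gen` is a hand-written stand-in for a ScalaCheck-style generator, `FlatMap` is a reduced stand-in for `cats.FlatMap` (the Cats type class additionally requires `tailRecM`), and `Generator`, `CanMessage`, and `sample` are hypothetical names. It is not the authors' code.

```scala
import scala.util.Random

object GeneratorSketch {
  // Stand-in for a ScalaCheck-style generator: a function from an RNG to a value.
  final case class Gen[A](run: Random => A)

  // Reduced stand-in for the cats.FlatMap type class.
  trait FlatMap[F[_]] {
    def map[A, B](fa: F[A])(f: A => B): F[B]
    def flatMap[A, B](fa: F[A])(f: A => F[B]): F[B]
  }

  // FlatMap instance for Gen: run the first generator, then the one that
  // depends on its result, both against the same RNG.
  implicit val genFlatMap: FlatMap[Gen] = new FlatMap[Gen] {
    def map[A, B](fa: Gen[A])(f: A => B): Gen[B] =
      Gen(r => f(fa.run(r)))
    def flatMap[A, B](fa: Gen[A])(f: A => Gen[B]): Gen[B] =
      Gen(r => f(fa.run(r)).run(r))
  }

  // Syntax so that any F with a FlatMap instance works in for-comprehensions.
  implicit class FlatMapOps[F[_], A](fa: F[A])(implicit F: FlatMap[F]) {
    def map[B](f: A => B): F[B] = F.map(fa)(f)
    def flatMap[B](f: A => F[B]): F[B] = F.flatMap(fa)(f)
  }

  // Hypothetical type class: how to generate test values of type A.
  trait Generator[A] { def gen: Gen[A] }

  // Hypothetical CAN message shape, for illustration only.
  final case class CanMessage(timestampNanos: Long, id: Int, payload: Vector[Byte])

  implicit val canMessageGenerator: Generator[CanMessage] =
    new Generator[CanMessage] {
      val gen: Gen[CanMessage] =
        for {
          ts   <- Gen(r => r.nextInt(1000000).toLong)
          id   <- Gen(r => r.nextInt(0x800)) // 11-bit standard CAN identifier
          size <- Gen(r => r.nextInt(9))     // classic CAN carries 0 to 8 data bytes
          data <- Gen(r => Vector.fill(size)(r.nextInt(256).toByte))
        } yield CanMessage(ts, id, data)
    }

  // Draw one value; the same seed always yields the same value.
  def sample[A](seed: Long)(implicit g: Generator[A]): A =
    g.gen.run(new Random(seed))
}
```

With this in place, `GeneratorSketch.sample[GeneratorSketch.CanMessage](42L)` yields a reproducible test message. The payload generator depends on the previously generated `size`, which is the kind of dependent composition that `flatMap` provides.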

29.Lessons Learned – Testing: Advantages of using code instead of static ORC files
• Compiler helps with breaking changes
• Improves test understandability
• Flexible manipulation of data using monadic operations