Computer Vision Techniques and Software Construction

Wang Yi, former Research Associate at Nanyang Technological University, Singapore, and Ph.D. candidate at Michigan Technological University, USA. Centered on 3D imaging technology, he has carried out research and development on industrial software solutions, in particular autonomous-driving simulation and HD-map localization and mapping. He holds patents and software products covering foundational software algorithms, simulation systems, and HD-map construction and data compilation, and currently focuses on early-fusion perception. This talk interprets vision technology and software construction through his own specialties: HD maps and simulation software.


1. TF v2.2.0 Inference on Edge Devices and Its Application in Depth Estimation. Lei Wang (yiak.wy@gmail.com)

2. About Me: Lei Wang (yiak.wy@gmail.com), Senior Research Software Engineer specialized in the autonomous-vehicle software stack: simulation and HD maps. Previously a Ph.D. candidate at Michigan Technological University and a Research Associate at Nanyang Technological University. Personal page: https://yiakwy.github.io WeChat Official Account: 深夜之后

3. Contents: Blending TF 2.2.0 into industrial software
1. Motivation
   1. Blending deep learning into industrial software: SLAM
   2. Plausible methods to obtain and implement an "inference engine"
   3. Alternative to TensorRT: why a C++ inference engine matters for a real-time application (SLAM)
2. Implementation of a C++ inference engine for depth-estimation problems
   1. Exploration history and various saved formats
   2. Multi-threaded asynchronous C++ inference framework
   3. Tensor manipulation: tf::Tensor and Eigen::Tensor (v3.3.9)

4. Contents: Blending TF 2.2.0 into industrial software
3. Beyond inference: how TF 2.2.0 contributes to depth estimation
   1. TensorFlow as a computation graph for general optimization problems in the realm of depth estimation
   2. TensorFlow Graphics: differential geometry (functionality similar to Taichi)
4. Application: the POC project SVSO and what we can learn from it
5. Conclusion
6. References

5. Contents: Blending TF 2.2.0 into industrial software
1. Motivation
   1. Blending deep learning into industrial software: SLAM
   2. Plausible methods to obtain and implement an "inference engine"
   3. Alternative to TensorRT: why a C++ inference engine matters for a real-time application (SLAM)
2. Implementation of a C++ inference engine for depth-estimation problems
   1. Exploration history and various saved formats
   2. Multi-threaded asynchronous C++ inference framework
   3. Tensor manipulation: tf::Tensor and Eigen::Tensor (v3.3.9)

6. Motivation: Blending TF 2.2.0 into industrial software (SLAM)

7. What is industrial software? It is typically created to run directly on end devices: cell phones, low-speed robotics, smart vehicles, and machines. [Figures: semi-dense SLAM from the TUM official website; others from the internet] Most of it ships as binaries written in well-supported systems languages such as C/C++, for the most direct access to hardware resources.

8. What is industrial software? [Diagram: a motherboard running C++ code, connected to a motor, an IMU, and a camera; semi-dense SLAM figure from the TUM official website]

9. Contents: Blending TF 2.2.0 into industrial software
1. Motivation
   1. Blending deep learning into industrial software: SLAM
   2. Plausible methods to obtain and implement an "inference engine"
   3. Alternative to TensorRT: why a C++ inference engine matters for a real-time application (SLAM)
2. Implementation of a C++ inference engine for depth-estimation problems
   1. Exploration history and various saved formats
   2. Multi-threaded asynchronous C++ inference framework
   3. Tensor manipulation: tf::Tensor and Eigen::Tensor (v3.3.9)

10. Motivation Plausible methods to obtain and implement an “inference engine”

11. 1. Export the Python runtime to C++. Bad! The Python runtime is very expensive to construct, and its commands are also expensive to execute. This is especially true for interpreter calls issued inside a C++ infinite loop.

12. 2. Passing messages through communication: deploy a subscriber/publisher network with a cross-language transport layer.

13. 2. Passing messages through communication: deploy a subscriber/publisher network with a cross-language transport layer. Implementation details of the gRPC PubSub for IPC (which I developed around 2018) can be found in the public talk [1].

14. 2. Passing messages through communication: deploy a subscriber/publisher network with a cross-language transport layer. A broker manages topics, publishers, and subscribers.

C++ inference core:

```cpp
class RPCInferenceEngine : public Pubsub<Message> {
public:
  ResultType Infer() {
    ...
    Messages msgs;
    msgs[0].attr["img"] = img_raw_buf;
    publisher_->Publish("img", img_raw_buf);
    subscriber_->Pull("detection", 1);
    DetectionResult detection =
        std::move(subscriber_->channel.send().AsDetectionResult());
    ...
  }
};
```

Python ML core:

```python
class InferEngine(rpc.Pubsub):
    def infer(self):
        ...
        self._subscriber.Pull("img", 1)
        img = self.get_img()  # cv.imdecode the raw protobuf buffer string
        ret = self._model.detect(img)
        self._publisher.Publish("detection", ret)
        ...
```
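The broker in the slide above can be sketched in-process with Python's standard library. This is a minimal illustration only; `Broker`, `publish`, and `pull` are hypothetical names mirroring the slide's pseudocode, not the API of the talk's actual gRPC implementation:

```python
import queue
from collections import defaultdict

class Broker:
    """Minimal in-process stand-in for a pub/sub broker: one bounded
    FIFO queue per topic; publishers put, subscribers pull."""

    def __init__(self, maxsize=16):
        self._topics = defaultdict(lambda: queue.Queue(maxsize=maxsize))

    def publish(self, topic, msg):
        # Blocks when the topic queue is full, providing back-pressure.
        self._topics[topic].put(msg)

    def pull(self, topic, n=1, timeout=1.0):
        # Blocks until n messages arrive or the timeout expires.
        return [self._topics[topic].get(timeout=timeout) for _ in range(n)]
```

In the real system the queues sit behind a gRPC streaming channel, so the C++ and Python processes exchange serialized protobuf messages rather than Python objects.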

15. 2. Passing messages through communication: alternatively, a prediction server can be built on HTTP/1.1. There are plenty of ways to set one up, such as Django, Flask, and other Python technologies, because Python is very good at service development. The major problem lies in the transport layer, which is exactly what gRPC is optimized for.
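As a sketch of the HTTP/1.1 alternative, here is a minimal prediction endpoint built with only the Python standard library. The `/predict` route and the echo "model" are placeholders for illustration, not the talk's actual service:

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class PredictHandler(BaseHTTPRequestHandler):
    """Accepts a JSON POST and returns a JSON result."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        # Placeholder "model": report how many pixels were submitted.
        result = {"num_pixels": len(payload.get("pixels", []))}
        body = json.dumps(result).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the demo quiet

def start_server(port=0):
    """Start the server on a background thread; port=0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), PredictHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

The JSON-over-HTTP/1.1 serialization cost is exactly the transport overhead the slide warns about; gRPC replaces it with binary protobuf over multiplexed HTTP/2 streams.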

16. 3. C++ inference engine. Recall that in the early days of DNNs, people familiar with Caffe (the first-generation deep-learning framework, released by Yangqing Jia and his collaborators) trained models in C++ and also ran inference in C++. In recent years this has become an issue because:
1. Training is mainly done through the Python frontend, while the C++ path is either experimental or limited to power users; TensorFlow 1.x and 2.x are vivid examples.
2. Some operators are not supported by the hardware (ASICs): RPN/ROIPooling, BatchNorm, upsampling (important!), and models trimmed by third-party inference optimizers are not easy to debug (important!).
3. Third-party solution providers do not support TensorFlow saved formats directly: TensorRT (NVIDIA) natively supports ONNX, Caffe, UFF, and so on.

17. 4. Challenges introduced by TF 2.2.0
1. Eager execution by default: tf::Tensor is roughly NumPy with autograd!
2. I/O saved formats changed: you can no longer directly export a graph with frozen variables (i.e., tensors converted to constants).
3. tensorflow.keras is recommended over the original Keras implementation, and the hooks are not exactly the same (easy to introduce bugs).
4. The tape API turns TensorFlow into a computation graph for general optimization problems! TensorFlow is better than HIPS autograd. Examples: TensorFlow Graphics and our custom ICP implementation.
5. Compatibility with a TensorFlow 1.x code base is still possible.
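The custom ICP implementation mentioned in point 4 is not reproduced in these slides. As an illustration of the kind of optimization problem involved, and independent of TF, here is the closed-form 2D rigid-alignment step that an ICP loop repeats after each nearest-neighbor association; `best_rigid_2d` is a hypothetical helper name, sketched with the standard library alone:

```python
import math

def best_rigid_2d(src, dst):
    """Least-squares 2D rigid transform (rotation + translation) mapping
    point set `src` onto corresponding points `dst`: the inner solve of
    a 2D ICP iteration. Returns (theta, (tx, ty))."""
    n = len(src)
    csx = sum(p[0] for p in src) / n
    csy = sum(p[1] for p in src) / n
    cdx = sum(p[0] for p in dst) / n
    cdy = sum(p[1] for p in dst) / n
    # Accumulate cross and dot products over centered correspondences.
    s_cross = s_dot = 0.0
    for (sx, sy), (dx, dy) in zip(src, dst):
        ax, ay = sx - csx, sy - csy
        bx, by = dx - cdx, dy - cdy
        s_cross += ax * by - ay * bx
        s_dot += ax * bx + ay * by
    theta = math.atan2(s_cross, s_dot)   # optimal rotation angle
    c, s = math.cos(theta), math.sin(theta)
    tx = cdx - (c * csx - s * csy)       # translation after rotating centroid
    ty = cdy - (s * csx + c * csy)
    return theta, (tx, ty)
```

A full ICP wraps this solve in a loop that re-associates points and re-solves until convergence; with the tape API, the same objective can instead be minimized by gradient descent over the pose parameters.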

18. Contents: Blending TF 2.2.0 into industrial software
1. Motivation
   1. Blending deep learning into industrial software: SLAM
   2. Plausible methods to obtain and implement an "inference engine"
   3. Alternative to TensorRT: why a C++ inference engine matters for a real-time application (SLAM)
2. Implementation of a C++ inference engine for depth-estimation problems
   1. Exploration history and various saved formats
   2. Multi-threaded asynchronous C++ inference framework
   3. Tensor manipulation: tf::Tensor and Eigen::Tensor (v3.3.9)

19. Motivation: Alternative to TensorRT, why is a C++ inference engine important for a real-time application (SLAM)?

20. TensorRT to blend TF 1.x into C++ code
[Pipeline: Python → frozen graph → UFF → nvuffparser::UffParser → InferBuilder constructs the network → C++]

21. TensorRT to blend TF 2.x into C++ code
[Pipeline: TF 1.x API / Keras → GraphDef → frozen pb file → UFF → core; TF 2.x API → TensorFlow SavedModel format → ? → core]

22. Contents: Blending TF 2.2.0 into industrial software
1. Motivation
   1. Blending deep learning into industrial software: SLAM
   2. Plausible methods to obtain and implement an "inference engine"
   3. Alternative to TensorRT: why a C++ inference engine matters for a real-time application (SLAM)
2. Implementation of a C++ inference engine for depth-estimation problems
   1. Short exploration history and various saved formats
   2. Export from Python for various backends (TensorFlow 1.x, Keras, …)
   3. Multi-threaded asynchronous C++ inference framework
   4. Tensor manipulation: tf::Tensor and Eigen::Tensor (v3.3.9)

23. Implementation of a C++ inference engine for depth-estimation problems: exploration history and various saved formats

24. Short exploration history: C++ API. In 2015 the TF C/C++ API shipped with the first release. Later, the C++ API was improved and optimized around new features introduced in C++11/14. Moreover, there was no official method to integrate TensorFlow C++ into a CMake project, so many third-party solutions were proposed. With the release of TensorFlow Lite and TensorRT, more and more people are paying attention to C++ inference. So what has happened recently?

25. Short exploration history: TensorFlow Serving format. At the beginning there was no direct way; we had to grab and pass the active graph from the TF session. In 2017, at a meeting held by Google Cloud, an engineering session showed how to convert a Keras model to the TensorFlow Serving format with SavedModelBuilder: by converting output nodes to constants using the gfile API, we can dump the graph to protobuf strings. https://github.com/GoogleCloudPlatform/cloudml-samples/blob/master/census/keras/trainer/model.py The API has since changed, and SavedModel is now supported as the default.

26. TensorFlow Serving format. 0. SavedModel format and network-definition format. Here is how both formats look:

27. TensorFlow Serving format. 0. SavedModel format and network-definition format.
Produced with the old TF 1.x method (mrcnn_tmp): only a protobuf file describing the network together with the stored values.
Produced with the SavedModel method used for serving: includes the variable values, their vocabulary index, and the network definition; this works for both TF 1.x and TF 2.x, and SavedModel has become the default description format in TF 2.x.
Unlike a plain protobuf network definition, in the SavedModel format "saved_model.pb" contains the serialized TensorFlow program definition.
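For reference, a typical SavedModel directory produced by TF 2.x is laid out as follows (the variable shard count varies, and assets/ appears only when vocabulary or other asset files are used):

```text
saved_model/
├── saved_model.pb          # serialized program: graph defs + signatures
├── assets/                 # optional: vocabulary files, etc.
└── variables/
    ├── variables.data-00000-of-00001   # checkpointed variable values
    └── variables.index
```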

28. TensorFlow Serving format. 0. SavedModel format and network-definition format. With TensorFlow built from source, we can use `saved_model_cli` to inspect the exported model (https://github.com/yiakwy/SEMANTIC_VISUAL_SUPPORTED_ODEMETRY/blob/master/python/pysvso/models/sfe.py):

29. The old method (TF 1.x): workflow and construction steps
[Pipeline: TF 1.x API / Keras → GraphDef → frozen pb file → UFF → core; TF 2.x API → TensorFlow SavedModel format → ? → core]

Google Developer Groups is a global program initiated by Google's developer division: a non-profit developer community for anyone interested in Google and open-source technologies.