Zhang Xin - Cloud-Native Intelligence Powers Enterprise Digital and Intelligent Transformation



1 .

2 .The 7th Global Software Case Study Summit | AI / AIOps | DevOps | AI | November 30 to December 3, 2018 | 100+

4 .Market Competition

5 .AI and ML spending: from $12 billion in 2017 to $57.6 billion in 2021. Deloitte: nearly 0.8 million GPUs in datacenters; the number of ML activities globally is doubling each year. 61% of interviewees plan to use ML in 2019; 58% already have ML in use. 70% will adopt AI by 2030. The GPU market in China is booming: a 230% increase, to 3.5 billion RMB in 2017.

6 .“Can I tweak the models and APIs I bought?” “I don’t want to give out my sensitive data and business ideas”

7 .“Which framework, models, hyperparameters to use?”

8 .“How do I speed up training of a really deep network on a really huge amount of data?”

9 .“How do we allocate our 400 GPUs across 20 model development efforts?”

10 .From Enterprise Almanac 2018 by Work-Bench

11 .

12 .“Ge tech wor is a chal And

13 .Configuration; Data Collection; Feature Extraction; Data Verification; ML Code; Machine Resource Management; Analysis Tools; Serving Infrastructure; Monitoring. Source: Sculley et al.: Hidden Technical Debt in Machine Learning Systems

14 .

15 .

16 .Building a model

17 .Data ingestion, Data analysis, Data transformation, Data validation, Data splitting; Trainer, Building a model, Model validation, Training at scale; Roll-out, Serving, Monitoring, Logging
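The stages named on this slide can be sketched end to end in a few lines. This is a toy illustration only, not the speaker's pipeline: the synthetic data, the function names (`ingest`, `validate`, `split`, `train`, `evaluate`) and the least-squares "model" are all assumptions chosen to keep the example self-contained.

```python
import random
from statistics import mean


def ingest():
    """Data ingestion: synthetic (x, y) rows following y = 2x + 1 plus small noise."""
    rng = random.Random(0)
    return [(float(x), 2.0 * x + 1.0 + rng.uniform(-0.1, 0.1)) for x in range(100)]


def validate(rows):
    """Data validation: drop rows with missing values."""
    return [(x, y) for x, y in rows if x is not None and y is not None]


def split(rows, train_fraction=0.8, seed=0):
    """Data splitting: shuffle a copy, then cut into train and test sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def train(rows):
    """Trainer: closed-form least-squares fit of y = w * x + b."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    mx, my = mean(xs), mean(ys)
    w = sum((x - mx) * (y - my) for x, y in rows) / sum((x - mx) ** 2 for x in xs)
    return w, my - w * mx


def evaluate(model, rows):
    """Model validation: mean absolute error on held-out rows."""
    w, b = model
    return mean(abs(w * x + b - y) for x, y in rows)


train_rows, test_rows = split(validate(ingest()))
model = train(train_rows)
error = evaluate(model, test_rows)
print(f"w={model[0]:.3f} b={model[1]:.3f} held-out MAE={error:.3f}")
```

On a single machine each stage is just a function call; the later slides are about what changes when each stage becomes a separately deployed, scaled and monitored component.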

18 .

19 .Experimentation, Training, Multi-Cloud

20 .https://kubernetes.io/blog/2017/12/introducing-kubeflow-composable/

21 .Kubeflow's Mission: Make it Easy for Everyone to Develop, Deploy and Manage Portable, Distributed ML on Kubernetes

22 .1. Kubeflow = Cloud Native, multi-cloud solution for ML. 2. Kubeflow provides a platform for composable, portable and scalable ML pipelines. 3. If you have a Kubernetes conformant cluster, you can run Kubeflow.

23 .Experimentation, Training, Cloud, Kubeflow

24 .Goal: Low bar; High ceiling ● Low bar - make it super easy to get started a. Minimize number of K8s concepts users need to learn b. Optimize Kubeflow deployments with scaffolding for apps ● High ceiling - allow sysadmins to do complex customizations a. Extensibility has been critical to K8s success b. Users should be able to easily customize individual components

25 .More Tools and Frameworks. Pipeline stages: Data ingestion, Data analysis, Data transformation, Data validation, Data splitting; Trainer, Building a model, Model validation, Training at scale; Roll-out, Serving, Monitoring, Logging. Tools: TF.Transform, TF.Data, NumPy, Spark, TFJob, Jupyter, MXNet, PyTorch, TF.Serving, Seldon, Prometheus, TensorRT ...And lots more I can’t fit into this slide
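To make one of the tools on this slide concrete: a TFJob is a Kubernetes custom resource that describes a distributed TensorFlow training run. The sketch below builds a minimal manifest as a Python dict and prints it as JSON. It assumes the `kubeflow.org/v1` API version of the training operator (a 2018 deployment would have used an earlier alpha or beta version) and the NVIDIA device plugin's `nvidia.com/gpu` resource name; the job name and image are hypothetical placeholders.

```python
import json


def tfjob_manifest(name, image, workers=2, gpus_per_worker=1):
    """Return a minimal TFJob custom resource as a plain dict."""
    return {
        "apiVersion": "kubeflow.org/v1",  # assumption: current operator API version
        "kind": "TFJob",
        "metadata": {"name": name},
        "spec": {
            "tfReplicaSpecs": {
                "Worker": {
                    "replicas": workers,
                    "restartPolicy": "OnFailure",
                    "template": {
                        "spec": {
                            "containers": [
                                {
                                    # The operator expects the training container
                                    # to be named "tensorflow".
                                    "name": "tensorflow",
                                    "image": image,
                                    "resources": {
                                        "limits": {"nvidia.com/gpu": gpus_per_worker}
                                    },
                                }
                            ]
                        }
                    },
                }
            }
        },
    }


# Hypothetical job name and image, for illustration only.
manifest = tfjob_manifest("mnist-train", "example.com/mnist-train:latest")
print(json.dumps(manifest, indent=2))
```

Because `kubectl apply -f` accepts JSON as well as YAML, the printed output could be saved to a file and applied to a cluster that has the training operator installed.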

26 .Kubeflow is initially targeting ML Engineers ● Target Persona: ML Engineers ○ Responsible for productionizing ML ○ Enterprise concerns: reliability, scalability, security ■ Kubernetes is a natural choice ○ More DevOps expertise ● Data scientists/Researchers are important but secondary at this stage ○ Build models ○ Less DevOps expertise ○ Kubernetes can be a tougher sell; they might be quite happy with a single machine with lots of resources

27 .3 Areas Of Development 1. Simple deployment of ML components on K8s a. kfctl b. GitOps for ML 2. K8s Native ML components/tools and integrations a. K8s packaging for TF/TFX components & libraries b. Katib c. Jupyter 3. Example Solutions a. Natural Language Code Search

28 .Simple Deployment

29 .Getting Started Is Difficult: Grab bag of components ● Cobble together an ML platform out of tens of components ● ML bits ○ Jupyter ○ TFJob/PyTorch ● K8s bits ○ Networking (Ambassador, Istio, CertManager) ○ GPU installers ● Cloud bits ○ K8s cluster ○ Storage
