MLflow and Azure Machine Learning—The Power Couple for ML Lifecycle Management

ML lifecycle management is quickly becoming the bottleneck for many ML projects. With MLflow’s newest release and its enhanced integration with Azure Machine Learning, lifecycle management on Azure is becoming considerably more capable. In this talk, we tour the integration details and show how MLOps is becoming a strength of the platform. We’ll cover versioning, run history, production pipeline automation, deployment to cloud and edge, and CI/CD pipelines, all with MLOps as the backdrop.

Be prepared for an interactive conversation: we intend to seek plenty of feedback on the integration and on the capabilities being lit up.


2. MLflow and Azure Machine Learning: The Power Couple for ML Lifecycle Management
Nishant Thacker, Microsoft
#UnifiedDataAnalytics #SparkAISummit

3. Milestones
• Mar ‘18: Managed Spark by Databricks on Azure
• Dec ‘18: Azure ML launched
• Apr ‘19: Managed MLflow on Azure Databricks

4. Azure Databricks: fast, easy, and collaborative Apache Spark™-based analytics platform
• Increase productivity
• Build on a secure, trusted cloud
• Scale without limits
Built with your needs in mind:
• Role-based access controls
• Effortless autoscaling
• Live collaboration
• Enterprise-grade SLAs
• Best-in-class notebooks
• Simple job scheduling
Seamlessly integrated with the Azure Portfolio

5. Azure Machine Learning service: bring AI to everyone with an end-to-end, scalable, trusted platform
• Boost your data science productivity
• Increase your rate of experimentation
• Deploy and manage your models anywhere
Built with your needs in mind:
• Automated machine learning
• Managed compute
• DevOps for machine learning
• Simple deployment
• Tool agnostic Python SDK
• Support for open source frameworks
Seamlessly integrated with the Azure Portfolio

6. Large retail customer: Use case + Persona
Customer ask: We need to build a unified platform to support a large globally diverse team of Data Engineers, Data Scientists and AI Developers for their big data, deep learning projects. These projects will help us predict and reduce churn, increase retention, and grow revenue.
• Data Engineers prefer to stay in Spark for distributed data processing on PB scale data. They do not want to manage the infra for data preparation.
• Data Scientists want to use open source frameworks like PyTorch & TensorFlow with GPU and CPU for training. They are familiar with MLflow for managing ML lifecycle.
• ML Engineers want to integrate the ML models into applications via a scalable web service. They do not want to manage the infrastructure.
Recommendation (preferred by the customer):
• Use Azure Databricks for data prep
• Use Azure ML with MLflow on Azure Databricks for training, OR use Azure ML with MLflow in Notebook VM with remote Azure ML compute for training
• Use Azure ML for model management and MLOps

7. Demo summary – MLflow with Azure ML: Experimentation
[Diagram] Experiments and metrics are logged from a local machine, a virtual machine, Azure ML Compute, or Azure Databricks through the MLflow logging API, using a tracking URI. The Azure Machine Learning workspace tracks the experiments, metrics, and artifacts.
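The experimentation flow on this slide could look roughly like the sketch below. It assumes the `mlflow` package is installed; the helper names (`write_summary`, `log_run`), the metric name, and the summary file are hypothetical and not taken from the demo. The point it illustrates is that the logging code is the same on every compute target, and only the tracking URI decides where the run lands.

```python
import json
import tempfile
from pathlib import Path


def write_summary(losses, out_dir):
    """Write a small JSON summary that will be logged as a run artifact."""
    summary = {
        "epochs": len(losses),
        "final_loss": losses[-1],
        "best_loss": min(losses),
    }
    path = Path(out_dir) / "summary.json"
    path.write_text(json.dumps(summary, indent=2))
    return path


def log_run(tracking_uri, experiment_name, losses):
    """Log per-epoch metrics and one artifact to the backend named by tracking_uri."""
    # Deferred import so write_summary stays usable without MLflow installed.
    import mlflow

    mlflow.set_tracking_uri(tracking_uri)   # Azure ML workspace URI or a local store
    mlflow.set_experiment(experiment_name)
    with mlflow.start_run():
        for epoch, loss in enumerate(losses):
            mlflow.log_metric("loss", loss, step=epoch)
        with tempfile.TemporaryDirectory() as tmp:
            mlflow.log_artifact(str(write_summary(losses, tmp)))
```

For a dry run without Azure, something like `log_run(Path("mlruns").resolve().as_uri(), "demo", [0.9, 0.6, 0.45])` should write to a local file store; passing the workspace's tracking URI instead would send the same run to Azure ML.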

8. Demo summary – MLflow with Azure ML: Deployment
[Diagram] Models (PyTorch, TensorFlow, Scikit-Learn, ONNX, …) and their artifacts are sent through the MLflow deploy API to the Azure Machine Learning workspace, which provides model management and model deployment.

9. How to get started
Install
• PyPI package: azureml-mlflow
Set
• Set Azure ML workspace
• Set MLflow tracking URI to Azure ML
Go
• Run your MLflow experiment
• Track your results in Azure ML
• Deploy trained model to Azure
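The Install, Set, and Go steps above can be sketched as follows. This is a minimal illustration, assuming `pip install azureml-mlflow` has been run and that a workspace `config.json` downloaded from the Azure portal sits in the working directory. The function names, the local fallback, and the logged values are assumptions made for the example, not part of the talk.

```python
from pathlib import Path


def resolve_tracking_uri(config_path="config.json"):
    """Return the Azure ML tracking URI if a workspace config file exists,
    otherwise a local file-store URI so the sketch still runs offline."""
    if Path(config_path).exists():
        # Deferred import: only needed when a workspace config is present.
        from azureml.core import Workspace

        ws = Workspace.from_config(path=config_path)  # "Set Azure ML workspace"
        return ws.get_mlflow_tracking_uri()
    # Assumption for illustration: fall back to a local ./mlruns store.
    return Path("mlruns").resolve().as_uri()


def run_experiment(tracking_uri, experiment_name="getting-started"):
    """Run one MLflow experiment against the given tracking URI."""
    # Assumed to be available after installing azureml-mlflow.
    import mlflow

    mlflow.set_tracking_uri(tracking_uri)  # "Set MLflow tracking URI to Azure ML"
    mlflow.set_experiment(experiment_name)
    with mlflow.start_run():               # "Run your MLflow experiment"
        mlflow.log_param("alpha", 0.5)     # placeholder parameter
        mlflow.log_metric("rmse", 0.78)    # placeholder value, not a real result
```

Calling `run_experiment(resolve_tracking_uri())` would then make the run visible under the workspace's experiments ("Track your results in Azure ML"). The deploy step is left out here because the slides do not show the call.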


