申请试用
HOT
登录
注册
 
ROCm and Distributed Deep Learning on Spark and TensorFlow

ROCm and Distributed Deep Learning on Spark and TensorFlow

Spark开源社区
/
发布于
/
8564
人观看
ROCm, the Radeon Open Ecosystem, is an open-source software foundation for GPU computing on Linux. ROCm supports TensorFlow and PyTorch using MIOpen, a library of highly optimized GPU routines for deep learning. In this talk, we describe how Apache Spark is a key enabling platform for distributed deep learning on ROCm, as it enables different deep learning frameworks to be embedded in Spark workflows in a secure end-to-end machine learning pipeline. We will analyse the different frameworks for integrating Spark with Tensorflow on ROCm, from Horovod to HopsML to Databrick’s Project Hydrogen. We will also examine the surprising places where bottlenecks can surface when training models (everything from object stores to the Data Scientists themselves), and we will investigate ways to get around these bottlenecks. The talk will include a live demonstration of training and inference for a Tensorflow application embedded in a Spark pipeline written in a Jupyter notebook on Hopsworks with ROCm.
0点赞
1收藏
1下载
确认
3秒后跳转登录页面
去登陆