Interactive Deep Learning in the Cloud


Uploaded by poppy · Published 6 years ago · 1,969 views · #Information Technology

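In this presentation, we show an interactive environment for training deep learning models on real-world data. The environment consists of a GPU cluster and one or more Spark clusters, built from VMs connected through an Azure virtual network. It can be set up easily with MMLSpark (Microsoft Machine Learning for Apache Spark), an open-source machine learning library.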


1. Interactive Deep Learning in Cloud via MMLSpark
Tong Wen, Microsoft
#DL3SAIS

2. Overview
• Toward a single environment for fast experimentation with big data and big compute
• Spark + Accelerators (GPU, FPGA, TPU, …) + MPI
• High performance with:
  – Cost effectiveness
  – Ease of use
  – Extensibility and openness

3. MMLSpark
https://github.com/Azure/mmlspark/
• Tong Wen @microsoft.com
• Eduardo de Leon
• Akshaya Annavajhala
• Ilya Matiach
• Roope Astala
• Miruna Oprescu
• Eli Barzilay
• Young Park
• Maureen Busch
• Sudarshan Raghunathan
• Mark Hamilton
• Ratan Sur
• Danil Kirsanov

4. Key Advantages
• Fast experimentation with Deep Learning
  – GPU vs CPU: ~40x speedup
  – Single interactive environment with easy setup
• Trained an accurate model on NIH chest X-ray dataset in days
  – Data size: 45 GB compressed on disk; O(1) TB in memory
  – Model size: 46 million parameters
• Cost to train the above model: < $9.54
  – Spark cluster (10 nodes): $2.48/hour
  – 4 GPUs: $2.29/hour
  – Training time: 54 mins
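The cost figure on this slide can be checked with a few lines of arithmetic. The sketch below uses only the hourly prices and training time quoted above; the variable names are illustrative, and reading the "< $9.54" bound as two full hours at the combined rate is an assumption, not something the slide states.

```python
# Hourly prices and training time as quoted on the slide (USD).
SPARK_CLUSTER_PER_HOUR = 2.48  # 10-node Spark cluster
GPUS_PER_HOUR = 2.29           # 4 GPUs
TRAINING_MINUTES = 54


def cost_for(minutes: float) -> float:
    """Cost of running the Spark cluster and the GPUs together for `minutes`."""
    hourly_rate = SPARK_CLUSTER_PER_HOUR + GPUS_PER_HOUR
    return hourly_rate * minutes / 60


# Pro-rated cost of the 54-minute training run.
prorated = cost_for(TRAINING_MINUTES)

# Assumption: the slide's upper bound corresponds to two full hours.
two_hour_bound = cost_for(120)

print(f"pro-rated: ${prorated:.2f}, two-hour bound: ${two_hour_bound:.2f}")
```

Under these numbers the pro-rated cost sits well below the quoted bound, which leaves room for setup time and whole-hour billing.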

5. Implementation

6. Set Up the System
https://github.com/Azure/mmlspark/blob/master/docs/gpu-setup.md

7. Attach a New VM
Set up passwordless SSH login to the GPU VM

GPU Type          Peak FLOPS (FP32)   Price
Tesla K80         8.7 teraflops       $0.574/hour
Tesla P40         12 teraflops        $1.319/hour
Tesla P100        10.6 teraflops      $1.319/hour
Tesla V100        15.7 teraflops      $1.95/hour
Earth Simulator   41 teraflops        >> $832/hour (2003)
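One way to read the GPU price list is as price per unit of peak throughput. The sketch below is only that: it divides the quoted hourly price by the quoted peak FP32 figure. Peak numbers say nothing about sustained training throughput, so the ranking is a rough guide at best. The dictionary and function names are illustrative.

```python
# (peak FP32 teraflops, USD per hour) as quoted on the slide.
GPUS = {
    "Tesla K80": (8.7, 0.574),
    "Tesla P40": (12.0, 1.319),
    "Tesla P100": (10.6, 1.319),
    "Tesla V100": (15.7, 1.95),
}


def usd_per_teraflop_hour(teraflops: float, price_per_hour: float) -> float:
    """Hourly price divided by peak FP32 throughput."""
    return price_per_hour / teraflops


# Cheapest peak throughput first.
ranked = sorted(GPUS, key=lambda name: usd_per_teraflop_hour(*GPUS[name]))

for name in ranked:
    print(f"{name}: ${usd_per_teraflop_hour(*GPUS[name]):.3f} per teraflop-hour")
```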

8. Programming API

GPU   Epochs   Minibatch size   Wall clock time
Yes   30       32               1m53s
No    30       32               73m8s
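The "~40x" GPU speedup quoted earlier in the deck can be reproduced from the two wall clock times on this slide. The helper below is a small sketch written for that purpose; `to_seconds` is a hypothetical name and handles only the `XmYs` format used here.

```python
import re


def to_seconds(wall_clock: str) -> int:
    """Convert a string such as '73m8s' into seconds."""
    match = re.fullmatch(r"(\d+)m(\d+)s", wall_clock)
    if match is None:
        raise ValueError(f"unexpected time format: {wall_clock!r}")
    minutes, seconds = (int(group) for group in match.groups())
    return minutes * 60 + seconds


gpu_seconds = to_seconds("1m53s")   # with GPU
cpu_seconds = to_seconds("73m8s")   # without GPU
speedup = cpu_seconds / gpu_seconds

print(f"GPU speedup: {speedup:.1f}x")
```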

9. Test Case: NIH Chest X-ray Dataset
• 112,120 X-ray images (1024 by 1024)
• 14 pathology labels
• 30,805 unique patients
• AlexNet with 46 million parameters
• Half of the dataset for training
• Downsized to 224 by 224
• Binary model
• Data Parallel 1-Bit SGD

Configuration    Epochs   Minibatch size   Wall clock time
4 GPUs, 2 VMs    55       512              55m47s
4 GPUs, 4 VMs    55       512              53m40s
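The slide names data-parallel 1-bit SGD but does not show how it works. The sketch below illustrates the general idea only: each worker sends one bit per gradient element and carries the quantization error into its next minibatch (error feedback). It is not the implementation used in the talk. Using the mean of each sign group as the reconstruction level is an assumption made for illustration, and all function names are hypothetical.

```python
def one_bit_quantize(grad, residual):
    """Quantize a gradient to one bit per element with error feedback.

    Returns the sign bits, the decoded gradient, and the new residual.
    """
    # Add the error left over from the previous minibatch.
    corrected = [g + r for g, r in zip(grad, residual)]
    bits = [c >= 0 for c in corrected]

    # Assumption: reconstruct each sign group by its mean value.
    positives = [c for c, b in zip(corrected, bits) if b]
    negatives = [c for c, b in zip(corrected, bits) if not b]
    pos_level = sum(positives) / len(positives) if positives else 0.0
    neg_level = sum(negatives) / len(negatives) if negatives else 0.0

    decoded = [pos_level if b else neg_level for b in bits]
    # Whatever the one-bit code failed to represent is kept for next time.
    new_residual = [c - d for c, d in zip(corrected, decoded)]
    return bits, decoded, new_residual


def aggregate(worker_grads, residuals):
    """Average the decoded gradients of all workers; updates residuals in place."""
    decoded_all = []
    for i, grad in enumerate(worker_grads):
        _, decoded, residuals[i] = one_bit_quantize(grad, residuals[i])
        decoded_all.append(decoded)
    return [sum(column) / len(worker_grads) for column in zip(*decoded_all)]


bits, decoded, residual = one_bit_quantize([0.5, -1.0, 1.5, -3.0], [0.0] * 4)
```

Because the residual is added back before the next quantization, no part of the gradient is permanently discarded; it is only delayed.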

10. Conclusion & Future Work
• A dynamically configurable hybrid architecture to support more big data + big compute scenarios with cost effectiveness
  – Data exchange (Parquet adaptor)
  – Model exchange (ONNX)
  – Single environment (Resource management)
  – Openness (More frameworks)

11. Thank You!

12. Test System Configuration

Node Type            Number   Size                                                              Price
Spark Cluster Node   10       2.4 GHz Intel Xeon® E5-2673 v3 processor; 8 cores; 28 GiB         $0.248/hour
GPU VM               2        1 NVIDIA Tesla K80 GPU; 6 cores; 56 GiB                           $0.574/hour
GPU VM               2        2 NVIDIA Tesla K80 GPUs; 12 cores; 112 GiB                        $1.147/hour
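The per-node prices in this table can be multiplied out to compare against the aggregate hourly figures quoted on the Key Advantages slide. The sketch below does only that multiplication; which GPU VM combination was actually billed is not stated in the slides, so the dual-GPU reading is an assumption.

```python
# Per-node prices from the table (USD per hour).
SPARK_NODE_PRICE = 0.248
DUAL_GPU_VM_PRICE = 1.147    # VM with 2 Tesla K80 GPUs

# Ten Spark nodes, as listed in the table.
spark_cluster_rate = 10 * SPARK_NODE_PRICE

# Assumption: four GPUs provided by two dual-GPU VMs.
four_gpu_rate = 2 * DUAL_GPU_VM_PRICE

print(f"Spark cluster: ${spark_cluster_rate:.2f}/hour")
print(f"4 GPUs (2 dual-GPU VMs): ${four_gpu_rate:.3f}/hour")
```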
