Serverless Machine Learning on Modern Hardware with Apache Spark
Serverless Machine Learning on Modern Hardware
Patrick Stuedi, Michael Kaufmann, Adrian Schuepbach (IBM Research)
Serverless Computing
● No need to set up or manage a cluster
● Automatic, dynamic, and fine-grained scaling
● Sub-second billing
● Many frameworks: AWS Lambda, Google Cloud Functions, Azure Functions, Databricks Serverless, etc.
Challenge: Performance
Example: sorting 100 GB
[Figure: runtime in seconds for sorting 100 GB across four setups, ordered by increasing flexibility (left) and increasing performance (right): AWS Lambda (serverless, 64 lambda workers), Databricks Serverless (serverless cluster with autoscaling, min 1 / max 8 machines), Spark on-premise (standard cluster, no autoscaling, 8 machines, 100 Gb/s Ethernet), and Spark on-premise++ (RDMA, NVMe flash, NVMeF).]
Spark/On-Premise++: Running Apache Spark on a High-Performance Cluster using RDMA and NVMe Flash, Spark Summit '17
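To make the benchmark's shape concrete, here is a minimal sketch of a Spark sort job of this kind; the input path, output path, and tab-separated record format are assumptions for illustration, not details from the talk:

```scala
import org.apache.spark.sql.SparkSession

object SortBenchmark {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("sort-100g").getOrCreate()
    val sc = spark.sparkContext

    // Parse each line into a (key, value) pair; paths are placeholders.
    val records = sc.textFile("hdfs:///benchmark/input")
      .map { line =>
        val i = line.indexOf('\t')
        (line.substring(0, i), line.substring(i + 1))
      }

    // sortByKey triggers a full shuffle of all 100 GB: this is the phase
    // whose cost explodes when intermediate data must live on remote storage.
    records.sortByKey().saveAsTextFile("hdfs:///benchmark/output")
    spark.stop()
  }
}
```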
Why is it so hard?
● Scheduler: when is the best time to add/remove resources?
● Container startup: containers may have to be spun up dynamically
● Storage: input data needs to be fetched from remote storage such as S3, as opposed to compute-local storage such as HDFS (see the sketch below)
● Data sharing: intermediate data needs to be temporarily stored on remote storage (S3, Redis)
– Affects operations like shuffle, broadcast, etc.
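At the job level the storage difference is just a URI scheme, which is what makes the performance gap easy to miss; a minimal sketch, assuming an existing SparkSession named `spark` and hypothetical paths:

```scala
// Serverless case: input lives in a remote object store, so every read
// crosses the network.
val remote = spark.read.parquet("s3a://example-bucket/training/")

// Cluster case: HDFS co-locates blocks with compute, so most reads are
// served node-locally.
val local = spark.read.parquet("hdfs:///training/")
```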
Example: MapReduce (cluster)
[Figure: map and reduce stages run on combined compute & store nodes; data is mostly written and read locally.]

Example: MapReduce (serverless)
[Figure: map, shuffle, and reduce stages run on a dynamically growing/shrinking compute cloud; data is exclusively written and read remotely through a storage service (e.g., S3, Redis).]
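The classic word count shows where the two pictures diverge; a sketch assuming an existing SparkContext `sc` and placeholder paths:

```scala
val counts = sc.textFile("hdfs:///input")
  .flatMap(_.split("\\s+"))      // map stage: split lines into words
  .map(word => (word, 1))
  .reduceByKey(_ + _)            // shuffle boundary between map and reduce stages
counts.saveAsTextFile("hdfs:///output")
```

In a cluster, the shuffle files produced at the `reduceByKey` boundary sit on local disks; in the serverless setup above they must round-trip through the remote storage service.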
I/O Overhead: Sorting 100 GB
[Figure: runtime broken down into shuffle I/O, compute, and input/output for AWS Lambda (serverless), Databricks Serverless, Spark on a standard on-premise cluster, and Spark on an on-premise++ HPC cluster.]
Shuffle overheads are significantly higher when intermediate data is stored remotely.
What about other workloads?
Example: SQL, Query 77 of the TPC-DS benchmark
[Figure: query plan for Query 77; the shuffle/broadcast exchanges are the points where intermediate data needs to be stored remotely.]
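Both exchange types are easy to provoke from the DataFrame API; a sketch with illustrative table and column names, not the actual Query 77 text:

```scala
import org.apache.spark.sql.functions.broadcast

val sales = spark.table("store_sales")
val items = spark.table("item")

// Broadcast join: the small side is shipped to every executor. A serverless
// executor has no local copy to reuse, so it fetches it from remote storage.
val joined = sales.join(broadcast(items), "item_id")

// groupBy inserts a shuffle exchange: intermediate data is written out
// and re-read between stages.
val report = joined.groupBy("item_id").sum("net_paid")
```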
What about other workloads?
Example: iterative ML (e.g., linear regression)
[Figure: worker nodes (W) and a parameter server (PS); in a cluster the parameter server could be co-located with the worker nodes, in a serverless setup it needs to be remote.]
● First iteration: read training data, fetch model params, compute, update model
● Later iterations: use cached data, fetch model params, compute, update model
● On scale-out, newly added workers must read their data remotely; each iteration ends in a barrier, so everyone needs to wait for them (a sketch of this loop follows).
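A minimal gradient-descent sketch mirroring that loop; the file format, hyper-parameters, and the driver-side broadcast standing in for the parameter server are all assumptions for illustration:

```scala
case class Point(features: Array[Double], label: Double)

// Read once and cache: later iterations "use cached data".
val data = sc.textFile("hdfs:///training")
  .map { line =>
    val parts = line.split(',').map(_.toDouble)
    Point(parts.init, parts.last)
  }
  .cache()

val numIterations = 10
val learningRate  = 0.1
val n = data.count().toDouble
var w = Array.fill(data.first().features.length)(0.0)

for (_ <- 1 to numIterations) {
  val wB = sc.broadcast(w)                   // "fetch model params"
  val grad = data.map { p =>                 // "compute"
    val pred = wB.value.zip(p.features).map { case (wi, xi) => wi * xi }.sum
    p.features.map(_ * (pred - p.label))
  }.reduce((a, b) => a.zip(b).map { case (x, y) => x + y })
  w = w.zip(grad).map { case (wi, gi) => wi - learningRate * gi / n }  // "update model"
}
```

The `reduce` at the end of each iteration is the barrier from the slide: a freshly scaled-out worker that is still fetching its partition from remote storage holds up the whole step.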
Can we...
● ...use Spark to run such workloads in a serverless fashion?
– Dynamic scaling of compute nodes while jobs are running
– No cluster configuration
– No startup-time overhead
● ...eliminate the performance overheads?
– Workloads should run as fast as on a dedicated cluster
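For context, stock Spark already approximates part of the first goal with dynamic allocation; the settings below are real Spark configuration keys. What this baseline does not remove is container startup latency or the need to keep shuffle data alive outside executors, which is exactly the gap the design discussion targets:

```scala
import org.apache.spark.sql.SparkSession

// Dynamic allocation grows/shrinks the executor pool with load; the external
// shuffle service keeps shuffle files available after an executor is released.
val spark = SparkSession.builder
  .appName("elastic-job")
  .config("spark.dynamicAllocation.enabled", "true")
  .config("spark.dynamicAllocation.minExecutors", "1")
  .config("spark.dynamicAllocation.maxExecutors", "8")
  .config("spark.shuffle.service.enabled", "true")
  .getOrCreate()
```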
Design Options
● Scheduling:
– 1. Use the serverless framework to schedule executors (high startup latency!)
– 2. Use the serverless framework to schedule tasks (slow!)
– 3. Enable sharing of executors among different applications
● Intermediate data:
– 1. Executors cooperate with the scheduler to flush data remotely (complex!)
– 2. Consequently store all intermediate state remotely
Architecture Overview
[Figure: the Driver registers with and sends jobs to an HCS/DAG scheduler, which launches the application and assigns tasks to Executors; metadata and storage servers register with the system, and intermediate data is read and written through Apache Crail (crail.apache.org).]
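On the Spark side, routing shuffle and broadcast data through Crail's disaggregated storage is a configuration change. A hedged sketch based on the crail-spark-io plugin; the class names follow that plugin's documentation but may differ across versions, so treat them as assumptions and check the Crail docs:

```scala
import org.apache.spark.sql.SparkSession

// Assumed plugin class names (verify against your crail-spark-io version).
// Shuffle and broadcast data then flow through Crail's remote DRAM/NVMe
// tiers instead of executor-local disks.
val spark = SparkSession.builder
  .appName("crail-backed-job")
  .config("spark.shuffle.manager", "org.apache.spark.shuffle.crail.CrailShuffleManager")
  .config("spark.broadcast.factory", "org.apache.spark.broadcast.CrailBroadcastFactory")
  .getOrCreate()
```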