Cosco: An Efficient Facebook-Scale Shuffle Service


Posted by Spark开源社区 (Spark open-source community)
Cosco is an efficient shuffle-as-a-service that powers Spark (and Hive) jobs at Facebook warehouse scale. It is implemented as a scalable, reliable and maintainable distributed system. Cosco is based on the idea of partial in-memory aggregation across a shared pool of distributed memory. This provides vastly improved efficiency in disk usage compared to Spark’s built-in shuffle. Long term, we believe the Cosco architecture will be key to efficiently supporting jobs at ever larger scale. In this talk we’ll take a deep dive into the Cosco architecture and describe how it’s deployed at Facebook. We will then describe how it’s integrated to run shuffle for Spark, and contrast it with Spark’s built-in sort-based shuffle mechanism and SOS (presented at Spark+AI Summit 2018).
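The abstract only names the core idea, partial in-memory aggregation across a shared pool of memory, so the following is a minimal toy sketch of that general idea and not Cosco's actual design or API. Every name here (`PartitionBuffer`, `flush_threshold_bytes`, `run_demo`) is hypothetical, and a single in-process object stands in for what would really be a distributed service. It shows records from many mappers being merged per reducer partition in memory and written out only in larger chunks.

```python
from collections import defaultdict


class PartitionBuffer:
    """Toy stand-in for the memory of a hypothetical shuffle service.

    Assumption for illustration only: records are buffered per reducer
    partition and written out once a partition's buffer reaches a size
    threshold. Nothing here is taken from Cosco itself.
    """

    def __init__(self, flush_threshold_bytes):
        self.flush_threshold_bytes = flush_threshold_bytes
        self.pending = defaultdict(list)         # partition -> buffered records
        self.pending_bytes = defaultdict(int)    # partition -> buffered size
        self.flushed_chunks = defaultdict(list)  # partition -> chunks "on disk"

    def append(self, partition, record):
        """Buffer one record; flush the partition if it reached the threshold."""
        self.pending[partition].append(record)
        self.pending_bytes[partition] += len(record)
        if self.pending_bytes[partition] >= self.flush_threshold_bytes:
            self.flush(partition)

    def flush(self, partition):
        """Write all buffered records of one partition as a single chunk."""
        if not self.pending[partition]:
            return
        self.flushed_chunks[partition].append(b"".join(self.pending[partition]))
        self.pending[partition] = []
        self.pending_bytes[partition] = 0


def run_demo(num_mappers=16, num_partitions=2):
    """Return the chunk sizes written per partition for a small workload."""
    buf = PartitionBuffer(flush_threshold_bytes=64)
    for _mapper in range(num_mappers):
        for partition in range(num_partitions):
            # Each mapper contributes two 8-byte records per partition.
            buf.append(partition, b"x" * 8)
            buf.append(partition, b"y" * 8)
    for partition in range(num_partitions):
        buf.flush(partition)  # write out whatever is left at the end
    return {p: [len(c) for c in buf.flushed_chunks[p]]
            for p in range(num_partitions)}


if __name__ == "__main__":
    print(run_demo())
```

In this toy workload, 16 mappers each produce 16 bytes for each of 2 partitions. Kept separate per mapper, that would be 32 small pieces. Merged in memory first, it becomes 8 chunks of 64 bytes. The threshold and sizes are arbitrary and chosen only to make the effect visible.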