申请试用
HOT
登录
注册
 
Resilient Distributed Datasets

Resilient Distributed Datasets

da仔
/
发布于
/
4357
人观看
RDDs are motivated by two types of applications that current computing frameworks handle inefficiently: iterative algorithms and interactive data mining tools. In both cases, keeping data in memory can improve performance by an order of magnitude.To achieve fault tolerance efficiently, RDDs provide a restricted form of shared memory, based on coarsegrained transformations rather than fine-grained updates to shared state. However, we show that RDDs are expressive enough to capture a wide class of computations, including recent specialized programming models for iterative jobs, such as Pregel, and new applications that these models do not capture. We have implemented RDDs in a system called Spark, which we evaluate through a variety of user applications and benchmarks.
2 点赞
1 收藏
8下载
相关文档
确认
3秒后跳转登录页面
去登陆