Shark: SQL and Rich Analytics at Scale

Posted by da仔
Shark is a new data analysis system that marries query processing with complex analytics on large clusters. It leverages a novel distributed memory abstraction to provide a unified engine that can run SQL queries and sophisticated analytics functions (e.g., iterative machine learning) at scale, and efficiently recovers from failures mid-query. This allows Shark to run SQL queries up to 100× faster than Apache Hive, and machine learning programs up to 100× faster than Hadoop. Unlike previous systems, Shark shows that it is possible to achieve these speedups while retaining a MapReduce-like execution engine, and the fine-grained fault tolerance properties that such engines provide.
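The fine-grained recovery mentioned above can be pictured with a toy, single-process sketch. It is not Shark's code or API: the class `LineageDataset` and its methods are hypothetical names invented for illustration, and the sketch assumes the general idea that an in-memory dataset remembers how each partition was derived, so a lost partition can be recomputed from its parent instead of restarting the whole query.

```python
class LineageDataset:
    """Toy partitioned dataset that remembers how it was derived (hypothetical)."""

    def __init__(self, num_partitions, parent=None, fn=None, source=None):
        self.num_partitions = num_partitions
        self.parent = parent      # dataset this one was derived from, if any
        self.fn = fn              # per-record transformation applied to the parent
        self.source = source      # stands in for stable storage (base data only)
        self.cache = {}           # in-memory partitions; entries may be lost
        self.computations = 0     # how many partitions have been (re)built

    @classmethod
    def from_lists(cls, partitions):
        return cls(len(partitions), source=[list(p) for p in partitions])

    def map(self, fn):
        # Record the lineage only; nothing is computed until it is needed.
        return LineageDataset(self.num_partitions, parent=self, fn=fn)

    def partition(self, i):
        # Rebuild a missing partition from its lineage, and only that partition.
        if i not in self.cache:
            if self.parent is None:
                rows = list(self.source[i])
            else:
                rows = [self.fn(x) for x in self.parent.partition(i)]
            self.cache[i] = rows
            self.computations += 1
        return self.cache[i]

    def collect(self):
        return [x for i in range(self.num_partitions) for x in self.partition(i)]

    def lose_partition(self, i):
        # Simulate a worker failure that drops one cached partition.
        self.cache.pop(i, None)


base = LineageDataset.from_lists([[1, 2], [3, 4]])
squares = base.map(lambda x: x * x)
first = squares.collect()       # builds both partitions of `squares`
squares.lose_partition(1)       # simulated mid-query failure
second = squares.collect()      # rebuilds partition 1 only
```

In this sketch the second `collect()` returns the same rows as the first while rebuilding a single partition, which is the sense in which recovery is fine-grained. A real engine would spread partitions across machines and track richer dependencies than a one-to-one `map`.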