BlinkDB: Queries with Bounded Errors and Bounded Response Times


da仔
BlinkDB uses two key ideas: (1) an adaptive optimization framework that builds and maintains a set of multi-dimensional stratified samples from the original data over time, and (2) a dynamic sample selection strategy that selects an appropriately sized sample based on a query's accuracy or response time requirements. We evaluate BlinkDB against the well-known TPC-H benchmarks and a real-world analytic workload derived from Conviva Inc., a company that manages video distribution over the Internet. Our experiments on a 100-node cluster show that BlinkDB can answer queries on up to 17 TB of data in less than 2 seconds (over 200× faster than Hive), within an error of 2-10%.
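The two ideas can be illustrated with a small, self-contained sketch. It is not BlinkDB's implementation: the helper names (`stratified_sample`, `estimate_sum`, `pick_sample`), the per-stratum row cap, and the use of a row budget as a stand-in for a response time requirement are all assumptions made for illustration. The sketch keeps at most `cap` rows per distinct value of a column, weights each kept row by how many original rows it represents, and then picks the largest pre-built sample that fits a row budget.

```python
import random
from collections import defaultdict


def stratified_sample(rows, key, cap, seed=0):
    """Keep at most `cap` rows per distinct value of `key` (hypothetical helper).

    Returns a list of (row, weight) pairs, where weight is the number of
    original rows that each kept row stands for.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)

    sample = []
    for members in groups.values():
        # Rare strata are kept whole; frequent strata are down-sampled.
        chosen = members if len(members) <= cap else rng.sample(members, cap)
        weight = len(members) / len(chosen)
        sample.extend((row, weight) for row in chosen)
    return sample


def estimate_sum(sample, column):
    """Weighted estimate of SUM(column) over the original data."""
    return sum(row[column] * weight for row, weight in sample)


def pick_sample(samples_by_cap, max_rows):
    """Pick the largest sample that fits the row budget.

    Assumption: the row budget is a crude proxy for a response time bound.
    Falls back to the smallest sample when nothing fits.
    """
    fitting = [cap for cap, s in samples_by_cap.items() if len(s) <= max_rows]
    cap = max(fitting) if fitting else min(samples_by_cap)
    return cap, samples_by_cap[cap]


# Skewed toy data: one frequent city and one rare city.
rows = [{"city": "NY", "bytes": 1} for _ in range(1000)]
rows += [{"city": "SF", "bytes": 5} for _ in range(10)]

# Build samples at several caps ahead of time, then choose one per query.
samples = {cap: stratified_sample(rows, "city", cap) for cap in (10, 100, 1000)}
cap, chosen = pick_sample(samples, max_rows=200)
print(cap, len(chosen), estimate_sum(chosen, "bytes"))
```

Because the rare city is kept in full while the frequent one is down-sampled, a query that filters or groups on `city` still sees every stratum, which a uniform sample of the same size would not guarantee. A real system would also attach an error estimate to each answer and use it, together with the time bound, when choosing the sample.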