Data Ingestion for the Connected World

Posted by 陈重丶
In this paper, we argue that in many “Big Data” applications, getting data into the system correctly and at scale via traditional ETL (Extract, Transform, and Load) processes is a fundamental roadblock to performing timely analytics or making real-time decisions. The best way to address this problem is to build a new architecture for ETL that takes advantage of the push-based nature of a stream processing system. We discuss the requirements for a streaming ETL engine and describe a generic architecture that satisfies those requirements. We also describe our implementation of streaming ETL using a scalable messaging system (Apache Kafka), a transactional stream processing system (S-Store), and a distributed polystore (Intel’s BigDAWG).
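
To make the push-based idea concrete, the following is a minimal, illustrative sketch and not the implementation described in the paper. It assumes a toy in-memory `Broker` standing in for a messaging system and a toy `Warehouse` standing in for the target store; both names, the record fields (`sensor`, `value`), and the all-or-nothing batch rule are hypothetical choices made for this example. It does not use the Kafka, S-Store, or BigDAWG APIs.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]


class Broker:
    """Toy in-memory stand-in for a messaging system (not the Kafka API)."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[List[Record]], object]] = []

    def subscribe(self, handler: Callable[[List[Record]], object]) -> None:
        self._subscribers.append(handler)

    def publish(self, batch: List[Record]) -> None:
        # Push-based: the arrival of a batch triggers downstream work at once,
        # instead of a periodic job polling for new files.
        for handler in self._subscribers:
            handler(batch)


class Warehouse:
    """Toy target store; a batch is applied entirely or not at all."""

    def __init__(self) -> None:
        self.rows: Dict[str, float] = {}

    def load_atomically(self, batch: List[Record]) -> bool:
        staged = dict(self.rows)  # work on a copy of the current state
        try:
            for rec in batch:
                key = rec["sensor"].strip().lower()  # transform: normalize key
                staged[key] = float(rec["value"])    # transform: parse number
        except (KeyError, ValueError):
            return False  # reject the whole batch; the store is untouched
        self.rows = staged  # commit the whole batch in one step
        return True


def run_demo() -> Dict[str, float]:
    broker = Broker()
    warehouse = Warehouse()
    broker.subscribe(warehouse.load_atomically)
    # A clean batch is transformed and loaded as soon as it is published.
    broker.publish([{"sensor": " A1 ", "value": "3.5"}])
    # A batch with one malformed record is rejected as a unit.
    broker.publish([
        {"sensor": "B2", "value": "7"},
        {"sensor": "C3", "value": "oops"},
    ])
    return warehouse.rows
```

Calling `run_demo()` leaves only the record from the first batch in the store, because the second batch contains a value that cannot be parsed and is therefore discarded in full. Staging changes on a copy and swapping it in at the end is one simple way to imitate transactional loading in a single process; a real system would rely on the transaction support of the stream processor instead.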