A Primitive Operator for Similarity Joins in Data Cleaning

Posted by 陈重丶 / 1849 views
Data cleaning based on similarities involves the identification of “close” tuples, where closeness is evaluated using a variety of similarity functions chosen to suit the domain and application. Current approaches for efficiently implementing such similarity joins are tightly tied to the chosen similarity function. In this paper, we propose a new primitive operator that can be used as a foundation to implement similarity joins according to a variety of popular string similarity functions, as well as notions of similarity that go beyond textual similarity. We then propose efficient implementations for this operator. In an experimental evaluation using real data sets,