Entropy Compression of Relations and Compressed Relations


Posted by 陈傲天
We present a method to compress relations close to their entropy while still allowing efficient queries. Column values are encoded into variable-length codes to exploit skew in their frequencies. The codes in each tuple are concatenated, and the resulting tuplecodes are sorted and delta-coded to exploit the lack of ordering in a relation. Correlation is exploited either by co-coding correlated columns or by using a sort order that leverages the correlation. We prove that this method leads to near-optimal compression (within 4.3 bits/tuple of entropy), and in practice we obtain up to a 40-fold compression ratio on vertical partitions tuned for TPC-H queries.
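The pipeline described in the abstract (per-column variable-length codes, concatenation into tuplecodes, sorting, delta coding) can be illustrated with a small Python sketch. This is only a toy under stated assumptions, not the authors' implementation: the function names `huffman_code`, `compress`, and `decompress` are hypothetical, Huffman coding is assumed as the variable-length code, tuplecodes are zero-padded to a common width so they can be sorted as integers, the deltas are left as plain integers rather than being entropy-coded themselves, and the input is assumed to be non-empty. Co-coding of correlated columns is not shown.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_code(values):
    """Build a Huffman code {value: bitstring} from the value frequencies."""
    freq = Counter(values)
    if len(freq) == 1:
        # A single distinct value still needs a one-bit code.
        return {next(iter(freq)): "0"}
    tie = count()  # unique tie-breaker so the dicts are never compared
    heap = [(f, next(tie), {v: ""}) for v, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {v: "0" + c for v, c in c1.items()}
        merged.update({v: "1" + c for v, c in c2.items()})
        heapq.heappush(heap, (f1 + f2, next(tie), merged))
    return heap[0][2]


def compress(rows):
    """Code each column, concatenate per tuple, sort, and delta-code.

    Assumes rows is a non-empty list of equal-length tuples.
    """
    columns = list(zip(*rows))
    codes = [huffman_code(col) for col in columns]
    tuplecodes = ["".join(code[v] for code, v in zip(codes, row)) for row in rows]
    width = max(len(t) for t in tuplecodes)
    # Assumption: pad with zeros to a fixed width so tuplecodes sort as integers.
    as_ints = sorted(int(t.ljust(width, "0"), 2) for t in tuplecodes)
    # The first entry is stored as is; every later entry is a non-negative gap.
    deltas = [as_ints[0]] + [b - a for a, b in zip(as_ints, as_ints[1:])]
    return codes, width, deltas


def decompress(codes, width, deltas):
    """Undo delta coding, then decode each column's prefix code in turn."""
    decoders = [{c: v for v, c in code.items()} for code in codes]
    rows, total = [], 0
    for d in deltas:
        total += d
        bits = format(total, "0{}b".format(width))
        row, pos = [], 0
        for dec in decoders:
            end = pos + 1
            # Prefix-free codes: the first matching prefix is the codeword.
            while bits[pos:end] not in dec:
                end += 1
            row.append(dec[bits[pos:end]])
            pos = end
        rows.append(tuple(row))  # trailing padding bits are ignored
    return rows
```

Because a relation is a multiset, `decompress` returns the tuples in sorted-tuplecode order rather than in their original order; giving up that ordering is what makes the deltas between adjacent tuplecodes small.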