Paper reading # 2 ｜基于NVM的高性能向量检索方案HM-ANN

下载 5

快召唤伙伴们来围观吧
微博 QQ QQ空间 贴吧
文档嵌入链接
<iframe src="https://www.slidestalk.com/Milvus/Paperreading2NVMHMANN?embed" frame border="0" width="640" height="360" scrolling="no" allowfullscreen="true">复制
微信扫一扫分享
已成功复制到剪贴板

Milvus.io

发布于

3年前

945

人观看

#信息技术

Paper reading 直播是由 Milvus 社区发起的学术直播间，分享业内前沿学术论文，与大家交流人工智能与数据库领域最新研究方向。

本期论文分享，我们邀请到了 Zilliz 实习研究员罗济高，与大家共同讨论一篇被收录于第 34 届神经信息处理系统会议（NeurIPS 2020）的论文，“HM-ANN Efficient Billion-Point Nearest Neighbor Search on Heterogeneous Memory”。

“HM-ANN Efficient Billion-Point Nearest Neighbor Search on Heterogeneous Memory” 是一篇被收录于第 34 届神经信息处理系统会议（NeurIPS 2020）的论文，由微软研究院与加州大学共同撰写。在这篇论文中，作者提出了一种基于图片的相似性搜索算法 HM-ANN。

HM-ANN 同时考虑了内存异质性和数据异质性，并在没有压缩技术的情况下在单机上实现了十亿级的相似性搜索。HM-ANN 实现了低搜索延迟和高搜索精度，尤其是当数据集无法放入 DRAM 时。该算法与最先进的近似最近邻搜索 (ANNS) 解决方案相比具有明显的优势。

罗济高 Zilliz 实习研究员
本科毕业于慕尼黑工业大学，目前是该校数据工程与分析专业研究生在读。兴趣领域包含关系数据库与性能分析，喜欢研究关系数据库內核，对数据库各个子领域的进展拥有高度好奇心。Github账号：https://github.com/cakebytheoceanLu

分享提纲：
1/ HM-ANN 图索引的动机
2/ HM-ANN 图索引的构建和搜索算法
3/ HM-ANN 的价格优势
4/ HM-ANN 性能测试
5/ 技术总结

线上分享结束后，嘉宾还会在直播交流群内实时 QA，欢迎大家添加小助手微信：Zilliz-tech 备注”直播“加入讨论群。

展开查看详情

1 .Paper reading # 2 基于NVM的高性能向量检索方案 HM-ANN: Efficient Billion-Point Nearest Neighbor Search on Heterogenous Memory

2 .Introduction

3 .Speaker bio Jigao Luo Research Intern • Master Student @ TU Munich • Github: https://github.com/cakebytheoceanLuo • Blog: https://cakebytheoceanluo.github.io/

4 .Structure 01 Background 02 HM-ANN 03 Evaluation 04 Summary + Q&A Paper Info Motivation 1B: Build HNSW Build Algorithm 1B: Search HM Search Algorithm 1M: Search Price Comparison

5 .Background

6 .Paper Info Jie Ren, PhD @ University of California, Merced: https://jren73.github.io/ Minjia Zhang, Principal Researcher @ Microsoft Research: https://www.microsoft.com/en-us/research/people/minjiaz/ Dong Li, AP @ University of California, Merced: https://faculty.ucmerced.edu/dong-li/ [NeurIPS 2020]

7 .Intro to HNSW Hierarchical Navigable Small World (HNSW): • similarity graphs • the best-in-class latency-vs-accuracy trade-off • major limitation: memory consuming Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs: https://arxiv.org/abs/1603.09320

8 .Intel® Optane™ Persistent Memory • Durable as SSD (up to 512GiB per DIMM) • Directly addressable by CPU as DRAM https://www.intel.com/content/www/us/en/products/d ocs/memory-storage/optane-persistent- • Two modes memory/optane-dc-persistent-memory-brief.html • Memory mode without persistence • DRAM as L4 cache • App-direct mode as memory-mapped files • Implementation intensive Memory Hierarchy with PMem in https://software.intel.com/content/www/us/en/develo Memory Mode p/articles/intel-optane-persistent-memory-decision- guide.html

9 .Heterogeneous Memory HM := DRAM + PMAM • HM := fast memory + slow memory • DRAM := expensive, fast but small capacity • PMem := relative cheap, relative fast and extremely large • PRAM performs in terms of random access latency • ∼80X times faster than SSD • ∼3X slower than DRAM • HM-ANN works in App-direct Mode in billion-scale dataset search.

10 .HM-ANN

11 .Motivation of HM-ANN • Goal • indices in DRAM for fast query response • raw uncompressed vector for high accuracy • Reality • Graph-based indexes (e.g. HNSW) needs large amount DRAM. • Compression of vectors drops recall & accuracy • => High Recall, Low Latency, Low Memory Cost in billion scale • HM-ANN: a graph-based similarity search algorithm without compression • with DRAM & PMem => HM • enables billion-scale similarity search with high recall and fast runtime on a single node without compression

12 .HM-ANN • HM-ANN takes both hardware and data heterogeneity into consideration. • The key idea: • to build high-quality upper layers to take most memory accesses and provide better navigation for search at L0 • to reduce memory accesses in slow memory.

13 .HM-ANN’s Build

14 .HM-ANN’s Search

15 . Price Comparison • Testing bed of the paper • Intel Xeon Gold 6252 CPU@2.3GHz. • DDR4 (96GB) as fast memory and Optane DC PMM (1.5TB) 128GB Optane 256GB Optane 512GB Optane 4*64GB DIMM DIMM DIMM DDR4 DRAM Price $ 1000 $ 2616 $ 8505 $ 12371 Price / GB 7.8125 10.2031 16.6113 48.32 $ / 1.5TiB --- --- $ 25515 $ 74226 #DIMM / 1.5TiB --- --- 3 24 (Full in a rack) https://buy.hpe.com/us/en/options/enterprise-memory/persistent-memory/persistent-memory/intel-optane-persistent-memory-for-hpe/p/1011306266 https://buy.hpe.com/us/en/options/enterprise-memory/enterprise-ddr4-memory/integrity-superdome-ddr4-memory/hpe-integrity-superdome-ddr4-memory/p/1008537056?q=1008537056%3Aname-asc&pageSize=100 https://www.intel.com/content/www/us/en/products/d ocs/memory-storage/optane-persistent- memory/optane-dc-persistent-memory-brief.html

16 .07 Evaluation

17 .Evaluation 1B: Build • HNSW takes the shortest time to build the graph. • HM-ANN takes 8% longer time: the bottom-up promotion. • HM-ANN indexes are 5–13% larger than HSNW: high promotion rate from L0 to L1

18 .Evaluation 1B: Search • top-1 recall of > 95% within 1 ms • top-100 recall of > 90% within 4 ms

19 .Evaluation 1M: Search • obtaining 46% higher recall under the same search latency

20 .Summary & Q&A

21 .Warp up HNSW HM-ANN [Build] Storage Pure in-memory L0 on PMem [Build] # Passes 1 2 [Build] Promotion Rate Low High (L0 to L1) [Build] Promotion Strategy L1 Random High-degree Promotion Strategy [Build] L1 Size & Quality Low High to fullfill DRAM [Search] Strategy in L1 1-greedy search efSearchL1-greedy search [Search] # entry points to L0 1 efSearchL1 > 1 [Search] Tricks from L1 to L0 None Multi-threads, prefetching to DRAM buffer [Search] Where is the majority of seach in the L0 in fast memory, not in L0

22 .Conclusion • HM-ANN, a hierarchical graph-based similarity search algorithm to leverage emerging heterogeneous memory, aiming to serve extremely large-scale data points on a single node. • => High Recall, Low Latency, Low Memory Cost in billion scale • Releasing full power of heterogeneous memory sometimes requires a co-design of algorithm and system. • HM-ANN maps the hierarchical design of the graph-based ANNs to memory heterogeneity in Optane-based heterogeneous memory. • Combined with a set of system-level techniques, HM-ANN avoids expensive accesses in slow memory without sacrificing accuracy.

23 .References • HM-ANN: Efficient Billion-Point Nearest Neighbor Search on Heterogeneous Memory: • https://www.microsoft.com/en-us/research/publication/hm-ann-efficient-billion-point-nearest-neighbor-search-on-heterogeneous- memory/ • HM-ANN’s Slide: • http://nvmw.ucsd.edu/nvmw2021-program/nvmw2021-data/nvmw2021-paper63-presentation_slides.pdf • Poster: • http://nvmw.ucsd.edu/nvmw2021-program/nvmw2021-data/nvmw2021-paper63-poster.pdf

24 .Thank You https://milvus.io https://github.com/milvus-io/milvus 扫码加入直播交流群关注 Milvus 视频号 https://twitter.com/milvusio 与讲师实时QA 直播视频早知道

0点赞

1收藏

5下载