Indexing Methods and Optimization Opportunities
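Want to run vector search but unsure which index to choose?
How do you get an index that is fast, accurate, and light on memory all at once? Many Milvus users want to know the answer.
However, choosing a suitable index for a given application is far from obvious: it means trading off resource usage, query efficiency, query recall, and other metrics.
To answer this question, which troubles many users, this livestream features Dr. Yi, who leads core indexing technology at Zilliz, to explore vector indexing algorithms and directions for index optimization. In this livestream you will learn:
How different types of vector indexes (graph-based, clustering-based, and quantization-based) work, their strengths and weaknesses, and the scenarios each one suits.
Some of Milvus's thinking on future index optimization.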
1. Indexing Methods & Optimization Opportunities (易小萌)
2. Background: Vector Search © 2020 Zilliz. All rights reserved.
3. Information Retrieval: from keywords to rich media
4. Embedding: represent rich media data as vectors
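Once items are represented as vectors, retrieval becomes a nearest-neighbour problem. The following is a minimal sketch of exact (brute-force) search, which corresponds to the FLAT baseline in the later comparison. The item names `a`, `b`, `c` echo the slide's figure labels, but the vector values and the function names `l2` and `flat_search` are made up for illustration.

```python
import math

def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def flat_search(vectors, query, k):
    """Exact top-k search: compare the query against every stored vector."""
    order = sorted(range(len(vectors)), key=lambda i: l2(vectors[i], query))
    return order[:k]

# Toy "embeddings" for three items (values are assumptions, not from the talk).
embeddings = {"a": [0.9, 0.1], "b": [0.8, 0.2], "c": [0.1, 0.9]}
names = list(embeddings)
hits = flat_search([embeddings[n] for n in names], [1.0, 0.0], k=2)
nearest = [names[i] for i in hits]
print(nearest)
```

Every query touches every vector, so this is accurate but neither fast nor small at scale, which is what motivates the index types on the following slides.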
5. Vector Search: Indexing methods
6. Graph: HNSW/NSG/NGT ("Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs")
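The core move shared by graph indexes can be sketched as greedy routing on a neighbour graph. This is a simplified, single-layer illustration under stated assumptions: the graph is built by brute force, and the layer hierarchy and candidate beam that HNSW adds are left out. The names `build_knn_graph` and `greedy_search` are hypothetical helpers, not part of any library.

```python
import math
import random

def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_knn_graph(vectors, m):
    """Link each node to its m nearest neighbours (brute force, toy sizes only)."""
    graph = {}
    for i, v in enumerate(vectors):
        others = sorted((j for j in range(len(vectors)) if j != i),
                        key=lambda j: l2(v, vectors[j]))
        graph[i] = others[:m]
    return graph

def greedy_search(graph, vectors, query, entry):
    """Move to whichever neighbour is closer to the query until none is."""
    current = entry
    current_dist = l2(vectors[current], query)
    while True:
        best, best_dist = current, current_dist
        for nb in graph[current]:
            d = l2(vectors[nb], query)
            if d < best_dist:
                best, best_dist = nb, d
        if best == current:          # local minimum: no neighbour improves
            return current, current_dist
        current, current_dist = best, best_dist

random.seed(0)
vectors = [[random.random(), random.random()] for _ in range(200)]
graph = build_knn_graph(vectors, m=8)
query = [0.5, 0.5]
node, dist = greedy_search(graph, vectors, query, entry=0)
```

The walk only guarantees a local minimum, so the result is approximate; practical graph indexes reduce that risk with better graph construction and a wider search beam.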
7. Space Partition: IVF/LSH/Tree ("The Inverted Multi-Index")
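For the IVF case, space partitioning can be sketched as: cluster the vectors, keep one inverted list per cluster, and scan only the `nprobe` lists closest to the query. This is a toy sketch with plain Lloyd's k-means, not the multi-index structure of the cited paper; all function names and parameter values are assumptions for illustration.

```python
import math
import random

def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(vectors, nlist, iters=10, seed=0):
    """Plain Lloyd's k-means; returns nlist centroids."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, nlist)]
    for _ in range(iters):
        buckets = [[] for _ in range(nlist)]
        for v in vectors:
            c = min(range(nlist), key=lambda i: l2(v, centroids[i]))
            buckets[c].append(v)
        for i, bucket in enumerate(buckets):
            if bucket:  # keep the old centroid if a cluster went empty
                centroids[i] = [sum(col) / len(bucket) for col in zip(*bucket)]
    return centroids

def build_ivf(vectors, centroids):
    """Inverted lists: centroid id -> ids of the vectors assigned to it."""
    lists = [[] for _ in centroids]
    for idx, v in enumerate(vectors):
        c = min(range(len(centroids)), key=lambda i: l2(v, centroids[i]))
        lists[c].append(idx)
    return lists

def ivf_search(vectors, centroids, lists, query, k, nprobe):
    """Scan only the nprobe lists whose centroids are closest to the query."""
    probe = sorted(range(len(centroids)),
                   key=lambda i: l2(query, centroids[i]))[:nprobe]
    candidates = [idx for c in probe for idx in lists[c]]
    candidates.sort(key=lambda idx: l2(vectors[idx], query))
    return candidates[:k]

rng = random.Random(1)
vectors = [[rng.random() for _ in range(4)] for _ in range(300)]
centroids = kmeans(vectors, nlist=8)
lists = build_ivf(vectors, centroids)
query = [0.5, 0.5, 0.5, 0.5]
approx = ivf_search(vectors, centroids, lists, query, k=5, nprobe=2)
full = ivf_search(vectors, centroids, lists, query, k=5, nprobe=8)
```

With `nprobe` equal to `nlist` every list is scanned and the result matches exact search; smaller `nprobe` values trade recall for speed.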
8. Encoding: PQ/SQ ("Product quantization for nearest neighbor search")
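To make the encoding idea concrete, here is a minimal sketch of 8-bit scalar quantization (SQ), plus a rough per-vector size comparison with PQ. It assumes per-dimension min/max scaling, and for PQ it assumes `pq_m` sub-quantizers with 256 centroids each (one byte per sub-vector), ignoring codebook overhead. Function names are hypothetical.

```python
import random

def sq8_train(vectors):
    """Per-dimension min and max: the only parameters this quantizer needs."""
    lo = [min(col) for col in zip(*vectors)]
    hi = [max(col) for col in zip(*vectors)]
    return lo, hi

def sq8_encode(v, lo, hi):
    """Map each float to an integer code in 0..255 (one byte per dimension)."""
    codes = []
    for x, a, b in zip(v, lo, hi):
        scale = (b - a) or 1.0  # avoid dividing by zero on constant dimensions
        code = round((x - a) / scale * 255)
        codes.append(min(255, max(0, code)))  # clamp values outside the range
    return bytes(codes)

def sq8_decode(codes, lo, hi):
    """Approximate reconstruction of the original floats."""
    return [a + c / 255 * ((b - a) or 1.0) for c, a, b in zip(codes, lo, hi)]

def bytes_per_vector(dim, pq_m=8):
    """Storage per vector: raw float32, SQ8, and PQ with pq_m one-byte codes."""
    return {"float32": 4 * dim, "sq8": dim, "pq": pq_m}

rng = random.Random(0)
data = [[rng.uniform(-1, 1) for _ in range(8)] for _ in range(100)]
lo, hi = sq8_train(data)
codes = sq8_encode(data[0], lo, hi)
restored = sq8_decode(codes, lo, hi)
print(bytes_per_vector(128))
```

The reconstruction error per dimension is at most half a quantization step, which is the price paid for the smaller footprint.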
9. Comparison: Fast, accurate, and small are never reached at the same time. [Figure: diagram over the three goals Fast, Accurate, and Small; labels shown: HNSW, L&C, FLAT, ∅, IVF_PQ, IVF_SQ]
10. Observations
11. Larger nlist works with issues ("Revisiting the inverted indices for billion-scale approximate nearest neighbors")
12. Larger nlist works with issues [Figure: Recall (y-axis, 0.2 to 1) versus Nprobe (x-axis, 1 to 32); series 1024, 2048, and 4096, presumably nlist values]
13. Larger nlist works with issues
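One way to reason about the nlist and nprobe trade-off is a rough cost model. The sketch below assumes equally sized inverted lists and a brute-force comparison against all centroids; both are simplifying assumptions, and the function names are hypothetical.

```python
def scanned_fraction(nlist, nprobe):
    """Share of the dataset scanned per query, assuming balanced lists."""
    return min(1.0, nprobe / nlist)

def ivf_scan_cost(n_vectors, nlist, nprobe):
    """Approximate distance computations per query."""
    coarse = nlist                     # compare the query with every centroid
    fine = nprobe * n_vectors / nlist  # scan nprobe lists of about n/nlist vectors
    return coarse + fine

for nlist in (1024, 2048, 4096):
    print(nlist, scanned_fraction(nlist, 16), ivf_scan_cost(1_000_000, nlist, 16))
```

Under this model, a larger nlist at fixed nprobe scans a smaller share of the data, which is cheaper per query but leaves more room for true neighbours to sit in unprobed lists, and the centroid comparison itself grows with nlist.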
14. Filter and Validation works [Figure: Search Time (y-axis, 0 to 10) versus Recall (x-axis, 0.7 to 1) for IVF_FLAT, IVF_SQ_FLAT, and IVF_PQ_FLAT]
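The filter-and-validation idea can be sketched as a two-stage search: shortlist candidates using cheap distances on compressed vectors, then re-rank the shortlist with the original vectors. In this sketch the `coarse` grid snapping is a crude stand-in for SQ or PQ, and the shortlist size `r` is an assumed tuning knob; neither is taken from the talk.

```python
import math
import random

def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def coarse(v, levels=16):
    """Crude stand-in for SQ/PQ: snap each coordinate in [0, 1] to a grid."""
    return [round(x * (levels - 1)) / (levels - 1) for x in v]

def filter_then_validate(vectors, compressed, query, k, r):
    """Stage 1: top-r by approximate distance. Stage 2: exact re-rank to top-k."""
    shortlist = sorted(range(len(vectors)),
                       key=lambda i: l2(compressed[i], query))[:r]
    shortlist.sort(key=lambda i: l2(vectors[i], query))
    return shortlist[:k]

rng = random.Random(0)
vectors = [[rng.random() for _ in range(8)] for _ in range(500)]
compressed = [coarse(v) for v in vectors]
query = [rng.random() for _ in range(8)]
top = filter_then_validate(vectors, compressed, query, k=10, r=50)
```

Only `r` original vectors are read in the second stage, so the expensive exact comparison touches a small part of the data while still correcting most ranking errors introduced by compression.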
15. Retrospect
• Larger nlist works with issues
• Filter and Validation works
[Figure: the Fast / Accurate / Small diagram from slide 9; labels shown: HNSW, L&C, FLAT, ∅, IVF_PQ, IVF_SQ]
16. Idea: a three-layer framework
17. Layers: function decomposition brings optimization opportunity

| Layer Function | Data | Size | Candidates for a query | Requirement |
| --- | --- | --- | --- | --- |
| Space Partition | Clusters | Small | Full | Accurate, Fast |
| Candidate Filtering | Compressed vectors | Medium | Small portion | Fast |
| Result Validation | Original vectors | Large | Very small portion | Accurate |

18. Layers: function decomposition brings optimization opportunity

| Layer Function | Size | Requirement | Index Type (Adjustable) | Optimization Opportunity |
| --- | --- | --- | --- | --- |
| Space Partition | Small | Accurate, Fast | HNSW | Cache-based optimization |
| Candidate Filtering | Medium | Fast | SQ/PQ | Data locality, inter/intra query parallelism |
| Result Validation | Large | Accurate | FLAT | SSD-based storage, compute-read pipeline |
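The Size and Candidates columns of the three layers can be made concrete with some back-of-the-envelope arithmetic. Every number below (dataset size, dimension, nlist, nprobe, code size, re-rank count) is an assumption chosen for illustration, and `layer_profile` is a hypothetical helper; the model also assumes float32 vectors and balanced clusters.

```python
def layer_profile(n, dim, nlist, nprobe, code_bytes, rerank):
    """(layer, data size in bytes, candidates touched per query) for each layer."""
    return [
        # Centroids only: small, and every centroid is a candidate.
        ("space partition", nlist * dim * 4, nlist),
        # Compressed codes for all vectors; only the probed lists are scanned.
        ("candidate filtering", n * code_bytes, nprobe * n // nlist),
        # Original float32 vectors; only the shortlist is read.
        ("result validation", n * dim * 4, rerank),
    ]

profile = layer_profile(n=100_000_000, dim=128, nlist=65536,
                        nprobe=64, code_bytes=16, rerank=1000)
for name, size, candidates in profile:
    print(f"{name:20s} {size / 2**30:8.2f} GiB {candidates:>8d} candidates")
```

Under these assumptions the data size grows by orders of magnitude from layer to layer, which is consistent with the table's suggestion of caching the first layer and placing the last one on SSD.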
19. Q&A