Indexing methods and optimization opportunities
Milvus.io

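Want to run vector search but not sure which index to choose?

How do you get search that is fast, accurate, and as light on memory as possible? Many Milvus users want to know.

However, choosing a suitable index for a given application scenario is not obvious. It requires trade-offs among several metrics, including resource usage, query efficiency, and query recall.

To answer this question, which troubles many users, we invited Dr. Yi, who leads core indexing technology at Zilliz, to explore vector indexing algorithms and directions for index optimization in this livestream. From the livestream you will learn:

  • How different types of vector indexes (graph-based, clustering-based, and quantization-based) work, their strengths and weaknesses, and the scenarios each one suits.

  • Some of the Milvus team's thinking on future index optimization.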


1.Indexing Methods & Optimization Opportunities 易小萌 (Yi Xiaomeng)

2.Background: Vector Search © 2020 Zilliz. All rights reserved.

3.Information Retrieval: from keywords to rich media

4.Embedding: represent rich media data as vectors

5.Vector Search: Indexing methods

6.Graph: HNSW/NSG/NGT. "Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs"
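
The slide names graph indexes without showing how a query walks the graph. As a rough illustration only, the sketch below runs a best-first search over a single-layer k-nearest-neighbor graph. It is not HNSW, NSG, or NGT themselves (HNSW adds a layer hierarchy and neighbor-selection heuristics); the names `build_knn_graph`, `greedy_search`, `m`, `ef`, and `entry` are assumptions made for this example.

```python
import heapq
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_knn_graph(vectors, m):
    """Brute-force graph: connect every vector to its m nearest neighbors."""
    graph = []
    for i, v in enumerate(vectors):
        others = sorted(
            (j for j in range(len(vectors)) if j != i),
            key=lambda j: sq_dist(v, vectors[j]),
        )
        graph.append(others[:m])
    return graph


def greedy_search(query, vectors, graph, entry, ef, k):
    """Best-first walk from `entry`, keeping the ef closest nodes seen so far."""
    visited = {entry}
    d0 = sq_dist(query, vectors[entry])
    frontier = [(d0, entry)]  # min-heap of nodes still to expand
    best = [(-d0, entry)]     # max-heap (negated distances) of current results
    while frontier:
        d, node = heapq.heappop(frontier)
        # Stop once the closest unexpanded node is worse than the worst result.
        if len(best) >= ef and d > -best[0][0]:
            break
        for nb in graph[node]:
            if nb in visited:
                continue
            visited.add(nb)
            dn = sq_dist(query, vectors[nb])
            if len(best) < ef or dn < -best[0][0]:
                heapq.heappush(frontier, (dn, nb))
                heapq.heappush(best, (-dn, nb))
                if len(best) > ef:
                    heapq.heappop(best)
    ranked = sorted((-neg_d, idx) for neg_d, idx in best)
    return [idx for _, idx in ranked][:k]


# Hypothetical usage on random data.
rng = random.Random(0)
data = [[rng.random() for _ in range(3)] for _ in range(100)]
graph = build_knn_graph(data, m=8)
neighbors = greedy_search([0.5, 0.5, 0.5], data, graph, entry=0, ef=8, k=5)
```

A larger `ef` explores more of the graph, which tends to raise recall at the cost of more distance computations.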

7.Space Partition: IVF/LSH/Tree. "The inverted multi-index"
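
To make the space-partition idea concrete, here is a minimal inverted-file (IVF) sketch under simple assumptions: a few rounds of plain k-means pick `nlist` centroids, every vector is filed under its nearest centroid, and a query scans only the `nprobe` nearest lists. The function names are hypothetical and this is not the Milvus implementation; `nlist` and `nprobe` are used in the sense of the later slides.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_centroid(v, centroids):
    """Index of the centroid closest to v."""
    return min(range(len(centroids)), key=lambda i: sq_dist(v, centroids[i]))


def kmeans(vectors, nlist, iters=10, seed=0):
    """A few rounds of Lloyd's algorithm, seeded from randomly sampled vectors."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, nlist)]
    for _ in range(iters):
        buckets = [[] for _ in range(nlist)]
        for v in vectors:
            buckets[nearest_centroid(v, centroids)].append(v)
        for i, bucket in enumerate(buckets):
            if bucket:  # keep the old centroid when a cluster is empty
                centroids[i] = [sum(col) / len(bucket) for col in zip(*bucket)]
    return centroids


def build_ivf(vectors, nlist):
    """Return (centroids, inverted lists of vector ids)."""
    centroids = kmeans(vectors, nlist)
    lists = [[] for _ in range(nlist)]
    for idx, v in enumerate(vectors):
        lists[nearest_centroid(v, centroids)].append(idx)
    return centroids, lists


def search_ivf(query, vectors, centroids, lists, nprobe, k):
    """Scan only the nprobe lists whose centroids are closest to the query."""
    order = sorted(range(len(centroids)), key=lambda i: sq_dist(query, centroids[i]))
    candidates = [idx for c in order[:nprobe] for idx in lists[c]]
    candidates.sort(key=lambda idx: sq_dist(query, vectors[idx]))
    return candidates[:k]
```

With `nprobe` equal to `nlist` the search degenerates to an exact scan; smaller values trade recall for speed.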

8.Encoding: PQ/SQ. "Product quantization for nearest neighbor search"
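
As a small illustration of the encoding idea, the sketch below shows one common form of scalar quantization (SQ): each dimension is mapped linearly onto an 8-bit code using per-dimension minimum and maximum values learned from the data. The helper names are assumptions for this example, and product quantization (PQ), which instead encodes sub-vectors against learned codebooks, is not shown.

```python
def sq_train(vectors):
    """Learn per-dimension value ranges from the data."""
    lo = [min(col) for col in zip(*vectors)]
    hi = [max(col) for col in zip(*vectors)]
    return lo, hi


def sq_encode(v, lo, hi):
    """Map each float to one byte (0..255) within its dimension's range."""
    codes = []
    for x, a, b in zip(v, lo, hi):
        span = (b - a) or 1.0                   # avoid dividing by zero on constant dimensions
        t = min(max((x - a) / span, 0.0), 1.0)  # clamp values outside the trained range
        codes.append(round(t * 255))
    return bytes(codes)


def sq_decode(code, lo, hi):
    """Approximate reconstruction of the original vector from its byte codes."""
    return [a + (c / 255) * (b - a) for c, a, b in zip(code, lo, hi)]
```

Each dimension costs one byte instead of a 4-byte float, and the reconstruction error per dimension is at most half a quantization step.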

9.Comparison: fast, accurate, and small, never reached at the same time… [Figure: diagram of three goals (Fast, Accurate, Small) with labels HNSW, L&C, FLAT, IVF_PQ, IVF_SQ, and ∅]

10.Observations

11.Larger nlist works with issues. "Revisiting the inverted indices for billion-scale approximate nearest neighbors"

12.Larger nlist works with issues [Figure: Recall (0.2 to 1) versus Nprobe (1 to 32), with curves labeled 1024, 2048, and 4096]

13.Larger nlist works with issues

14.Filter and Validation works [Figure: Search Time (0 to 10) versus Recall (0.7 to 1) for IVF_FLAT, IVF_SQ_FLAT, and IVF_PQ_FLAT]

15.Retrospect
  • Larger nlist works with issues
  • Filter and Validation works
[Figure: diagram of three goals (Fast, Accurate, Small) with labels HNSW, L&C, FLAT, IVF_PQ, IVF_SQ, and ∅]

16.Idea: a three-layer framework

17.Layers: function decomposition brings optimization opportunity

Layer Function        Data                Size    Candidates for a query  Requirement
Space Partition       Clusters            Small   Full                    Accurate, Fast
Candidate Filtering   Compressed vectors  Medium  Small portion           Fast
Result Validation     Original vectors    Large   Very small portion      Accurate

18.Layers: function decomposition brings optimization opportunity

Layer Function        Size    Requirement     Index Type (Adjustable)  Optimize Opportunity
Space Partition       Small   Accurate, Fast  HNSW                     Cache-based optimization
Candidate Filtering   Medium  Fast            SQ/PQ                    Data locality, inter/intra query parallelism
Result Validation     Large   Accurate        FLAT                     SSD-based storage, compute-read pipeline
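
The three layers on slides 17 and 18 can be read as a search pipeline: partition, then filter on compressed vectors, then validate on original vectors. The sketch below is one possible reading and not the Milvus design. It assumes a brute-force scan over sampled centroids in place of HNSW, a crude uniform quantizer in place of SQ/PQ, and in-memory lists in place of SSD storage. `ThreeLayerIndex`, `nprobe`, and `n_filter` are hypothetical names.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class ThreeLayerIndex:
    LEVELS = 15  # 16 quantization levels per dimension (illustrative choice)

    def __init__(self, vectors, nlist, seed=0):
        rng = random.Random(seed)
        self.vectors = vectors  # original vectors, used only in layer 3
        self.lo = min(min(v) for v in vectors)
        self.span = (max(max(v) for v in vectors) - self.lo) or 1.0
        # Layer 1 data: a small set of cluster centers (sampled, not trained).
        self.centroids = rng.sample(vectors, nlist)
        self.lists = [[] for _ in range(nlist)]
        for idx, v in enumerate(vectors):
            c = min(range(nlist), key=lambda i: sq_dist(v, self.centroids[i]))
            self.lists[c].append(idx)
        # Layer 2 data: compressed copies of every vector.
        self.codes = [self._encode(v) for v in vectors]

    def _encode(self, v):
        return [round((x - self.lo) / self.span * self.LEVELS) for x in v]

    def _decode(self, code):
        return [self.lo + c / self.LEVELS * self.span for c in code]

    def search(self, query, k, nprobe, n_filter):
        # Layer 1, space partition: choose the nprobe closest clusters.
        order = sorted(
            range(len(self.centroids)),
            key=lambda i: sq_dist(query, self.centroids[i]),
        )
        candidates = [idx for c in order[:nprobe] for idx in self.lists[c]]
        # Layer 2, candidate filtering: rank by distance to compressed vectors.
        shortlist = sorted(
            candidates,
            key=lambda idx: sq_dist(query, self._decode(self.codes[idx])),
        )[:n_filter]
        # Layer 3, result validation: re-rank the shortlist with original vectors.
        return sorted(shortlist, key=lambda idx: sq_dist(query, self.vectors[idx]))[:k]
```

Only the shortlist touches the original vectors, which matches the "very small portion" column of the table; `n_filter` controls how much the approximate filter is trusted.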

19.Q&A
