Deep Dive #3: Milvus Access Layer and Major Data Processing Flow
Milvus.io

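Deep Dive is a livestream series of code walkthroughs started by the Milvus community. It offers an open explanation of the overall architecture of the open-source database Milvus and shares Milvus's core design ideas with the community.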

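If you are interested in this session and want live Q&A with the speaker, add the assistant on WeChat (Zilliz-tech) with the note "直播" (livestream) to join the discussion group.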

Outline of this session:

  1. Review of the Milvus 2.0 system architecture
  2. Code organization
  3. Data processing and the read/write request flow
  4. Introduction to the Proxy code module
  5. Q&A

1.Milvus Deep Dive #3: Access Layer and Major Data Processing Flow. 2021.08. Cao Zhenshan, zhenshan.cao@zilliz.com

2.About me
• Zilliz Senior Software Engineer
• Education: Master, Huazhong University of Science and Technology
• Interests: databases, distributed systems, spatio-temporal data analysis and processing

3.Contents
01 Milvus Architecture Overview
02 Code Organization
03 Major Data Processing Flow
04 Access Layer Code

4.Milvus Architecture Overview

5.Log as Data
• State machine replication principle: the log is all you need to restore system state.
• The log is an append-only time sequence.
[Figure: records laid out on a timeline t0 ... t5, from the 1st record to the next record at "now"]
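
The state machine replication idea on this slide can be illustrated with a minimal sketch: state is never stored directly, it is rebuilt by replaying an append-only log up to a chosen timestamp. The `Record`, `Log`, and `Replay` names are hypothetical and only serve the illustration; they are not Milvus types.

```go
package main

import "fmt"

// Record is one entry in an append-only log (hypothetical shape for illustration).
type Record struct {
	Ts    uint64 // logical timestamp; records are appended in increasing Ts order
	Key   string
	Value string
	Del   bool // true marks a delete of Key
}

// Log is an append-only sequence of records.
type Log struct {
	records []Record
}

// Append adds a record at the end of the log; existing records are never modified.
func (l *Log) Append(r Record) {
	l.records = append(l.records, r)
}

// Replay rebuilds the state by applying every record with Ts <= upTo, in log order.
func (l *Log) Replay(upTo uint64) map[string]string {
	state := make(map[string]string)
	for _, r := range l.records {
		if r.Ts > upTo {
			break // later records belong to a newer state
		}
		if r.Del {
			delete(state, r.Key)
		} else {
			state[r.Key] = r.Value
		}
	}
	return state
}

func main() {
	var l Log
	l.Append(Record{Ts: 0, Key: "a", Value: "1"})
	l.Append(Record{Ts: 1, Key: "b", Value: "2"})
	l.Append(Record{Ts: 2, Key: "a", Del: true})

	fmt.Println(l.Replay(1)) // state as of t1
	fmt.Println(l.Replay(2)) // state as of t2
}
```

Because the log is the only source of truth in this sketch, any replica that replays the same prefix arrives at the same state.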

6.Log Sequence: Pub-sub as System Backbone
Distributed log on a pub-sub system:
• Disaggregate the log and the database, making failure recovery easy and fast
• Guarantee data durability
• Make the system extendable
• Reduce system complexity

7.Incremental + Historical
• Relying only on the log stream for reads is not practical (too slow).
• Periodically backfill historical data into segments and hand off growing segments to historical.
[Figure: log stream divided by time ticks into Window 1 and Window 2]

8.Micro Service Style
• Disaggregate storage and computation
• Scale independently
• Reduce downtime through fault isolation
• Code that is easier to understand and debug

9.Architecture

10.Code Organization

11.Languages
• Golang as the development language of the distributed layer: 120,000 lines of Golang
• C++ as the engine layer language: 80,000 lines of C++

12.Source Code Tree - Go Directories
. (project directory)
├── Makefile
├── cmd
├── configs
├── docs
├── go.mod
├── internal
├── go.sum
├── ruleguard.rules.go
├── scripts
├── tests
└── tools
/cmd: main applications for this project
/internal: private application and library code

13.Source Code Tree
. (internal)
├── allocator
├── core
├── datacoord
├── datanode
├── indexcoord
├── indexnode
├── kv
├── log
├── metrics
├── distributed
├── msgstream
├── proto
├── proxy
├── querycoord
├── querynode
├── rootcoord
├── storage
├── tso
├── types
└── util
. (distributed)
├── datacoord
├── datanode
├── indexcoord
├── indexnode
├── proxy
├── querycoord
├── querynode
└── rootcoord
The component packages under internal are used through Local Procedure Call; the packages of the same name under distributed expose them through Remote Procedure Call. Both share the same functionality.

14.Decouple Functionality and Communication
[Diagram: Root Coord, Query Coord, Data Coord, Index Coord; Proxy Node, Query Node, Data Node, Index Node; MsgStream (log pub-sub), TxnKV (meta), KV (data)]
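
One way to picture this decoupling is a component interface with two implementations: one that holds the functionality and is called in-process, and one that only forwards calls over a transport. The sketch below is an assumption-laden illustration; `RootCoord` here is a hypothetical one-method interface, and the `call` field stands in for an RPC layer rather than a real gRPC client.

```go
package main

import (
	"errors"
	"fmt"
)

// RootCoord is a hypothetical, one-method view of a coordinator.
// Callers depend only on this interface, not on how the call travels.
type RootCoord interface {
	HasCollection(name string) (bool, error)
}

// localRootCoord holds the actual functionality (local procedure call).
type localRootCoord struct {
	collections map[string]bool
}

func (c *localRootCoord) HasCollection(name string) (bool, error) {
	if name == "" {
		return false, errors.New("empty collection name")
	}
	return c.collections[name], nil
}

// remoteRootCoord only handles communication: it forwards the call through
// a transport function that stands in for an RPC client (assumption).
type remoteRootCoord struct {
	call func(name string) (bool, error)
}

func (c *remoteRootCoord) HasCollection(name string) (bool, error) {
	return c.call(name)
}

// describe works with either implementation because it sees only the interface.
func describe(rc RootCoord, name string) string {
	ok, err := rc.HasCollection(name)
	if err != nil {
		return "error: " + err.Error()
	}
	if ok {
		return name + " exists"
	}
	return name + " not found"
}

func main() {
	local := &localRootCoord{collections: map[string]bool{"books": true}}
	// In a real deployment the transport would cross a process boundary;
	// here it calls the local implementation directly.
	remote := &remoteRootCoord{call: local.HasCollection}

	fmt.Println(describe(local, "books"))
	fmt.Println(describe(remote, "books"))
}
```

With this split, the same functional code can run as a single process or as separate services, and only the thin forwarding layer changes.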

16.Source Code Tree - Packages and Directories
• allocator: global unique ID and local buffered ID
• core: vector/scalar search engine
• kv: KV interface and implementations
• tso: timestamp allocator
• msgstream: MsgStream interface and implementations
• metrics: monitoring logic
• storage: data management
• types: type definitions
• util, log, proto
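
The "global unique ID and local buffered ID" description of the allocator package suggests a batching pattern, sketched below under stated assumptions: a global allocator hands out disjoint ID ranges, and each client serves IDs from a locally buffered range, contacting the global allocator only when the buffer runs out. `GlobalAllocator` and `BufferedAllocator` are hypothetical names, not the Milvus implementation.

```go
package main

import (
	"fmt"
	"sync"
)

// GlobalAllocator hands out disjoint ID ranges. It stands in for a central
// service; Calls counts how often it is contacted (for illustration only).
type GlobalAllocator struct {
	mu    sync.Mutex
	next  int64
	Calls int
}

// AllocRange reserves count IDs and returns the half-open range [start, end).
func (g *GlobalAllocator) AllocRange(count int64) (start, end int64) {
	g.mu.Lock()
	defer g.mu.Unlock()
	start = g.next
	g.next += count
	g.Calls++
	return start, g.next
}

// BufferedAllocator serves IDs from a locally cached range and refills it
// from the global allocator in batches.
type BufferedAllocator struct {
	mu       sync.Mutex
	global   *GlobalAllocator
	batch    int64
	cur, end int64 // remaining local range is [cur, end)
}

func NewBufferedAllocator(g *GlobalAllocator, batch int64) *BufferedAllocator {
	return &BufferedAllocator{global: g, batch: batch}
}

// Next returns a globally unique ID, refilling the local buffer when empty.
func (b *BufferedAllocator) Next() int64 {
	b.mu.Lock()
	defer b.mu.Unlock()
	if b.cur == b.end {
		b.cur, b.end = b.global.AllocRange(b.batch)
	}
	id := b.cur
	b.cur++
	return id
}

func main() {
	g := &GlobalAllocator{}
	a := NewBufferedAllocator(g, 3)
	b := NewBufferedAllocator(g, 3)

	fmt.Println(a.Next(), a.Next(), a.Next()) // served from a's first batch
	fmt.Println(b.Next())                     // b gets its own, disjoint batch
	fmt.Println(a.Next())                     // a's buffer is empty, so it refills
	fmt.Println("global calls:", g.Calls)
}
```

The trade-off in this sketch is that IDs are unique but not dense or globally ordered: a client that stops early leaves the rest of its batch unused.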

17.Major Data Processing Flow

18.Data Model

19.MsgStream Interface

    type MsgStream interface {
        Start()
        Close()
        AsProducer(channels []string)
        AsConsumer(channels []string, subName string)
        SetRepackFunc(repackFunc RepackFunc)
        Produce(*MsgPack) error
        Broadcast(*MsgPack) error
        Consume() *MsgPack
        Chan() <-chan *MsgPack
        Seek(offset []*MsgPosition) error
    }
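
To make the producer and consumer roles of the interface concrete, here is a minimal in-memory sketch of a small subset of it (`AsProducer`, `AsConsumer`, `Broadcast`, `Consume`). Everything below is an assumption for illustration: `MsgPack` is a simplified stand-in, the `broker` type is hypothetical, `subName` is stored but not used, and subscriptions, repacking, and `Seek` are omitted. It is meant for single-goroutine use only.

```go
package main

import (
	"errors"
	"fmt"
)

// MsgPack is a simplified stand-in for the real message pack type.
type MsgPack struct {
	BeginTs, EndTs uint64
	Msgs           []string
}

// broker is a hypothetical in-memory pub-sub: one buffered Go channel per
// named channel.
type broker struct {
	topics map[string]chan *MsgPack
}

func newBroker() *broker {
	return &broker{topics: make(map[string]chan *MsgPack)}
}

// topic returns the Go channel for a name, creating it on first use.
func (b *broker) topic(name string) chan *MsgPack {
	ch, ok := b.topics[name]
	if !ok {
		ch = make(chan *MsgPack, 16)
		b.topics[name] = ch
	}
	return ch
}

// memStream implements a subset of the MsgStream methods on top of broker.
type memStream struct {
	b           *broker
	produceTo   []string
	consumeFrom []string
	subName     string // kept for shape only; not used in this sketch
}

func (s *memStream) AsProducer(channels []string) {
	s.produceTo = append(s.produceTo, channels...)
}

func (s *memStream) AsConsumer(channels []string, subName string) {
	s.consumeFrom = append(s.consumeFrom, channels...)
	s.subName = subName
}

// Broadcast sends the same pack to every channel this stream produces to.
func (s *memStream) Broadcast(p *MsgPack) error {
	if len(s.produceTo) == 0 {
		return errors.New("stream is not a producer")
	}
	for _, name := range s.produceTo {
		select {
		case s.b.topic(name) <- p:
		default:
			return fmt.Errorf("channel %q is full", name)
		}
	}
	return nil
}

// Consume returns the next available pack, or nil if none is pending.
func (s *memStream) Consume() *MsgPack {
	for _, name := range s.consumeFrom {
		select {
		case p := <-s.b.topic(name):
			return p
		default:
		}
	}
	return nil
}

func main() {
	b := newBroker()
	producer := &memStream{b: b}
	producer.AsProducer([]string{"dml-0", "dml-1"})
	consumer := &memStream{b: b}
	consumer.AsConsumer([]string{"dml-1"}, "example-sub")

	if err := producer.Broadcast(&MsgPack{BeginTs: 1, EndTs: 2, Msgs: []string{"insert"}}); err != nil {
		fmt.Println("broadcast failed:", err)
		return
	}
	fmt.Println(consumer.Consume().Msgs)
	fmt.Println(consumer.Consume() == nil)
}
```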

20.Write Path
• DmChannels
• Save binlog files
• Collections may share physical channels
• Notify DataCoord

21.Write Path - Flowgraph Flowgraph to filter collection data
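
Since collections may share a physical channel, a consumer has to drop messages that belong to other collections. The sketch below shows that filtering step as one node in a small pipeline of nodes. The `Node` interface, `filterNode`, `bufferNode`, and `InsertMsg` are hypothetical simplifications and do not mirror the actual flowgraph types.

```go
package main

import "fmt"

// InsertMsg is a simplified insert message (hypothetical shape).
type InsertMsg struct {
	CollectionID int64
	RowIDs       []int64
}

// Node is one stage of the flowgraph: it takes a batch and passes one on.
type Node interface {
	Operate(in []InsertMsg) []InsertMsg
}

// filterNode keeps only messages that belong to one collection.
type filterNode struct {
	collectionID int64
}

func (f filterNode) Operate(in []InsertMsg) []InsertMsg {
	out := make([]InsertMsg, 0, len(in))
	for _, m := range in {
		if m.CollectionID == f.collectionID {
			out = append(out, m)
		}
	}
	return out
}

// bufferNode accumulates row IDs, standing in for an insert buffer.
type bufferNode struct {
	rows []int64
}

func (b *bufferNode) Operate(in []InsertMsg) []InsertMsg {
	for _, m := range in {
		b.rows = append(b.rows, m.RowIDs...)
	}
	return in
}

// run pushes one batch through the nodes in order.
func run(nodes []Node, in []InsertMsg) []InsertMsg {
	for _, n := range nodes {
		in = n.Operate(in)
	}
	return in
}

func main() {
	// Two collections share one physical channel in this example.
	batch := []InsertMsg{
		{CollectionID: 1, RowIDs: []int64{10, 11}},
		{CollectionID: 2, RowIDs: []int64{20}},
		{CollectionID: 1, RowIDs: []int64{30}},
	}
	buf := &bufferNode{}
	run([]Node{filterNode{collectionID: 1}, buf}, batch)
	fmt.Println(buf.rows) // only rows of collection 1 reach the buffer
}
```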

22.Write Path – MsgStream Creation When to create MsgStream

23.Read Path DqRequestChannels DqResultChannel

24.Read Path - Flowgraph
Same as the write path

25.Read Path – MsgStream Creation Triggered by load operation

26.Read Path Merge to maintain data completeness
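
The merge step can be sketched as combining partial top-k result lists into one final list. The example below rests on assumptions not stated on the slide: a smaller distance is treated as a better match, the same ID may appear in more than one partial result, and `Hit` and `mergeTopK` are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
)

// Hit is one search result (hypothetical shape).
type Hit struct {
	ID       int64
	Distance float32
}

// mergeTopK merges partial result lists, keeps the best distance per ID,
// and returns the k hits with the smallest distance.
func mergeTopK(k int, partials ...[]Hit) []Hit {
	if k < 0 {
		k = 0
	}
	best := make(map[int64]Hit)
	for _, p := range partials {
		for _, h := range p {
			if cur, ok := best[h.ID]; !ok || h.Distance < cur.Distance {
				best[h.ID] = h
			}
		}
	}
	merged := make([]Hit, 0, len(best))
	for _, h := range best {
		merged = append(merged, h)
	}
	// Sort by distance, then by ID so that ties are deterministic.
	sort.Slice(merged, func(i, j int) bool {
		if merged[i].Distance != merged[j].Distance {
			return merged[i].Distance < merged[j].Distance
		}
		return merged[i].ID < merged[j].ID
	})
	if len(merged) > k {
		merged = merged[:k]
	}
	return merged
}

func main() {
	historical := []Hit{{ID: 1, Distance: 0.1}, {ID: 2, Distance: 0.4}}
	streaming := []Hit{{ID: 3, Distance: 0.2}, {ID: 2, Distance: 0.3}}
	for _, h := range mergeTopK(2, historical, streaming) {
		fmt.Println(h.ID, h.Distance)
	}
}
```

Querying only one of the two inputs would miss either sealed or freshly inserted rows, which is why both partial results feed the merge in this sketch.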

27.DDL Flow
DDL: Data Definition Language
Ordered, serial execution by timestamp

28.Index Building - IndexState

29.Index Building - IndexCoord
