Deep Dive #3: Milvus Access Layer and Major Data Processing Flow
Milvus.io

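Deep Dive is a series of live code walkthroughs launched by the Milvus community. It gives an open reading of the overall architecture of the open-source database Milvus and shares its core design ideas with the community.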

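If you are interested in this session and would like a live Q&A with the speaker, add the assistant on WeChat (Zilliz-tech) with the note "直播" (live stream) to join the discussion group.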

Outline of this session:

  1. Milvus 2.0 system architecture review
  2. Code organization
  3. Data processing and read/write request flow
  4. Proxy code modules
  5. Q&A

1 .Milvus Deep Dive #3: Access Layer and Major Data Processing Flow (2021.08)
Cao Zhenshan, zhenshan.cao@zilliz.com

2 .About me
• Zilliz Senior Software Engineer
• Education: Master, Huazhong University of Science and Technology
• Interests: databases, distributed systems, spatio-temporal data analysis and processing

3 .Contents
• 01 Milvus Architecture Overview
• 02 Code Organization
• 03 Major Data Processing Flow
• 04 Access Layer Code

4 .Milvus Architecture Overview

5 .Log as Data
• State machine replication principle: the log is all you need to restore system state.
• The log is an append-only time sequence.
[Figure: a log timeline marked t0 to t5 and "now", with labels "1st Record" and "Next Record".]
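The principle on this slide can be illustrated with a minimal sketch: state is never stored directly, it is rebuilt by replaying an append-only log up to a chosen timestamp. The `Record`, `Log`, and `Replay` names are hypothetical and do not mirror Milvus's actual log format.

```go
package main

import "fmt"

// Record is one log entry. Field names are illustrative assumptions,
// not the format Milvus uses.
type Record struct {
	TS    uint64 // logical timestamp (t0, t1, ...)
	Key   string
	Value string
	Del   bool // true if the record deletes Key
}

// Log is an append-only sequence of records ordered by timestamp.
type Log struct {
	records []Record
}

// Append adds a record at the tail; earlier records are never modified.
func (l *Log) Append(r Record) {
	l.records = append(l.records, r)
}

// Replay rebuilds the state by applying, in order, every record whose
// timestamp is <= upTo. It assumes records were appended in timestamp order.
func (l *Log) Replay(upTo uint64) map[string]string {
	state := make(map[string]string)
	for _, r := range l.records {
		if r.TS > upTo {
			break
		}
		if r.Del {
			delete(state, r.Key)
		} else {
			state[r.Key] = r.Value
		}
	}
	return state
}

func main() {
	var l Log
	l.Append(Record{TS: 0, Key: "a", Value: "1"})
	l.Append(Record{TS: 1, Key: "b", Value: "2"})
	l.Append(Record{TS: 2, Key: "a", Del: true})

	fmt.Println(l.Replay(1)) // state as of t1
	fmt.Println(l.Replay(2)) // state as of t2
}
```

Because the log alone determines the state, any replica that replays the same prefix arrives at the same result.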

6 .Log Sequence: Pub-sub as System Backbone
Distributed log on a pub-sub system:
• Disaggregates log and database, which makes failure recovery easy and fast
• Guarantees data durability
• Makes the system extensible
• Reduces system complexity

7 .Incremental + Historical
• Relying only on the log stream for reads is not practical (too slow).
• Periodically backfill historical data into segments and hand off growing segments to historical.
[Figure: time ticks dividing the numbered log records into Window 1 and Window 2.]
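A rough sketch of the incremental plus historical split, assuming a single growing segment that is sealed and handed off at some trigger such as the end of a time-tick window. `Segment`, `Store`, and `Handoff` are hypothetical names chosen for illustration.

```go
package main

import "fmt"

// Segment holds row IDs. Sealed marks it as historical (immutable).
type Segment struct {
	Rows   []int64
	Sealed bool
}

// Store keeps one growing segment for incremental data and the sealed
// segments that have been handed off to historical.
type Store struct {
	growing    *Segment
	historical []*Segment
}

func NewStore() *Store {
	return &Store{growing: &Segment{}}
}

// Insert appends a row from the log stream to the growing segment.
func (s *Store) Insert(id int64) {
	s.growing.Rows = append(s.growing.Rows, id)
}

// Handoff seals the growing segment, moves it to historical, and starts
// a fresh growing segment. Empty segments are not handed off.
func (s *Store) Handoff() {
	if len(s.growing.Rows) == 0 {
		return
	}
	s.growing.Sealed = true
	s.historical = append(s.historical, s.growing)
	s.growing = &Segment{}
}

// Scan reads the historical segments first and then the growing one,
// so a read sees both backfilled and recent data.
func (s *Store) Scan() []int64 {
	var out []int64
	for _, seg := range s.historical {
		out = append(out, seg.Rows...)
	}
	return append(out, s.growing.Rows...)
}

func main() {
	s := NewStore()
	s.Insert(1)
	s.Insert(2)
	s.Handoff() // e.g. at the end of a time-tick window
	s.Insert(3)
	fmt.Println(len(s.historical), s.Scan())
}
```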

8 .Micro Service Style
• Disaggregate storage and computation
• Scale independently
• Reduce downtime through fault isolation
• Code is easier to understand and debug

9 .Architecture

10 .Code Organization

11 .Languages
• Golang as the development language of the distributed layer: 120,000 lines
• C++ as the language of the engine layer: 80,000 lines

12 .Source Code Tree - Go Directories
Project directory:
.
├── Makefile
├── cmd
├── configs
├── docs
├── go.mod
├── internal
├── go.sum
├── ruleguard.rules.go
├── scripts
├── tests
└── tools
• /cmd: main applications for this project
• /internal: private application and library code

13 .Source Code Tree
. (internal)
├── allocator
├── core
├── datacoord
├── datanode
├── indexcoord
├── indexnode
├── kv
├── log
├── metrics
├── distributed
├── msgstream
├── proto
├── proxy
├── querycoord
├── querynode
├── rootcoord
├── storage
├── tso
├── types
└── util
. (distributed)
├── datacoord
├── datanode
├── indexcoord
├── indexnode
├── proxy
├── querycoord
├── querynode
└── rootcoord
• Slide annotations: "Local Procedure Call", "Remote Procedure Call", "Share the same functionality". The component packages appear both directly under internal and under distributed.

14 .Decouple Functionality and Communication
[Diagram labels: Proxy; Root Coord, Query Coord, Data Coord, Index Coord; Query Node, Data Node, Index Node; MsgStream, TxnKV, KV; Meta, Log, Data, Pub-sub.]

16 .Source Code Tree - Packages and Directories
• allocator: global unique ID and local buffered ID
• core: vector/scalar search engine
• kv: KV interface and implementations
• tso: timestamp allocator
• msgstream: MsgStream interface and implementations
• metrics: monitoring logic
• storage: data management
• types: type definitions
• util, log, proto
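The "global unique ID and local buffered ID" idea behind the allocator package can be sketched as follows: a central allocator hands out ID ranges, and each local allocator serves IDs from its buffered range so that most allocations need no remote call. `GlobalAllocator`, `LocalAllocator`, and the batch size are assumptions for illustration, not the package's real API.

```go
package main

import (
	"fmt"
	"sync"
)

// GlobalAllocator stands in for a central ID service (hypothetical).
type GlobalAllocator struct {
	mu   sync.Mutex
	next int64
}

// AllocRange reserves count consecutive IDs and returns [begin, end).
func (g *GlobalAllocator) AllocRange(count int64) (begin, end int64) {
	g.mu.Lock()
	defer g.mu.Unlock()
	begin = g.next
	g.next += count
	return begin, g.next
}

// LocalAllocator buffers one range and only contacts the global
// allocator when the buffer runs out. It is not safe for concurrent use.
type LocalAllocator struct {
	global   *GlobalAllocator
	batch    int64 // IDs fetched per refill; must be > 0
	cur, end int64
}

// AllocOne returns the next unique ID, refilling the buffer if needed.
func (l *LocalAllocator) AllocOne() int64 {
	if l.cur == l.end {
		l.cur, l.end = l.global.AllocRange(l.batch)
	}
	id := l.cur
	l.cur++
	return id
}

func main() {
	g := &GlobalAllocator{}
	a := &LocalAllocator{global: g, batch: 3}
	b := &LocalAllocator{global: g, batch: 3}
	fmt.Println(a.AllocOne(), a.AllocOne(), b.AllocOne())
}
```

The trade-off is that IDs stay globally unique but are no longer strictly ordered across nodes, and unused IDs in a buffer are lost on restart.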

17 .Major Data Processing Flow

18 .Data Model

19 .MsgStream Interface

    type MsgStream interface {
        Start()
        Close()

        AsProducer(channels []string)
        AsConsumer(channels []string, subName string)
        SetRepackFunc(repackFunc RepackFunc)

        Produce(*MsgPack) error
        Broadcast(*MsgPack) error
        Consume() *MsgPack
        Chan() <-chan *MsgPack
        Seek(offset []*MsgPosition) error
    }
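To show how a producer and a consumer might use such an interface, here is a toy in-memory stream that mimics only four of the methods above (`AsProducer`, `AsConsumer`, `Broadcast`, `Consume`). The `MsgPack` fields, the `broker` type, and the non-blocking `Consume` behavior are all assumptions; the real Milvus implementation is backed by a pub-sub system and is not reproduced here.

```go
package main

import "fmt"

// MsgPack is a simplified stand-in; the real struct's fields are not
// shown on the slide.
type MsgPack struct {
	Msgs []string
}

// broker is a hypothetical in-memory pub-sub: one buffered Go channel
// per named channel.
type broker struct {
	topics map[string]chan *MsgPack
}

func newBroker(names ...string) *broker {
	b := &broker{topics: make(map[string]chan *MsgPack)}
	for _, n := range names {
		b.topics[n] = make(chan *MsgPack, 16)
	}
	return b
}

// memStream implements a small subset of the MsgStream method set.
type memStream struct {
	b           *broker
	produceTo   []string
	consumeFrom []string
}

func (s *memStream) AsProducer(channels []string) { s.produceTo = channels }

// AsConsumer ignores subName in this sketch; a real pub-sub system
// would use it to name the subscription.
func (s *memStream) AsConsumer(channels []string, subName string) {
	s.consumeFrom = channels
}

// Broadcast sends the same pack to every producer channel.
func (s *memStream) Broadcast(p *MsgPack) error {
	for _, name := range s.produceTo {
		ch, ok := s.b.topics[name]
		if !ok {
			return fmt.Errorf("unknown channel %q", name)
		}
		ch <- p
	}
	return nil
}

// Consume returns the next pending pack from the consumer channels,
// or nil if none is pending (an assumption made to keep the sketch
// non-blocking).
func (s *memStream) Consume() *MsgPack {
	for _, name := range s.consumeFrom {
		select {
		case p := <-s.b.topics[name]:
			return p
		default:
		}
	}
	return nil
}

func main() {
	b := newBroker("ch1", "ch2")
	producer := &memStream{b: b}
	producer.AsProducer([]string{"ch1", "ch2"})
	consumer := &memStream{b: b}
	consumer.AsConsumer([]string{"ch2"}, "sub-a")

	if err := producer.Broadcast(&MsgPack{Msgs: []string{"insert"}}); err != nil {
		fmt.Println("broadcast failed:", err)
		return
	}
	fmt.Println(consumer.Consume().Msgs)
}
```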

20 .Write Path
[Diagram labels: DmChannels, Save Binlog Files, Notify DataCoord.]
• Collections may share physical channels.
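One way to picture "collections may share physical channels" is a fixed pool of physical channels from which each collection's shards are assigned. The round-robin policy, the `dml_0`-style names, and hashing the primary key to choose a shard are assumptions made for this sketch, not a description of the actual assignment logic.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// ChannelPool is a hypothetical fixed pool of physical channels.
// It assumes the pool is non-empty.
type ChannelPool struct {
	physical []string
	next     int
}

// Assign returns one physical channel per shard of a new collection,
// chosen round-robin, so different collections can land on the same
// physical channel.
func (p *ChannelPool) Assign(numShards int) []string {
	out := make([]string, numShards)
	for i := range out {
		out[i] = p.physical[p.next%len(p.physical)]
		p.next++
	}
	return out
}

// ShardOf hashes a primary key to a shard index in [0, numShards).
func ShardOf(pk string, numShards int) int {
	h := fnv.New32a()
	h.Write([]byte(pk))
	return int(h.Sum32() % uint32(numShards))
}

func main() {
	pool := &ChannelPool{physical: []string{"dml_0", "dml_1"}}
	collA := pool.Assign(2)
	collB := pool.Assign(2)
	fmt.Println(collA, collB) // both collections use the same two channels

	// Route one row of collection A to its channel.
	fmt.Println(collA[ShardOf("pk-42", len(collA))])
}
```

Since channels are shared, every message has to carry its collection identity so that consumers can pick out the data they are responsible for.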

21 .Write Path - Flowgraph
• Flowgraph to filter collection data
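A flowgraph can be thought of as a chain of nodes that each transform a batch of messages. The sketch below, with hypothetical `InsertMsg`, `Node`, and `FilterByCollection` names, shows a filter node that keeps only the messages of one collection before later nodes process them.

```go
package main

import "fmt"

// InsertMsg is a simplified insert message (fields are assumptions).
type InsertMsg struct {
	CollectionID int64
	RowIDs       []int64
}

// Node is one stage of the flowgraph: it takes a batch and returns
// the batch for the next stage.
type Node func(in []InsertMsg) []InsertMsg

// FilterByCollection returns a node that drops messages belonging to
// any other collection.
func FilterByCollection(id int64) Node {
	return func(in []InsertMsg) []InsertMsg {
		var out []InsertMsg
		for _, m := range in {
			if m.CollectionID == id {
				out = append(out, m)
			}
		}
		return out
	}
}

// CountRows returns a node that adds up the rows it sees and passes
// the batch through unchanged.
func CountRows(total *int) Node {
	return func(in []InsertMsg) []InsertMsg {
		for _, m := range in {
			*total += len(m.RowIDs)
		}
		return in
	}
}

// Run passes the batch through each node in order.
func Run(batch []InsertMsg, nodes ...Node) []InsertMsg {
	for _, n := range nodes {
		batch = n(batch)
	}
	return batch
}

func main() {
	batch := []InsertMsg{
		{CollectionID: 1, RowIDs: []int64{10, 11}},
		{CollectionID: 2, RowIDs: []int64{20}},
		{CollectionID: 1, RowIDs: []int64{12}},
	}
	rows := 0
	out := Run(batch, FilterByCollection(1), CountRows(&rows))
	fmt.Println(len(out), rows)
}
```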

22 .Write Path – MsgStream Creation
• When to create MsgStream

23 .Read Path
[Diagram labels: DqRequestChannels, DqResultChannel.]

24 .Read Path - Flowgraph
• Same as the write path

25 .Read Path – MsgStream Creation
• Triggered by the load operation

26 .Read Path
• Merge to maintain data completeness
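The merge step can be sketched as combining partial top-k results from several sources into one global top-k. This example assumes that a smaller distance means a closer match and that a duplicate ID keeps its smallest distance. `Hit` and `MergeTopK` are hypothetical names, and the real merge logic may differ.

```go
package main

import (
	"fmt"
	"sort"
)

// Hit is one search result (fields are assumptions).
type Hit struct {
	ID       int64
	Distance float32
}

// MergeTopK combines partial results and returns the k nearest hits.
// Duplicate IDs are collapsed to the smallest distance; ties on
// distance are broken by ID so the output is deterministic.
// It assumes k >= 0.
func MergeTopK(partials [][]Hit, k int) []Hit {
	best := make(map[int64]Hit)
	for _, part := range partials {
		for _, h := range part {
			if cur, ok := best[h.ID]; !ok || h.Distance < cur.Distance {
				best[h.ID] = h
			}
		}
	}
	merged := make([]Hit, 0, len(best))
	for _, h := range best {
		merged = append(merged, h)
	}
	sort.Slice(merged, func(i, j int) bool {
		if merged[i].Distance != merged[j].Distance {
			return merged[i].Distance < merged[j].Distance
		}
		return merged[i].ID < merged[j].ID
	})
	if len(merged) > k {
		merged = merged[:k]
	}
	return merged
}

func main() {
	fromNodeA := []Hit{{ID: 1, Distance: 0.1}, {ID: 2, Distance: 0.5}}
	fromNodeB := []Hit{{ID: 3, Distance: 0.2}, {ID: 2, Distance: 0.4}}
	fmt.Println(MergeTopK([][]Hit{fromNodeA, fromNodeB}, 2))
}
```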

27 .DDL Flow
• DDL: Data Definition Language
• Ordered, serial execution by timestamp
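"Ordered serial execution by timestamp" can be sketched as sorting pending DDL tasks by timestamp and applying them one at a time, stopping at the first failure. `DDLTask` and `ExecuteInOrder` are hypothetical names, and the task names in `main` are made up for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// DDLTask is a pending schema operation tagged with its timestamp.
type DDLTask struct {
	TS   uint64
	Name string
}

// ExecuteInOrder applies tasks strictly one after another in ascending
// timestamp order and stops at the first error. The input slice is not
// modified.
func ExecuteInOrder(tasks []DDLTask, apply func(DDLTask) error) error {
	ordered := append([]DDLTask(nil), tasks...)
	sort.SliceStable(ordered, func(i, j int) bool {
		return ordered[i].TS < ordered[j].TS
	})
	for _, t := range ordered {
		if err := apply(t); err != nil {
			return fmt.Errorf("ddl %q at ts %d: %w", t.Name, t.TS, err)
		}
	}
	return nil
}

func main() {
	tasks := []DDLTask{
		{TS: 3, Name: "drop_partition"},
		{TS: 1, Name: "create_collection"},
		{TS: 2, Name: "create_partition"},
	}
	err := ExecuteInOrder(tasks, func(t DDLTask) error {
		fmt.Println(t.TS, t.Name)
		return nil
	})
	if err != nil {
		fmt.Println("failed:", err)
	}
}
```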

28 .Index Building - IndexState

29 .Index Building - IndexCoord
