Nebula: A graph DB based on HBase

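Heng Chen (陈恒) introduces Nebula, a graph database implemented on top of HBase.
He begins by noting that graph databases are currently very popular and are used mainly for social networks, knowledge graphs, and similar workloads. He then describes the challenges graph databases face, including the read and write amplification caused by traditional database reads and writes, and online queries over massive data sets. Finally, he presents some of Nebula's features, including the separation of storage and computation, a SQL-like query language (which does not support nested queries), and pluggable storage engines similar to MySQL's.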


1.Nebula: A graph DB based on HBase
Heng Chen, VESoft engineer & HBase Committer

2.Graph databases are the fastest-growing category of NoSQL database

3.Common Graphs
•Social Network
•Business Relation Graph
•Knowledge Graph
•IoT

4.Use case
•Diagram labels: WL, WC
•Ranking = ( W0(f) × W3(f) ), with the index set ∈{, - } as extracted from the slide
•Order by ranking, top N

5.Graph Database: Challenges
•Low latency (milliseconds)
•High throughput (> 100K QPS)
•Fast-growing data sets (> 10Bn nodes, > 100Bn edges, > 10TB)
•Increasing complexity of the business logic
•Stricter requirements for data consistency (ACID)

6.Features
•Separation of the storage and the computation layers
•SQL-like nGQL, no embedding (nested queries are not supported)
•Storage plugins (in-house, HBase)
•Shared-nothing, scalable distributed system
•Move computation, not data
•Multiple spaces
•Index
•Built-in graph algorithms

7.Architecture
•Computation Layer: multiple Query Engines
•Meta Service
•Storage Layer: multiple storage services, each on top of a KV Store
•Each KV Store holds Partition 1, Partition 2, Partition 3, … on a Store Engine
•Copies of the same partition on different storage services are connected via raft

8.Architecture Based on HBase
•Computation Layer: multiple Query Engines
•Meta Service
•Storage Layer: multiple storage services, each on top of a KV Store
•The KV Stores are backed by HBase

9.Query Service
•Components: Client/Console, Parser, Execution Planner, Optimizer, Execution Engine, Storage Client, Meta Client
•Artifacts passed between components: Query, AST, Plan, Plan from cache
•Storage Client: edge/vertex
•Meta Client: schema/acl

10.Schema
•Tag
  •One vertex can have multiple tags.
  •VertexID + TagID represents a real vertex
•Edge Type
  •Each edge is of one edge type
  •A tuple [SrcID, DstID, EdgeType, Ranking] represents an edge instance
•Properties (Int, Float, Double, Timestamp, etc.)
•TTL
•Multiple versions
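
The schema rules above can be sketched as a small in-memory model. This is only an illustration under stated assumptions: the class and method names (`EdgeKey`, `Graph`, `set_tag`, `tags_of`, `add_edge`) are hypothetical and are not Nebula's API. It shows that a vertex is addressed by (VertexID, TagID), so one vertex can carry several tags, and that an edge instance is identified by the tuple [SrcID, DstID, EdgeType, Ranking].

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class EdgeKey:
    """Identity of one edge instance: [SrcID, DstID, EdgeType, Ranking]."""
    src_id: int
    dst_id: int
    edge_type: int
    ranking: int


@dataclass
class Graph:
    """Toy model of the schema slide; all names here are hypothetical."""
    # (VertexID, TagID) -> properties. One VertexID may appear under many tags.
    vertices: Dict[Tuple[int, int], dict] = field(default_factory=dict)
    # EdgeKey -> properties.
    edges: Dict[EdgeKey, dict] = field(default_factory=dict)

    def set_tag(self, vid: int, tag_id: int, props: dict) -> None:
        self.vertices[(vid, tag_id)] = dict(props)

    def tags_of(self, vid: int) -> List[int]:
        return sorted(t for (v, t) in self.vertices if v == vid)

    def add_edge(self, src: int, dst: int, edge_type: int,
                 ranking: int, props: dict) -> None:
        # Same (src, dst, type) with a different ranking is a different edge.
        self.edges[EdgeKey(src, dst, edge_type, ranking)] = dict(props)


g = Graph()
g.set_tag(1, 10, {"name": "alice"})   # vertex 1 under tag 10
g.set_tag(1, 11, {"level": 3})        # the same vertex under tag 11
g.add_edge(1, 2, 7, 0, {"since": 2018})
g.add_edge(1, 2, 7, 1, {"since": 2019})
print(g.tags_of(1), len(g.edges))
```

Including Ranking in the edge identity is what allows several edges of the same type between the same pair of vertices.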

11.Key design
•Vertex
  •Key: (PartID) + VID + TagID
•Edge
  •Key: (PartID) + SrcID(dstID) + (+-)EdgeType + Ranking + DstID(srcID)
•Properties:
  •One column per property (HBase)
  •KV pairs encoded in one column (in-house)
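
A minimal sketch of how such keys could be laid out as bytes. The field widths (4-byte PartID, TagID and EdgeType; 8-byte VID and Ranking) and the function names are assumptions made for illustration; the slides do not give the actual encoding. The sketch also reads "SrcID(dstID) + (+-)EdgeType" as storing each edge in two directions, with a positive EdgeType for the outgoing copy and a negative one for the incoming copy, which is an interpretation of the slide rather than a confirmed detail.

```python
import struct

# Assumed layout, big-endian, no padding:
#   vertex key: PartID (u32) | VID (i64) | TagID (u32)
#   edge key:   PartID (u32) | VID (i64) | signed EdgeType (i32)
#               | Ranking (i64) | other VID (i64)


def vertex_key(part_id: int, vid: int, tag_id: int) -> bytes:
    """(PartID) + VID + TagID"""
    return struct.pack(">IqI", part_id, vid, tag_id)


def edge_key(part_id: int, vid: int, edge_type: int, ranking: int,
             other_vid: int, outgoing: bool = True) -> bytes:
    """(PartID) + SrcID(dstID) + (+-)EdgeType + Ranking + DstID(srcID)"""
    signed_type = edge_type if outgoing else -edge_type
    return struct.pack(">Iqiqq", part_id, vid, signed_type, ranking, other_vid)


def edge_prefix(part_id: int, vid: int, edge_type: int,
                outgoing: bool = True) -> bytes:
    """Prefix shared by all edges of one type and direction at one vertex."""
    signed_type = edge_type if outgoing else -edge_type
    return struct.pack(">Iqi", part_id, vid, signed_type)


# Edge 100 -> 200 of type 7, stored once per direction (assumed).
out_key = edge_key(1, 100, 7, 0, 200, outgoing=True)
in_key = edge_key(2, 200, 7, 0, 100, outgoing=False)

store = sorted([out_key, in_key, vertex_key(1, 100, 3)])
prefix = edge_prefix(1, 100, 7)
out_edges = [k for k in store if k.startswith(prefix)]
print(len(out_edges))
```

Because the vertex ID and the signed edge type lead the key, all edges of one type leaving (or entering) a vertex share a common prefix, so they can be fetched with a single prefix scan in a sorted KV store such as HBase.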

12.Query Language nGQL
•SQL-like
  •GO FROM $id OVER $edge WHERE $condition YIELD column1, column2, …
•Composable, but not embeddable
  •GO … | GO … | GO …
  •$var = GO … ; GO FROM $var …
•Pattern matching
  •FIND vertex id(edge) WHERE $condition
•Extensibility (UDF)
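
The two composition forms above (piping and variables) can be illustrated with a toy traversal. This is a sketch of the semantics only, not nGQL itself or Nebula's implementation: the `go` function, the `EDGES` data, and the `follow` edge type with its `degree` property are all hypothetical. The idea shown is that the result of one GO becomes the FROM set of the next, either directly (pipe) or through a named variable.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Edge = Tuple[int, int, dict]  # (src, dst, properties)

# Hypothetical sample data: one edge type with a single property.
EDGES: Dict[str, List[Edge]] = {
    "follow": [
        (1, 2, {"degree": 90}),
        (1, 3, {"degree": 40}),
        (2, 4, {"degree": 80}),
        (3, 5, {"degree": 70}),
    ],
}


def go(from_ids: Iterable[int], over: str,
       where: Optional[Callable[[dict], bool]] = None) -> List[int]:
    """Toy version of: GO FROM $id OVER $edge WHERE $condition YIELD dst."""
    sources = set(from_ids)
    result: List[int] = []
    for src, dst, props in EDGES[over]:
        if src not in sources:
            continue
        if where is not None and not where(props):
            continue
        if dst not in result:
            result.append(dst)
    return result


# "$var = GO ... ; GO FROM $var ..." : store a result, then reuse it.
var = go([1], "follow")
print(go(var, "follow"))  # [4, 5]

# "GO ... | GO ..." : feed one step directly into the next.
strong = lambda props: props["degree"] > 50
print(go(go([1], "follow", where=strong), "follow"))  # [4]
```

Both forms chain whole statements end to end, which matches "composable"; neither places one query inside another query's clause, which is what "not embeddable" rules out.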

13.Thanks
Homepage: http://nebula-graph.io/
Github: https://www.github.com/vesoft-inc/nebula
Email: info@vesoft.com