Xu Jianhui: The Evolution of Database Architecture and Applications Under Massive Data Demands


dbaplus community · Published 5 years ago · 2802 views · #Information Technology

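Xu Jianhui explains in depth how SequoiaDB, a native distributed database, achieves compatibility with traditional relational databases such as MySQL. He analyzes the problems encountered along the way and shares the practical measures used to solve them, giving the audience a deeper understanding of the new generation of NewSQL databases.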


1 .Distributed Database Design in Practice for Massive Data, Xu Jianhui

2 .Agenda • Introduction • Dive in distributed database technologies • Comparison of different technologies • Introduction to SequoiaDB

3 .Introduction • SequoiaDB is a distributed relational database • SequoiaDB provides a multi-model database engine with relational storage and object storage • SequoiaDB is deployed at more than 100 customers in the financial industry

4 .Agenda • Introduction • Dive in distributed database technologies • Comparison of different technologies • Introduction to SequoiaDB

5 .Dive in technologies CAP Theorem • High Consistency • High Availability • Partition Tolerance • Only 2 of the 3 can be satisfied at the same time

6 .Dive in technologies • Distribution – Partitioning/Fragmentation/Sharding: horizontally assign each record to one of n partitions – Vertical breakdown of the schema – Transparency of fragmentation, location, replication, local mapping, and naming – Composite partition (multi-dimension) • Components – Query parsing, access plan creation and rule lookup – Rule-based distribution, connection handler – Result aggregation • Challenges: change
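The horizontal partitioning and result aggregation steps on this slide can be sketched as follows. This is a minimal illustration, not any product's implementation: the partition count, the `id` sharding key, and all function names are assumptions made for the example.

```python
import zlib
from collections import defaultdict

N_PARTITIONS = 4  # assumed partition count for this sketch


def partition_for(key: str, n: int = N_PARTITIONS) -> int:
    """Map a sharding key to one of n partitions using a stable hash."""
    return zlib.crc32(key.encode("utf-8")) % n


def insert(partitions, record, sharding_key="id"):
    """Route one record to exactly one partition (rule-based distribution)."""
    partitions[partition_for(str(record[sharding_key]))].append(record)


def scatter_gather_sum(partitions, field):
    """Compute a partial SUM on every partition, then combine the partials."""
    partials = [sum(row[field] for row in rows) for rows in partitions.values()]
    return sum(partials)


partitions = defaultdict(list)
for i in range(100):
    insert(partitions, {"id": i, "amount": i})

total = scatter_gather_sum(partitions, "amount")
```

Because every record lands in exactly one partition, the combined partial sums equal the sum over the whole data set. Changing `N_PARTITIONS` would remap most keys, which is one concrete form of the "change" challenge listed on the slide.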

7 .Dive in technologies • Transaction – Global transaction management – Acquire all resources first, then two-phase commit – Transparent to the application
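A two-phase commit can be sketched as a coordinator that first asks every participant to prepare and only commits if all of them vote yes. The sketch below is a simplified in-memory model; the `Participant` class and its `can_commit` flag are hypothetical, and it omits the logging, timeouts, and recovery a real global transaction manager needs.

```python
class Participant:
    """Hypothetical resource manager taking part in a global transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # simulates whether resources can be acquired
        self.state = "init"

    def prepare(self) -> bool:
        # Phase 1: acquire resources and vote yes or no.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        # Phase 2 (success path): make the prepared work durable.
        self.state = "committed"

    def rollback(self):
        # Phase 2 (failure path): release resources and undo prepared work.
        self.state = "aborted"


def two_phase_commit(participants) -> bool:
    """Commit on all participants, or roll back on all of them."""
    votes = [p.prepare() for p in participants]  # phase 1: collect every vote
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False


ok = two_phase_commit([Participant("db1"), Participant("db2")])
failed = two_phase_commit([Participant("db1"), Participant("db2", can_commit=False)])
```

The application only sees the boolean outcome, which is what "transparent to the application" amounts to in this model.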

8 .Dive in technologies (Isolation level)
• MVCC based
– Provides a point-in-time consistent view
– Read is never blocked
– Snapshot isolation with vacuum process
– Latest data + undo log
– Higher memory/storage footprint + CPU overhead
• Lock based
– Simpler to implement by using different lock modes
– Read/write can block each other
– 2-PL and deadlock detection, more challenging in a distributed environment
[Table: lock compatibility matrix of yes/no entries, state being requested vs. state of held resource, for the lock modes None, IN (Intent None), IS (Intent Share), NS (Scan Share), S (Share), IX (Intent Exclusive), SIX (Share with Intent Exclusive), U (Update), X (Exclusive), Z (Super Exclusive), NW (Next Key Weak Exclusive)]
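To make the MVCC side of this slide concrete, the sketch below keeps every committed version of a key and lets a reader pick the newest version that is no later than its snapshot. It is a toy model under stated assumptions: each write commits immediately, a single logical counter stands in for commit timestamps, and the `MVCCStore` name is invented for the example.

```python
class MVCCStore:
    """Toy multi-version store: readers see a point-in-time consistent view."""

    def __init__(self):
        self._versions = {}  # key -> list of (commit_ts, value), oldest first
        self._clock = 0      # logical commit timestamp

    def write(self, key, value) -> int:
        """Append a new version instead of overwriting the old one."""
        self._clock += 1
        self._versions.setdefault(key, []).append((self._clock, value))
        return self._clock

    def snapshot(self) -> int:
        """Return the timestamp a new reader would use as its view."""
        return self._clock

    def read(self, key, snapshot_ts):
        """Return the newest value committed at or before snapshot_ts."""
        visible = None
        for commit_ts, value in self._versions.get(key, []):
            if commit_ts > snapshot_ts:
                break
            visible = value
        return visible


store = MVCCStore()
store.write("balance", 100)
old_view = store.snapshot()
store.write("balance", 50)  # a later writer does not block the old reader
```

The reader holding `old_view` still sees 100 after the second write, with no lock taken. The cost is visible too: old versions accumulate until something like a vacuum process removes them, which is the storage overhead the slide mentions.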

9 .Agenda • Introduction • Dive in distributed database technologies • Comparison of different technologies • Introduction to SequoiaDB

10 .Traditional MySQL replication strategy [Diagram: three applications; one has read/write access to the MySQL Service (Master) and two have read-only access to the MySQL Services (Standby). Each MySQL service consists of a MySQL Parser on top of InnoDB, and the master ships its binlog to the standbys]

11 .Traditional MySQL replication strategy • Synchronous/Semi-Sync/Async Replication – Data duplication – Sync/semi-sync is slow – Async is fast but risks data loss • Failure detection and takeover process
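The trade-off between the three replication modes can be reduced to how many standby acknowledgements the master waits for before confirming a commit. The sketch below is a deliberately simplified model, not MySQL configuration; the function names and the mode strings are assumptions for the example.

```python
def acks_required(mode: str, n_standbys: int) -> int:
    """Standby acknowledgements the master waits for before confirming a commit."""
    if mode == "async":
        return 0                   # confirm immediately, ship the binlog later
    if mode == "semi-sync":
        return min(1, n_standbys)  # wait for at least one standby
    if mode == "sync":
        return n_standbys          # wait for every standby
    raise ValueError(f"unknown replication mode: {mode}")


def commit_survives_master_loss(mode: str, n_standbys: int) -> bool:
    """True if a confirmed commit is guaranteed to exist on some standby."""
    return acks_required(mode, n_standbys) >= 1


for mode in ("async", "semi-sync", "sync"):
    print(mode, acks_required(mode, 2), commit_survives_master_loss(mode, 2))
```

More acknowledgements mean higher commit latency, while zero acknowledgements mean a master failure can lose commits the client already believes are durable.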

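12 .Comparison of Database Architecture
• Vertical database splitting
– Advantages: a long track record and strong application-level control, which allows deep customization. No special requirements on the underlying database, because splitting is done entirely inside the application.
– Disadvantages: extremely intrusive to application logic, since the application needs complex logic to distribute data sensibly. Topology changes and scale-out are very painful, and online scale-out is almost impossible. Cross-database transactions are hard to support.
• Database and table sharding (middleware)
– Advantages: an intermediate SQL parsing layer splits standard SQL into sub-queries that are pushed down to the underlying databases wherever possible, and results are assembled at the SQL layer. No special requirements on the underlying database (XA support is enough), because SQL is split in the middleware. Partially compatible with traditional SQL, so application development is easier than with vertical splitting.
– Disadvantages: fairly intrusive to application logic, since the application must be aware of the underlying data distribution to design optimized queries. Distributed transactions are implemented in the middleware and cross-database transactions use XA, so performance drops sharply. As a transitional stage between single-node and new distributed databases, its technical continuity is in doubt.
• Native distributed database
– Advantages: distributed transactions and data partitioning are handled inside the database and are fully transparent to the application, which does not need to know the underlying data distribution. Distributed transactions are supported natively, with far higher performance than database and table sharding. High availability and disaster recovery are supported natively by the database kernel, with no extra tools needed.
– Disadvantages: the technology is relatively new, with comparatively few mature industry cases. Auxiliary tools are relatively scarce and the ecosystem still needs to mature.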

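13 .Distributed Database Business Type
• Online transaction processing (OLTP)
– High concurrency and low latency, for transactional business
– Must satisfy CAP: CP fully satisfied, with A approaching 100%
– Compatible with the traditional SQL development model as far as possible, to lower application migration cost and shorten the learning curve
– Targets new microservice architectures, with mixed support for multiple consistency levels, multi-tenancy, and physical isolation
• Statistical analysis (OLAP)
– Low concurrency and high latency, for back-office statistics and reporting
– Does not need to satisfy CAP; data can be regenerated and re-imported
– Maximize throughput, with hybrid row/column storage
– Appropriate adoption of big data technologies, using structured and unstructured data together
– MPP architecture
• Online services (Operational)
– High concurrency and low latency, for online read-only business that involves no transactions
– Must satisfy AP; data can be written in batches and re-imported
– Minimize latency and maximize concurrency
– Mixed use of structured and unstructured data
– Mainly for historical data, real-time read-only services, image platforms, and similar workloads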

14 .Different distribution implementation
• Application separation: the tables of different modules (Core, Loan, CRM, Bill) are stored in separate databases, and databases are not joined in queries. Where a cross-database query is needed, it must be solved through data redundancy or secondary processing in the application layer, which is highly intrusive to the application. How to choose a database in a big-table scenario.
• Use agent to handle different DB/Table (TDDL, MyCat): middleware (routing/dispatch) sits between the applications and DB1 to DB4. Written in Java; supports distribution, read/write separation, weak XA, and failover. However, it has a single point of failure, a compute bottleneck, and error-prone HA.
• Native distributed DB (SequoiaDB, GaussDB): tables are distributed across databases on different machines, which relieves pressure on each database and spreads the CPU, memory, and network I/O load across physical machines. Distributed transactions are supported.
[Diagram: each module (Core, Loan, CRM, Bill) is split across DB1 to DBn, with the databases running on clusters]

15 .Agenda • Introduction • Dive in distributed database technologies • Comparison of different technologies • Introduction to SequoiaDB

16 .SequoiaDB distributed relational database [Diagram: Microservice1 to Microservice4 connect to the SQL instance layer, made up of SQL instances (MySQL, MySQL, PGSQL, SparkSQL), which sits on top of the storage layer, made up of storage instances]

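17 .SequoiaDB "compute-storage separation" distributed database architecture [Diagram: SQL parsing area with MySQL services, each of which can perform read and write operations; coordinator nodes; metadata management area with catalog nodes; data storage area (SequoiaDB distributed storage engine) with data nodes. Data nodes and catalog nodes each have a primary replica 1 and secondary replicas 2 and 3, and the data nodes are spread over partitions 1 to 6]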

18 .100% MySQL compatible and more [Diagram: three applications, each with read/write access to its own MySQL Service (Master) with a MySQL Parser; every MySQL service reads from and writes to the SequoiaDB Distributed Database]

19 .Support two-dimensional partition • SequoiaDB supports horizontal and vertical partitioning • Usually a unique key is chosen for horizontal partitioning, and a range cluster key such as a timestamp for vertical partitioning • These suit snapshot data and streaming data respectively • Advantage: linear scaling of capacity and performance

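20 .Compute-storage separation architecture compatible with MySQL • Use comment to set features that MySQL does not support – Partition information – If no partition key is specified, the first field is used by default – Multi-dimensional partitioning is supported – Others [Diagram: a main table split into sub-tables 1 to 5 by range on A, [20180101, 20180201), [20180201, 20180301), [20180301, 20180401), [20180401, 20180501), [20180501, 20180601), spread across data groups 1 to 3]
mysql> create table mainCl(a int, b text, c timestamp) engine = sequoiadb comment ="{table_options:{IsMainCL:true,ShardingKey:{c:1},ShardingType:\"range\"}}";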
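Range partitioning of a main table into sub-tables, combined with a second hash dimension, can be sketched as a lookup over sorted lower bounds. The boundaries below are the ones shown on this slide; the sub-table names, the hash step, and the mapping of records to three data groups are assumptions for illustration and do not describe how SequoiaDB routes records internally.

```python
import bisect
import zlib

# Range boundaries taken from the slide: sub-table i covers [BOUNDS[i], BOUNDS[i + 1]).
BOUNDS = [20180101, 20180201, 20180301, 20180401, 20180501, 20180601]
SUB_TABLES = ["sub_table_1", "sub_table_2", "sub_table_3", "sub_table_4", "sub_table_5"]
N_DATA_GROUPS = 3  # data groups 1 to 3 on the slide


def sub_table_for(day: int) -> str:
    """First dimension: pick the sub-table whose half-open range contains day."""
    idx = bisect.bisect_right(BOUNDS, day) - 1
    if idx < 0 or idx >= len(SUB_TABLES):
        raise KeyError(f"{day} is outside every sub-table range")
    return SUB_TABLES[idx]


def data_group_for(key: str) -> int:
    """Second dimension (assumed): hash the record key onto a data group."""
    return zlib.crc32(key.encode("utf-8")) % N_DATA_GROUPS + 1


def route(day: int, key: str):
    """Combine both dimensions into a (sub-table, data group) target."""
    return sub_table_for(day), data_group_for(key)


target = route(20180315, "account-42")
```

Using `bisect_right` makes each lower bound inclusive and each upper bound exclusive, matching the `[low, high)` notation on the slide.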

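21 .Compute-storage separation: internal design of MySQL compatibility • Handler Adapter: the adaptation layer that interfaces with MySQL and implements table-related operations • Data Parser: parses data records and fields • Index: handles index parsing, index creation, and index traversal control • Condition Parser: parses query conditions • Optimizer Proxy: acts as a proxy for the optimizer and collects statistics and similar information • Config Mgr: manages configuration parameters related to the storage engine • SE Handler Pool: the resource pool that manages storage engine handles • SE Handler Adapter: adapts to and interfaces with the storage engine [Diagram: four APPs connect to MySQL; below MySQL sits the Handler Adapter, then Data Parser, Index, Condition Parser, Optimizer Proxy, Config Mgr, and SE Handler Pool, then the SE Handler Adapter, then the Storage Engine]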
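The idea behind a handle resource pool such as the SE Handler Pool can be illustrated with a generic bounded pool. This is a sketch of the general pattern only; the `HandlePool` class, its methods, and the use of plain `object` instances as handles are invented for the example and are not taken from the SequoiaDB source.

```python
import queue
from contextlib import contextmanager


class HandlePool:
    """Generic bounded pool that lends out pre-created handles."""

    def __init__(self, factory, size: int):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(factory())  # create all handles up front

    @contextmanager
    def acquire(self, timeout=None):
        """Borrow a handle; it is returned to the pool when the block exits."""
        handle = self._idle.get(timeout=timeout)  # raises queue.Empty on timeout
        try:
            yield handle
        finally:
            self._idle.put(handle)

    def idle_count(self) -> int:
        return self._idle.qsize()


pool = HandlePool(factory=object, size=2)
with pool.acquire() as handle:
    in_use_idle = pool.idle_count()  # one handle is lent out here
after_idle = pool.idle_count()       # and back in the pool here
```

Pooling bounds the number of open handles to the storage engine and avoids paying the connection setup cost on every table operation.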

22 .Native SQL and HTAP support [Diagram: applications (E-bank, Bank Counter, ATM; Risk management, Audit, Analyze) connect over JDBC connections to the SQL Service Layer (MySQL, PostgreSQL, SparkSQL, ...), which sits on the Coordination Node Layer and the Data Node Layer. The data node layer holds a private deposit area (transactions on, strong consistency), a credit card area (transactions on, strong consistency), and a channel log area (transactions off, eventual consistency)]

23 .Traditional Isolation • Support RU/RC/RR • Combination of locking and versioning • Read is not blocked • More in next release [Diagram with elements labeled LRB]

24 .THANKS! Q&A • SequoiaDB Website: www.sequoiadb.com • GitHub: SequoiaDB/SequoiaDB, SequoiaDB/sequoiasql-mysql • Join the SequoiaDB Community!
