NoSQL vs. NewSQL

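Over the past 40 years the database market has changed dramatically, from traditional single-machine databases to distributed NoSQL / NewSQL systems, with advances in hardware and software closely intertwined. This talk traces the history of databases through three eras and walks through the evolution and use cases of traditional, NoSQL, and NewSQL databases. Its conclusions are interesting: 1) hardware keeps getting cheaper; 2) transactional databases use row storage, while data warehouses use column storage; 3) the bottlenecks of a single-machine database are logging and locking, while the bottlenecks of a multi-machine database are replication, locks, and joins; 4) NoSQL is kept simple (no ACID, no joins), while NewSQL is a redesign of the database architecture (ACID, fewer concurrency conflicts, simplified logging, and smart partitioning of data across machines).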

1.NoSQL vs. NewSQL ComputeFest Niv Dayan 13 January, 2016 Demystifying the Zoo of Contemporary Database Systems 1

2.Introduction Niv Dayan (post-doc) IACS Data Systems Lab Research in database systems Today’s workshop: NoSQL vs. NewSQL Origin Difference Hype 2

3.Introduction Time 1980 1990 2000 2010 3

4.Introduction Time 1980 1990 2000 2010 4

5.Introduction Creation time of the 100 most used databases (db-engines.com) 5

6.Introduction Different architectures Performance Data integrity User interface “Which database system is right for me?” Not a survey. Provide reasoning tools. 6

7.Introduction Theme: any trend in database technology can be traced to a trend in hardware Claim: The new database technologies are adaptations to changes in hardware Database designer Hardware 7

8.Introduction Traditional systems (20 min); Changes in hardware (10 min); Market today (10 min): analytical databases, transactional databases; NoSQL (15 min); NewSQL (15 min); MongoDB tutorial (30 min). Total: 90 minutes. Not just how traditional databases are designed, but why they are designed that way: due to the hardware. Modern times: how has hardware changed? What changes? Two classes of databases today. 8

9.Introduction MongoDB is the most popular NoSQL system 9

10.Traditional Systems (1/5) In 1970, two storage technologies were becoming mature enough for the market. Dynamic Random Access Memory: expensive, fast, volatile. Hard disk drives: cheap, slow, non-volatile 10

11.History 11

12.History 3 goals of database design Speed Affordability Resilience to system failure How you achieve them depends on hardware 12

13.History Two storage media: Main Memory Fast, expensive, volatile Disk Slow, cheap, non-volatile 13

14.History How should data be stored across them? Main memory is volatile and expensive Frequently accessed data is here All data is here 14
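
The placement described on this slide (all data on disk, frequently accessed data also in main memory) can be sketched as a small cache in front of a slow store. This is a minimal illustration, not the design of any particular system: the `BufferedStore` class, its `capacity` parameter, and the least-recently-used eviction rule are assumptions introduced here, since the slides do not name an eviction policy.

```python
from collections import OrderedDict


class BufferedStore:
    """Toy two-level store: a dict stands in for disk, an OrderedDict for main memory."""

    def __init__(self, disk, capacity):
        self.disk = disk              # holds all data (slow, cheap, non-volatile)
        self.capacity = capacity      # how many items fit in main memory (assumed)
        self.memory = OrderedDict()   # holds recently used data (fast, expensive)
        self.disk_reads = 0           # counts the slow accesses

    def get(self, key):
        if key in self.memory:
            # Hit: serve from main memory and mark the item as recently used.
            self.memory.move_to_end(key)
            return self.memory[key]
        # Miss: pay for one disk access, then keep a copy in main memory.
        self.disk_reads += 1
        value = self.disk[key]
        self.memory[key] = value
        if len(self.memory) > self.capacity:
            # Evict the least recently used item (an assumed policy).
            self.memory.popitem(last=False)
        return value


store = BufferedStore({1: "Bob", 2: "Sara", 3: "Trudy"}, capacity=2)
for key in (2, 2, 2, 1):
    store.get(key)
print(store.disk_reads)  # repeated reads of key 2 cost only one disk access
```

The counter makes the point of the slide visible: the more requests that land on data already in main memory, the fewer trips to disk.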

15.History To make a system fast, address bottleneck Disk is extremely slow Fetch Retrieve Main memory Disk 15

16.History To make a system fast, address bottleneck Disk is extremely slow Fetch Retrieve Pluto Earth Fetch Retrieve Main memory Disk 16

17.History Why so slow? Two questions: Question 1: How to minimize disk access? Question 2: What to do during a disk access? Disk hand moving 17

18.History Problem: How to minimize disk accesses? Solution: Store data that is frequently co-accessed at the same physical location. Consolidates many disk accesses to one. Data item 1 Data item 2 18

19.History (3/5) How to minimize disk access? Store data that is frequently accessed together in the same physical location This gave rise to tabular structure 19

20.History Example: Bank. Co-locate all information about each customer. Table (ID, Name, Balance): (1, Bob, 100), (2, Sara, 100), (3, Trudy, 450). Customer Sara deposits $100. Main Memory / Disk / Database: read row (2, Sara, 100), add 100, write back row (2, Sara, 200). 2 disk accesses, since data about Sara is co-located 20
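
The bank example can be replayed with a small sketch that stores each customer's fields side by side in one fixed-width row and counts accesses. It is only an illustration under stated assumptions: the `RowStore` class, the row layout (4-byte ID, 16-byte name, 8-byte balance), and the in-memory `io.BytesIO` buffer standing in for a disk are all hypothetical and not taken from the slides.

```python
import io
import struct

# Assumed fixed-width row layout: 4-byte ID, 16-byte name, 8-byte balance.
ROW = struct.Struct("<i16sq")


class RowStore:
    """Toy row store: all fields of a customer are co-located in one row."""

    def __init__(self, rows):
        self.disk = io.BytesIO()  # stands in for the disk
        for cid, name, balance in rows:
            self.disk.write(ROW.pack(cid, name.encode(), balance))
        self.accesses = 0         # counts row-sized disk accesses

    def read_row(self, slot):
        self.disk.seek(slot * ROW.size)
        self.accesses += 1
        cid, name, balance = ROW.unpack(self.disk.read(ROW.size))
        return cid, name.rstrip(b"\0").decode(), balance

    def write_row(self, slot, cid, name, balance):
        self.disk.seek(slot * ROW.size)
        self.accesses += 1
        self.disk.write(ROW.pack(cid, name.encode(), balance))


def deposit(store, slot, amount):
    # One access fetches the whole row, one access writes it back.
    cid, name, balance = store.read_row(slot)
    store.write_row(slot, cid, name, balance + amount)


store = RowStore([(1, "Bob", 100), (2, "Sara", 100), (3, "Trudy", 450)])
deposit(store, slot=1, amount=100)  # Sara is stored in the second row
print(store.accesses)               # 2
```

Had the ID, name, and balance lived in separate places, the same deposit would need a separate access for each field it touches.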

21.History (1/7) Challenge: power failures Lose all information in main memory Example : money transfer between two accounts read Bob’s balance subtract 100, and rewrite Power Failure read John’s balance add 100, and rewrite 21

22.History (1/7) Define a transaction as a logical unit of work A transaction should happen fully, or not at all (atomicity) Save all updates from transactions into a log If power fails, redo committed transactions and undo uncommitted transactions 22

23.History (1/7) Power failure example Transfer between two bank accounts T1: read account 1 balance T1: subtract 100, and rewrite balance Power Failure T1: read account 2 balance T1: add 100, and rewrite balance 23
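
The rule "redo committed transactions and undo uncommitted transactions" can be sketched as a recovery pass over a log. This is a simplified illustration, not the recovery algorithm of any specific database: the log record format, the `recover` function, the account names, and the starting balances of 500 are assumptions made for the example. It also assumes that concurrent transactions never update the same key, which is what locking would normally guarantee.

```python
def recover(disk, log):
    """Bring `disk` (a dict) to a consistent state using `log`.

    Log records (a hypothetical format):
      ("update", tx, key, old_value, new_value)
      ("commit", tx)
    """
    committed = {rec[1] for rec in log if rec[0] == "commit"}

    # Redo: reapply every update of a committed transaction, in log order.
    for rec in log:
        if rec[0] == "update" and rec[1] in committed:
            _, _, key, _, new_value = rec
            disk[key] = new_value

    # Undo: roll back updates of uncommitted transactions, newest first.
    for rec in reversed(log):
        if rec[0] == "update" and rec[1] not in committed:
            _, _, key, old_value, _ = rec
            disk[key] = old_value

    return disk


# Power fails after T1 debited account 1 but before it credited account 2.
log = [("update", "T1", "account1", 500, 400)]
disk_after_crash = {"account1": 400, "account2": 500}
print(recover(disk_after_crash, log))  # {'account1': 500, 'account2': 500}
```

Because T1 never logged a commit, its debit is undone and no money disappears. Had the log also contained the credit and a commit record, recovery would instead reapply both updates.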

24.History What to do during a disk access? Start running the next operation(s). Improves performance. But data can get corrupted 24

25.History A couple, Bob and Sara, share a bank account Both deposit $100 at same time balance = 0 25

26.History A couple, Bob and Sara, share a bank account Both deposit $100 at same time balance = 0 balance = 0 Fetch Retrieve 26

27.History A couple, Bob and Sara, share a bank account Both deposit $100 at same time balance = 0 balance = 0 27

28.History A couple, Bob and Sara, share a bank account Both deposit $100 at same time balance = 0 balance = 0 balance = 0 Fetch Retrieve 28

29.History A couple, Bob and Sara, share a bank account Both deposit $100 at same time balance = 0 balance = 0 balance = 0 29
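
The interleaving built up in these slides can be replayed deterministically. The sketch below is an illustration with hypothetical function names: each client first fetches its own copy of the balance, and only afterwards do the clients write back, which is what can happen when the next operation starts while a disk access is still in flight.

```python
def interleaved_deposits(balance_on_disk, amounts):
    """Every client reads the balance before any client writes it back."""
    local_copies = [balance_on_disk for _ in amounts]  # each client fetches the balance
    for copy, amount in zip(local_copies, amounts):
        balance_on_disk = copy + amount                # each write uses a stale copy
    return balance_on_disk


def serial_deposits(balance_on_disk, amounts):
    """Each client reads and writes back before the next one starts."""
    for amount in amounts:
        local_copy = balance_on_disk
        balance_on_disk = local_copy + amount
    return balance_on_disk


# Bob and Sara both deposit $100 into a shared account that starts at 0.
print(interleaved_deposits(0, [100, 100]))  # 100: one deposit is lost
print(serial_deposits(0, [100, 100]))       # 200: the expected balance
```

Both clients start from the same balance of 0, so the second write overwrites the first and one deposit vanishes.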