Why your RDBMS fails at scale
1. Why your RDBMS fails at scale, and one built for it
2. DataStax: from validation to momentum
• 400+ Employees
• $190M Funding (Series E, Sept. 2014)
• 500+ Customers
• Founded in April 2010
• Offices: Santa Clara, San Francisco, Austin, London, Paris, Berlin, Tokyo, Sydney
• 30%+
• 2016, 2017 World’s Best 100 Cloud Companies
• Ranked #1 in multiple operational database categories
© 2017 DataStax, All Rights Reserved. Company Confidential
3. Let’s take a moment
Your business interacts with people, processes and things all the time
4. Real-time, globally distributed cloud applications must meet expectations
• CONTEXTUAL
• ALWAYS-ON
• REAL-TIME
• DISTRIBUTED
• SCALABLE
5. Netflix disrupted video distribution and creation with a cloud application
• 70 million Customers
• 400 Cities
• 125 million Hours Watched per Day
6. Microsoft remains a leader in collaboration with a cloud application
• 60 Million Monthly Active Users
• #1 Deployed App in Enterprises
• 5 Million Events Per Organization a Month
7. No Downtime: 4 Black Fridays in a row
8. Potted history of the Database: Database and the Internet
• 1970: Relational model invented by E.F. Codd at IBM
• 1979: First commercial RDBMS available (Oracle V2)
• 1983: Official birth of the Internet (TCP/IP)
• 1986: SQL becomes an international standard
• 1993: WWW finally available
• 1995: First Internet-based applications arrive
9. Explosion of Cloud Applications
10. Some of the issues faced
• How do you scale the database?
  • Add more RAM
  • Add more CPU
  • Add faster and more disks
• How do you do this?
  • Bring the database OFFLINE
• Vertical scaling has a finite limit
11. Some of the issues faced
• How do you scale client connections?
  • Add a connection pool
  • But this has a finite limit
  • Adds complexity
[Diagram labels: Connection Pool, Listener]
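A minimal sketch of the "finite limit" point, assuming a hypothetical `BoundedPool` class in which connections are plain strings rather than real database sessions. It only illustrates that a pool of fixed size cannot serve one more concurrent client than it has connections.

```python
import queue


class BoundedPool:
    """Hypothetical fixed-size pool; 'connections' are just labels here."""

    def __init__(self, size):
        self._free = queue.Queue(maxsize=size)
        for i in range(size):
            self._free.put(f"conn-{i}")

    def acquire(self, timeout=0.01):
        # Wait briefly for a free connection, then give up.
        try:
            return self._free.get(timeout=timeout)
        except queue.Empty:
            raise RuntimeError("pool exhausted") from None

    def release(self, conn):
        # Hand the connection back so another client can use it.
        self._free.put(conn)


pool = BoundedPool(size=2)
a = pool.acquire()
b = pool.acquire()

# A third concurrent client finds no free connection.
try:
    pool.acquire()
    third_client_served = True
except RuntimeError:
    third_client_served = False

pool.release(a)  # only now can another client proceed
```

Raising the pool size moves the limit but does not remove it, since the database behind the pool still has a maximum number of sessions it can hold.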
12. Single Points of Failure
• With a single database we have a SPOF
  • Use replication
  • Problem solved
• But now
  • Single Master
  • Scales for Reads not Writes
  • Action needed if Master goes down
  • Only suitable for LAN deployments
[Diagram labels: Master, Read Only Subscriber]
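A back-of-envelope model of "scales for reads not writes". The function name `cluster_capacity` and the per-node figures are illustrative assumptions, and the model deliberately ignores the cost of replicas applying the replicated writes.

```python
def cluster_capacity(replicas, reads_per_node, writes_per_node):
    """Rough capacity of one master plus N read-only subscribers."""
    # Reads can be served by the master and by every subscriber.
    read_capacity = (1 + replicas) * reads_per_node
    # Every write still has to go through the single master.
    write_capacity = writes_per_node
    return read_capacity, write_capacity


single_node = cluster_capacity(0, reads_per_node=1000, writes_per_node=200)
three_subscribers = cluster_capacity(3, reads_per_node=1000, writes_per_node=200)
```

Under these assumptions, adding three subscribers quadruples read capacity while write capacity stays where it was.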
13. But How to Horizontally Scale
• Shard your data across databases
• Each shard needs a replica
• Need a load balancer
• Just showing 2 shards
• Things get more complicated
• Could have multiple read only subscribers
[Diagram labels: Load balancer; shards A-M and N-Z; Master, Read Only Subscriber]
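The two-shard layout on this slide can be sketched as a routing table. This is an illustration only: the host names, `shard_for` and `route` are hypothetical, and a real deployment would put this logic in the load balancer or the application's data access layer.

```python
# Hypothetical hosts: one master and one read-only subscriber per shard.
SHARDS = {
    "A-M": {"master": "db-am-master", "replica": "db-am-replica"},
    "N-Z": {"master": "db-nz-master", "replica": "db-nz-replica"},
}


def shard_for(key):
    """Pick a shard from the first letter of the key."""
    first = key[:1].upper()
    if not ("A" <= first <= "Z"):
        raise ValueError(f"cannot route key {key!r}")
    return "A-M" if first <= "M" else "N-Z"


def route(key, write):
    """Writes go to the shard's master, reads to its subscriber."""
    shard = SHARDS[shard_for(key)]
    return shard["master"] if write else shard["replica"]
```

For example, `route("alice", write=True)` returns `"db-am-master"` and `route("Zed", write=False)` returns `"db-nz-replica"`. Every query that spans both ranges, and every change to the ranges themselves, now has to be handled outside the database, which is where "things get more complicated".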
14. What about multiple Data Centres?
• Extremely complicated
• Difficult to support Active Active
• Need to consider conflicts
• More Disaster Recovery than Disaster Avoidance
15. Traditional Data Models don’t help
• Normalised Data Model
  • Random seeks result in a high volume of I/O operations
  • Joins extremely expensive
  • Won’t scale horizontally
• De-Normalised Data Model
  • Sequential seek to return results
  • Joins eliminated
  • Scales indefinitely
[Diagram labels: 1:M, M:N]
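A small in-memory sketch of the two models, using plain Python dictionaries in place of tables. The `users` and `orders` data and both function names are made up for illustration; the point is only where the join work happens.

```python
# Normalised: users and orders live in separate "tables".
users = {1: {"name": "Ada"}, 2: {"name": "Linus"}}
orders = [
    {"order_id": 10, "user_id": 1, "item": "disk"},
    {"order_id": 11, "user_id": 2, "item": "ram"},
    {"order_id": 12, "user_id": 1, "item": "cpu"},
]


def orders_normalised(user_id):
    # Join at read time: look up the user, then scan all orders.
    name = users[user_id]["name"]
    return [(name, o["item"]) for o in orders if o["user_id"] == user_id]


# De-normalised: one pre-joined partition per user, built at write time.
orders_by_user = {}
for o in orders:
    name = users[o["user_id"]]["name"]
    orders_by_user.setdefault(o["user_id"], []).append((name, o["item"]))


def orders_denormalised(user_id):
    # A single lookup by partition key, no join and no scan.
    return orders_by_user.get(user_id, [])
```

Both functions return the same rows for a given user. The de-normalised version pays for that with duplicated data (the user name is stored with every order), which is the trade this slide is describing.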
16. Summary
• Traditional databases were developed before the web and cloud-based applications
• Scaling up results in downtime
• A single node is a single point of failure
• The number of client connections is finite
• Add a read only replica for high availability
• Shard to horizontally scale
• Data Center support extremely difficult
• Data model not built for horizontal scale
17. A new approach is required
• 1990s: Client/Server
• 2000s: Web
• Today: Cloud
18. Scaling out solves the distributed problem
[Diagram labels: SCALE-OUT APP LAYER, SCALE-OUT DATA LAYER, MASTER-SLAVE DATABASE]
19. So What’s the answer?
• Distributed masterless NoSQL Database
• Continuous Availability
• Disaster Avoidance
• Linear Scale Performance
• Add nodes to scale
• Runs on Commodity Hardware
• Cloud or on Premise or Hybrid
[Diagram labels: San Francisco, London, New York]
20. Linear Scalability
• Have More Data? Add more nodes.
• Need More Throughput? Add more nodes.
[Figure labels: 400 Nodes, 700 Nodes, 9000 Nodes]
http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html
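If throughput really does scale linearly with node count, sizing a cluster reduces to a division. A minimal sketch under that assumption; `nodes_needed` and the figures passed to it are illustrative and are not taken from the linked benchmark.

```python
import math


def nodes_needed(target_ops_per_sec, ops_per_node_per_sec):
    """Nodes required if each node adds the same throughput (assumed)."""
    return math.ceil(target_ops_per_sec / ops_per_node_per_sec)


# Illustrative numbers only: 1M ops/s at an assumed 4,000 ops/s per node.
cluster_size = nodes_needed(1_000_000, 4_000)
```

With these made-up inputs the result is 250 nodes, and doubling the target doubles the node count. Real clusters also need headroom for replication, compaction and node failures, which this sketch leaves out.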
21. Continuous Availability
• Nodes Down != Database Down
• Datacenter Down != Database Down
• Upgrade != Database Down
22. Platform for Cloud Applications
23. The most innovative companies use DataStax
[Timeline labels: 2010, 2012, 2014, 2016, 2017]
24. Key takeaways
DataStax: The power behind the moment
• Why your RDBMS fails at scale
• Fundamentally not built for cloud based applications
• World’s leading brands rely on DataStax for globally distributed data management
• Next steps: Download today at www.datastax.com and register for your DataStax Academy account for free online training
25. Backup Slides
26. ACID is a lie with data replication
When applying an RDBMS to Big Data replication, ACID collapses.
Scenario: a client with a read-heavy workload adds asynchronous replication, so there is a lag in propagating data from the master to the slave.
• Consistency: if a client reads from the slave before the data is replicated, it gets the old data back, which means loss of consistency
• Atomicity: not having the correct data results in the failure of the entire transaction
• Isolation: receiving the old data means loss of isolation
• Durability: the client receives the old data and not the data it had written to the master node
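The stale-read scenario above can be reproduced with a toy model. This is a sketch only: `AsyncReplicatedStore` is a hypothetical class, and replication is triggered by an explicit `replicate()` call to stand in for network lag.

```python
class AsyncReplicatedStore:
    """Toy master/slave pair with asynchronous replication."""

    def __init__(self):
        self.master = {}
        self.replica = {}
        self._pending = []  # writes not yet applied to the replica

    def write(self, key, value):
        # The write is acknowledged once the master has it.
        self.master[key] = value
        self._pending.append((key, value))

    def read_from_replica(self, key):
        return self.replica.get(key)

    def replicate(self):
        # Stand-in for the replication lag finally catching up.
        for key, value in self._pending:
            self.replica[key] = value
        self._pending.clear()


store = AsyncReplicatedStore()
store.write("balance", 100)
store.replicate()

store.write("balance", 50)                   # acknowledged by the master
stale = store.read_from_replica("balance")   # replica has not caught up
store.replicate()
fresh = store.read_from_replica("balance")   # now the replica agrees
```

Between the second write and the second `replicate()` call, the client that wrote 50 reads back 100 from the replica, which is the window the slide is pointing at.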
27. CAP tradeoffs
Can’t be both consistent and highly available during a network partition
• Relational databases choose strong consistency over high availability
• Latency between data centers makes consistency impractical
• NoSQL databases like Cassandra choose high availability and partition tolerance over consistency
• Data is replicated asynchronously across multiple data centers. We are limited by the speed of light, which makes strong consistency across data centers impractical at low latency
• Cassandra lets you specify the consistency level (one replica vs. a majority of replicas) suitable for your application
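The "one replica vs. majority of replicas" choice comes down to simple arithmetic: a read is guaranteed to touch at least one replica that holds the latest acknowledged write when the number of replicas written plus the number read exceeds the replication factor. A minimal sketch, with hypothetical function names:

```python
def quorum(replication_factor):
    """Size of a majority of replicas."""
    return replication_factor // 2 + 1


def reads_see_latest_write(replication_factor, write_replicas, read_replicas):
    """True when the read set and the write set must overlap."""
    return write_replicas + read_replicas > replication_factor


rf = 3
majority = quorum(rf)  # 2 of 3 replicas

quorum_both = reads_see_latest_write(rf, majority, majority)
one_and_one = reads_see_latest_write(rf, 1, 1)
```

With a replication factor of 3, writing and reading at a majority overlaps on at least one replica, while writing and reading at a single replica does not, which trades consistency for lower latency and higher availability.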
28. Replication Complexity in RDBMS
*Source: Oracle Database 12c New Features, Slide 17. (http://bit.ly/1MIxKc1)
29. DSE Real-time Analytics: IoT Reference Architecture
[Diagram labels: HTTP Application, Message Queue, Streaming Analytics, Batch Analytics, Real-time]