14/01 - Cassandra in Industry

Cassandra In Industry

1. Cassandra In Industry
   Niall Milton, CTO, DigBigData

2. Overview
   1. What is Apache Cassandra?
   2. Who uses Apache Cassandra?
   3. Real Use Cases
   4. Questions
   © DigBigData 2014

3. What is Apache Cassandra?
   - Massively scalable open source NoSQL database
   - Dynamic SQL-like data modeling and querying
   - Master-less architecture, no SPOF
   - Linear scalability, high availability / reliability
   - Tuneable consistency
   - New Core Value: Ease of Use!
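
The master-less point can be illustrated with a toy token ring. This is a minimal sketch, not Cassandra's implementation: the `Ring` class, the node names, one token per node, and MD5 as the hash are all assumptions made for illustration. It shows that any node can compute which replicas own a key from the ring layout alone, so no master is needed for lookups.

```python
import bisect
import hashlib


def token(key: str) -> int:
    """Map a key to a position on the ring (MD5 is a stand-in hash here)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Hypothetical toy ring: one token per node, replicas chosen clockwise."""

    def __init__(self, nodes, replication_factor=3):
        self.rf = replication_factor
        # Sorting by token makes the layout independent of input order.
        self.entries = sorted((token(n), n) for n in nodes)
        self.tokens = [t for t, _ in self.entries]

    def replicas(self, partition_key: str):
        """Walk clockwise from the key's token and collect rf distinct nodes."""
        start = bisect.bisect_left(self.tokens, token(partition_key))
        count = min(self.rf, len(self.entries))
        size = len(self.entries)
        return [self.entries[(start + i) % size][1] for i in range(count)]


ring = Ring(["node-a", "node-b", "node-c", "node-d"], replication_factor=3)
owners = ring.replicas("user:42")  # three distinct nodes, same on every caller
```

Because the answer depends only on the set of nodes, two coordinators that know the same ring membership always agree on the replica set.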

4. Who Uses Apache Cassandra?

5. And…

6. Drivers…
   - Java
   - .NET
   - Python
   - Ruby & PHP on the way
   - Feeling energetic? Use the native protocol to roll your own client.

7. Use Case: Netflix
   The Business
   - 33 million members in 40 countries
   - Almost 100% deployed in Amazon Cloud
   - Biggest single source of internet traffic in terms of volume
   - More than 1 billion videos delivered each month
   - Core data services served from Cassandra since 2010
   - 100s of nodes split into isolated clusters per service
   - Managed and deployed via Netflix OSS

8. Use Case: Netflix
   Why Cassandra?
   - Low latency and low latency variance
   - Linear scaling of reads and writes
   - Each cluster uses nodes from different availability zones
   - Ring is self-repairing after outages
   - Supports node backups & snapshots
   - They found consistency level (CL) ONE suited their needs for most use cases
   - They run massive simulations in Amazon to test their assumptions about latency, data growth, etc.
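
The trade-off behind choosing CL ONE can be worked through with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, and a replication factor of 3 with one replica per availability zone is an assumption, not something the slides state. It uses the standard rules that ONE waits for 1 replica, QUORUM for floor(RF / 2) + 1, and ALL for every replica, and that a read is guaranteed to see the latest write only when R + W > RF.

```python
def required_acks(level: str, replication_factor: int) -> int:
    """Number of replica acknowledgements the coordinator waits for."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported level: {level}")


def survives(level: str, replication_factor: int, replicas_down: int) -> bool:
    """True if enough replicas are still up to satisfy the level."""
    alive = replication_factor - replicas_down
    return alive >= required_acks(level, replication_factor)


def reads_see_latest_write(read_level: str, write_level: str,
                           replication_factor: int) -> bool:
    """Read and write replica sets must overlap: R + W > RF."""
    r = required_acks(read_level, replication_factor)
    w = required_acks(write_level, replication_factor)
    return r + w > replication_factor


RF = 3  # assumed: one replica in each of three availability zones
one_survives_two_zones_down = survives("ONE", RF, replicas_down=2)
quorum_survives_two_zones_down = survives("QUORUM", RF, replicas_down=2)
```

Under these assumptions CL ONE keeps serving requests with two of three zones down, at the cost of possibly stale reads; QUORUM reads and writes overlap but tolerate only one zone outage.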

9. Use Case: eBay
   The Business
   - $75 billion in goods sold per year
   - 112 million active users
   - 400+ million items for sale
   - Billions of page views / day
   - Running 1000s of servers
   - Processing terabytes per second
   - We get it, eBay is big!

10. Use Case: eBay
   Architecture
   - Mixed data architecture, also using MySQL, Oracle, MongoDB & Hadoop
   - Over 100 Cassandra nodes deployed, with 9 billion writes / day & 5 billion reads / day
   - Ethos is to use the right tool for a particular job
   - Cassandra is good for sparse data, big data, flexible schemas & real-time analytics
   - Many use cases don’t require an RDBMS

11. Use Case: eBay
   Why Cassandra?
   - Multi-DC, active-active configuration. Less waste, no dark nodes.
   - Always available
   - Easily scaled up and down
   - High write throughput
   - Distributed counters
   - Hadoop support

12. Use Case: eBay
   But what does it do?
   - Time series data, real-time insights and immediate actions
   - Anti-fraud
   - Order and shipping insights
   - Server metrics collection for monitoring and alerting
   - Personalization & taste graphing
   - Runtime quality click pricing for affiliates
   - Mobile notification logging and tracking
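
Time series workloads like the server metrics above are often modeled by bucketing rows into partitions keyed by source and time window. The sketch below shows that general pattern in plain Python. It is not eBay's schema, which the slides do not describe: the function names, the `web-01` source, and the one-day bucket are all hypothetical choices for illustration.

```python
from collections import defaultdict
from datetime import datetime, timezone


def partition_key(source_id: str, event_time: datetime) -> tuple:
    """Bucket by source and UTC day so no single partition grows forever.

    event_time must be timezone-aware.
    """
    day = event_time.astimezone(timezone.utc).strftime("%Y-%m-%d")
    return (source_id, day)


def group_by_partition(events):
    """Group (source_id, event_time, value) tuples as a wide-row store would."""
    partitions = defaultdict(list)
    for source_id, event_time, value in events:
        key = partition_key(source_id, event_time)
        partitions[key].append((event_time, value))
    for rows in partitions.values():
        # Newest first, like a descending clustering order on the timestamp.
        rows.sort(key=lambda row: row[0], reverse=True)
    return dict(partitions)


late = datetime(2014, 1, 14, 23, 59, tzinfo=timezone.utc)
early = datetime(2014, 1, 15, 0, 1, tzinfo=timezone.utc)
parts = group_by_partition([("web-01", late, 0.5), ("web-01", early, 0.7)])
# The two readings land in different day buckets for the same server.
```

A query for "the latest readings from one server today" then touches a single partition, which is the access pattern this kind of key is meant to favour.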

13. Use Case: Hailo
   The Business
   - World’s highest rated taxi app
   - Over 500,000 registered passengers
   - A Hailo e-hail is accepted every 4 seconds globally
   - Operating in 10 cities after just 18 months of operation (as of 2013)
   - Processes over $100 million in customer & driver transactions

14. Use Case: Hailo
   Why Cassandra?
   - Historically using MySQL in AWS
   - Rapid global expansion required higher scalability
   - Global replication desired
   - Growth forecasts indicated a high growth rate. Cassandra easily scaled to meet this demand.
   - Some prior engineering experience and confidence in the technology
   - Using Acunu for some analytics work. It uses Cassandra under the hood.

15. Conclusion
   - Numerous examples of companies adopting Cassandra to answer the demands of high-volume workloads
   - Easily supports mixed workloads and is highly tuneable to favour reads, writes or both
   - Supports rapid service rollouts (Netflix have built an entire development culture around this)
   - Where ease of scale, flexible schema design, high availability and hurricane survival are required, Cassandra meets the need.

16. Questions?