- 快召唤伙伴们来围观吧
- 微博 QQ QQ空间 贴吧
- 文档嵌入链接
- 复制
- 微信扫一扫分享
- 已成功复制到剪贴板
Ocean Base Taobao
展开查看详情
1 . OceanBase taobao-obdevelop@list.alibaba- inc.com 1
2 . The Challenge of Big Data Operating Data @Taobao, 2010 370 millions registered users 60 millions independent visitors per day more than 2 billions page view per day more than 800 millions items on line more than 800 items sold per second GMV:>1 billion per day The amount of data may increase several times, or even dozens of times in the next few years. RDBMS Sharding may be unpractical Reference:http://www.alibuybuy.com/posts/52702.html 2
3 . Comparison of existing solutions Bigtable HBase scalability Trillions of Percolator records Dynamo (PB) 100 billions of Cassandra OceanBase records (hundreds of TB) Billions of records (TB) RDBMS 10 millions of Transaction & records consistency (hundreds of GB) Eventually Single-row Cross-table NoSQL system consistent transaction transaction Big data、scalability、fault-tolerant No cross-table transaction、weak consistency model 3
4 . Storage requirement decomposition Oceanbase: online storage system with big data and complex data relation Key feature of online storage: large data scale but recent write is relatively small Separating incremental data with historical data Historical data:large scale, distributed, SAS or SSD; Incremental data:relatively small,centralized, memory or SSD; 4
5 . OceanBase Architecture RootServer/ RootServer/ UpdateServer UpdateServer (Primary) (backup) Client (Java/C++) ChunkServer/ ChunkServer/ ChunkServer/ ChunkServer/ MergeServer MergeServer MergeServer MergeServer RootServer(Master): Primary/backup,meta table / cluster management /schema… UpdateServer(Incremental data): Primary/backup,real time write (Memory+SSD) ChunkServer(Historical data): Distributed,static data (SAS or SSD) MergeServer:Distributed,Merge historical and incremental data => final result Daily Merge:historical data + incremental data => new historical data 5
6 . OceanBase Data Distribution Incremental Data Regions (COW B+ Tree) (Meta Table) Rootserver Updateserver Chunkserver 1 Chunkserver 2 Chunkserver 3 Chunkserver 4 6/26
7 . Oceanbase Timeline Historical data Incremental data (Chunkserver) (Updateserver) 7/26
8 . Scalability & Reliability Scalability ChunkServer Automatically add and remove machine UpdateServer (Performance) Memory+SSD, multiple NICs, 10Gb NIC Reading from several backups Reliability ChunkServer Replication, default 3 copies UpdateServer & RootServer Commit log + RAID 1 Synchronous real-time backup locally Almost real-time backup remotely 8
9 . Transaction and Consistency Transaction Centralized write transaction and distributed read transaction Supporting cross-row and cross-table transaction Consistency Strong Consistency: Successful mutation should be applied at both primary and backup 9
10 . Load balance & Lock elimination Automatic load balance Coordinated by RootServer Load factor: memory and disk usage, read/write load, etc. Data migration will not disturb regular read/write service Lock elimination ChunkServer: read only, no lock needed UpdateServer: Copy-on-write B+ tree, no lock for read 10
11 . Other Features Other Features Online schema change Abandoned random disk write, well suited for SSD Built-in data compression Non-stop system upgrade 11
12 . Oceanbase Performance Oceanbase Performance 4 ChunkServer, 2 * E5520 @2.27HZ, 10 * 300GB SAS, 16GB 2 billions records, 3 * 160GB (compressed by LZO), 10KB block, random read, cache is closed Theoretical value:4 * 180 * 10 = 7200;OceanBase:5100 12
13 . UpdateServer Performance UpdateServer Performance 2 * E5520 @ 2.27HZ, 24G, 1Gb NIC Size(byte) 20 100 1024 2048 QPS 78000 76000 70000 55000 Context Switch 26W 25W 21W 13W Optimization Point The first policy (Reduce malloc/free ): QPS > 10W The second policy (Reduce context switch): QPS > 20W Conclusion:UPS won’t be bottleneck 13
14 . Online Application @Taobao Collect Mysql 16*2=> Oceanbase 12+2 the load of machine decreased dramatically CTU: 2.5 billions records (2.5TB) per day MongoDB => Oceanbase 5 instance,500GB per instance SNS feed index:Cassandra => Oceanbase Shop decoration system More than 10 billions records, several hundreds of millions of real time read and write operations stably running on line for more then 6 monthes 14
15 . Open Source Community http://code.taobao.org/project/view/587/ http://oceanbase.taobao.org/ Open source time:2011/08/31 Licence: GPL Google search: more than 90 thousands results in one month after open source Technical Speaking Industry: Database Technology Conference China, System Architect Conference China , Taobao Open Develop Conference Community: OpenSourceCamp Dependency: Pacemaker, Linux operating system 15
16 . Oceanbase Future Plan Supports SQL-Like query language Supports distributed index Column-oriented storage Supports multi-machine parallel computing Supports distributed UpdateServer Integrates Hadoop MapReduce Sustained performance optimization Full open source … 16