MySQL DBaaS on Intel® Next Gen Xeon Platform

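In this talk, we discuss using the open-source MySQL Community Edition and Percona Server projects, together with Intel's Cascade Lake server platform, as the main building blocks of a hosted Database as a Service (DBaaS) deployment. We present data that demonstrates the benefits of this configuration. The focus is on deploying an offering that gives developers self-service, on-demand databases that run on shared infrastructure in a multi-tenant environment. Demand from DBAs across different market segments has driven large cloud service providers to invest in building such offerings.
Today, these offerings are widely available from multiple cloud providers. Until recently, however, organizations that wanted to offer DBaaS within their own data centers had no good options. The emergence of several open-source projects addresses this need. A second, intersecting trend is the imminent release of Intel's next-generation Cascade Lake Xeon platform, which supports byte-addressable persistent memory through the Intel Optane DC persistent memory product.
We start with an overview of the cloud-native DBaaS model and examine the intersection of these two trends. Next, we give a concrete description of a deployment model that uses Percona's MySQL and MyRocks distributions. This model is supported by a two-part, data-driven discussion of performance and density. First, we present the performance characteristics of the InnoDB and MyRocks storage engines, comparing and contrasting NVMe with Intel Optane DC persistent memory. For the latter, we include Intel Optane DC persistent memory volumes in both fsdax and sector modes. Second, we present database instance density data for a single Intel Cascade Lake server equipped with NVMe versus Intel Optane DC persistent memory. We close with a discussion of the results and how they shape our plans for further work in the MySQL CE and Percona Server open-source projects.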


1. MySQL DBaaS on Intel® Next Gen Xeon Platform
David Cohen – Storage Solutions CTO & Senior Principal Engineer, Intel®
Steve Scargall – Persistent Memory Software Architect, Intel®

2. Agenda
• Overview of Intel’s Next Gen Platform
• Overview of Persistent Memory as Block Storage
• Benchmark Environment
• Percona Server + MyRocks/RocksDB on Persistent Memory in Block Storage Mode
• Performance Results
• Future Work
• Q&A

3. Intel’s Next Gen Platform: Intel Cascade Lake and Optane™ DC Persistent Memory

4. Redefining the Memory/Storage Hierarchy
Hierarchy diagram, top to bottom: DRAM, PMEM, Intel® Optane™ NVMe SSD and PCIe SSD, SATA SSD, spinning HDD, tape.
Expanding data insights with:
• (1) Intel® Optane™ DC persistent memory module: persistent large memory
• (2) Intel® Optane™ DC SSD with software: massively extended memory
Multiple affordable solutions

5. Intel® Optane™ DC Persistent Memory (Optane™-based memory module for the data center)
Launched April 2019
• DIMM capacity: 128GB, 256GB, 512GB
• Speed: 2666 MT/sec
• Capacity per CPU: 3TB (not including DRAM)
• DDR4 electrical & physical
• Close to DRAM latency
• Cache line size access
• Byte addressable
Diagram: Cascade Lake CPU with two integrated memory controllers (IMC). DIMM population shown as an example only.

6. Intel® Optane™ DC Persistent Memory Operational Modes
App Direct Mode
• Persistent
• High availability / less downtime
• Significantly faster storage
Memory Mode
• High capacity
• Affordable
• Ease of adoption†

7. Intel® Optane™ DC Persistent Memory: Using Persistent Memory in Block Storage Mode

8. Storage over App Direct
Block Storage Mode – Atomic & Non-Atomic (Legacy Storage APIs)
• No code changes required
• Operates in blocks like SSD/HDD
• Traditional read/write
• Works with existing file systems
• Atomicity at block level
• Block size configurable: 4K, 512B*
• No interrupts
• NVDIMM driver required
• Support starting with kernel 4.2
• Configurable as boot device
• Higher endurance than enterprise SSDs
• High performance block storage
• Low latency, higher B/W, high IOPS
*Requires Linux
Storage APIs with DAX (AppDirect) (Future Work)
• Code changes may be required*
• Bypasses file system page cache
• Requires DAX enabled file system: XFS, EXT4, NTFS
• No kernel code or interrupts
• Fastest IO path possible
* Code changes required for load/store direct access if the application does not already support this.
Diagram labels: application in user space (raw device access, standard file API, mmap, standard load/store, PMDK); kernel space (file system, pmem-aware file system, BTT block atomicity, MMU mappings, generic NVDIMM driver); hardware (persistent memory). Annotations: MySQL handles atomic writes (default behavior); disable MySQL atomic writes; no page cache.
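The two access paths on this slide correspond to the sector and fsdax namespace modes mentioned in the abstract. The deck does not show how the volumes were provisioned, so the following is only a dry-run sketch of how such namespaces are commonly created with `ndctl`. The region name `region0`, the device names `/dev/pmem0s` and `/dev/pmem0`, and the mount point `/mnt/pmem` are assumptions that depend on the host; the script prints the commands rather than running them unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Dry-run sketch: print each command instead of executing it unless DRY_RUN=0.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

REGION=region0   # hypothetical region name; list real ones with `ndctl list -R`
MNT=/mnt/pmem    # hypothetical mount point

setup_sector() {
    # Sector mode: block device backed by the BTT, giving atomic sector writes.
    run ndctl create-namespace --mode=sector --sector-size=4096 --region="$REGION"
    run mkfs.xfs /dev/pmem0s          # assumed device name for a sector namespace
    run mount /dev/pmem0s "$MNT"
}

setup_fsdax() {
    # fsdax mode: block device that a DAX-enabled file system can map directly.
    run ndctl create-namespace --mode=fsdax --region="$REGION"
    run mkfs.xfs /dev/pmem0           # assumed device name for an fsdax namespace
    run mount -o dax /dev/pmem0 "$MNT"
}

plan=$(setup_sector; setup_fsdax)
printf '%s\n' "$plan"
```

In a real setup only one of the two functions would be applied to a given region, and the file system mounted on it would hold the MySQL data directory.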

9. Benchmark Environment Percona Server with MyRocks/RocksDB Storage Engine using Persistent Memory in Block Storage Mode

10. Benchmark Environment
Server: Intel® Server System S2600WF
BIOS: Version SE5C620.86B.0D.01.0180.110720181502, Release Date 11/07/2018
CPU: 2 x Intel(R) Xeon(R) Platinum 8260L CPU @ 2.30GHz, 24 cores per socket, 2 threads per core
Memory: DDR4 Dual Rank ECC 192GB (12*16GB@2667MHz); 1.5TB Intel® Optane™ DC Persistent Memory (12*128GB@2666MHz)
Storage: 2 x 4TB Intel P4510 NVMe SSDs; 2 x 480GB Intel P545 SSDs (boot disks); 1 x Intel 800GB SSD (scratch space)
OS: CentOS 7.5 (Kernel 4.17.5)
Filesystem: XFS
DB Version: Percona Server 5.7.22 with RocksDB
Test Details: SysBench-TPCC
● 5000 warehouses (10 table sets of 500 warehouses each)
● Up to 48 database threads
● Test duration: ~900 seconds
Dataset Sizes (Intel P4510 NVMe SSD storage): approximate on-disk size of the datasets used for testing MyRocks
● lz4/16KB - 90GB
● lz4/4KB - 98GB
● nocomp/16KB - 351GB
● nocomp/4KB - 353GB
Memory diagram (DRAM & PMEM Solution): 12 x 16GB DRAM (192 GB) plus 12 x 128GB persistent memory (1536 GB), 1728 GB in total.

11. Benchmark Environment
Sysbench/OLTP
# numactl --cpubind=1 --interleave=1 /usr/bin/env LD_PRELOAD=/usr/lib64/libjemalloc.so.1 \
    sysbench --num-threads=48 oltp_read_write.lua --tables=32 --table-size=12000000 \
    --report-interval=1 --rand-type=uniform --db-driver=mysql --forced-shutdown=1 \
    --time=1800 --events=0 --percentile=99 --mysql-user=root --mysql-db=sbtest32t12M \
    --mysql-storage-engine=ROCKSDB --mysql-socket=/tmp/mysql.sock --index-updates=0 \
    --non-index-updates=1 --simple-ranges=0 --skip-trx=on --sum-ranges=0 --order-ranges=0 \
    --distinct-ranges=0 --point-selects=0 --delete_inserts=0 run
Sysbench-TPCC
# numactl --cpubind=1 --interleave=1 /usr/bin/env LD_PRELOAD=/usr/lib64/libjemalloc.so.1 \
    sysbench --num-threads=48 tpcc.lua --tables=10 --report-interval=1 --rand-type=pareto \
    --db-driver=mysql --forced-shutdown=1 --time=900 --events=0 --percentile=99 \
    --mysql-user=root --mysql-db=tpcc500w10t --mysql-storage-engine=ROCKSDB \
    --mysql-socket=/tmp/mysql.sock --scale=500 --trx_level=RC run
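Both commands set `--report-interval=1`, so sysbench prints one progress line per second. A minimal sketch of averaging the per-second throughput from such a log is shown below. The line shape in the comment is an assumption based on sysbench 1.0 output, and the two sample lines are synthetic placeholders, not measurements from this deck.

```shell
#!/bin/sh
# Average the per-second "tps:" values from a sysbench interval log on stdin.
# Assumed line shape (sysbench 1.0, --report-interval=1):
#   [ 1s ] thds: 48 tps: 100.00 qps: 2000.00 ... lat (ms,99%): 5.00 err/s: 0.00
avg_tps() {
    LC_ALL=C awk '
        /tps:/ {
            # The value follows the "tps:" token on each interval line.
            for (i = 1; i < NF; i++) {
                if ($i == "tps:") { sum += $(i + 1); n++ }
            }
        }
        END { if (n > 0) printf "%.2f\n", sum / n; else print "no samples" }
    '
}

# Synthetic sample input for illustration only.
sample='[ 1s ] thds: 48 tps: 100.00 qps: 2000.00 lat (ms,99%): 5.00 err/s: 0.00 reconn/s: 0.00
[ 2s ] thds: 48 tps: 200.00 qps: 4000.00 lat (ms,99%): 4.00 err/s: 0.00 reconn/s: 0.00'

result=$(printf '%s\n' "$sample" | avg_tps)
echo "average tps: $result"
```

With a real run, the sysbench output would be redirected to a file and piped into `avg_tps` in place of the sample.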

12. Performance Results

13. Effect of Different Block Sizes (4KB/16KB)
Comparison of results for the MyRocks engine with rocksdb_block_size 4KB and 16KB
Chart setup: Percona Server 5.7.22/RocksDB; direct, no_binlog; sysbench-TPCC/5000 warehouses, 1 socket, 48 DB connections
• Reducing the block size to 4KB can help on setups with persistent memory and no block cache, providing up to 10% improvement in throughput and latency for both lz4 and uncompressed datasets.
• Reducing the block size may increase the dataset size by up to 10%.
• When the block cache is enabled, RocksDB amortizes reads, so the difference between 16KB and 4KB is negligible.
• Note: the filesystem page cache is disabled. The top-level block cache holds uncompressed data. The block cache uses only DRAM, not the second-level cache (which can be on different media).
• MyRocks performs well with a page size of 4KB or 16KB.
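The "up to 10%" dataset growth can be checked against the approximate on-disk sizes listed on the Benchmark Environment slide (90GB and 98GB for lz4, 351GB and 353GB uncompressed). The sketch below only redoes that arithmetic; the function name `growth_pct` is illustrative.

```shell
#!/bin/sh
# Percentage growth in dataset size when moving from 16KB to 4KB blocks.
# Arguments: size at 16KB, size at 4KB (both in GB, from the benchmark slide).
growth_pct() {
    LC_ALL=C awk -v old="$1" -v new="$2" \
        'BEGIN { printf "%.1f\n", (new - old) / old * 100 }'
}

lz4_growth=$(growth_pct 90 98)       # lz4: 90GB -> 98GB
nocomp_growth=$(growth_pct 351 353)  # uncompressed: 351GB -> 353GB

echo "lz4 dataset growth at 4KB:          ${lz4_growth}%"
echo "uncompressed dataset growth at 4KB: ${nocomp_growth}%"
```

On these figures the growth is about 8.9% for the lz4 dataset and under 1% for the uncompressed one, which is consistent with the "up to 10%" bound.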

14. Compression vs. No Compression
Comparison of results for the MyRocks engine with lz4 compression and uncompressed
Chart setup: Percona Server 5.7.22/RocksDB; direct, no_binlog; sysbench-TPCC/5000 warehouses, 1 socket, 48 DB connections
• Results on persistent memory show almost no difference between the lz4 and uncompressed datasets.
Compression facts:
• The lz4 dataset is 4x smaller than the uncompressed one
• lz4 reduces the read rate by up to 2.5x and the write rate by up to 2.5x
• The CPU overhead of lz4 compression is only 2-3%
• Compression decreases the number of writes, which improves media endurance while maintaining a similar level of throughput and latency compared with uncompressed data. It also saves storage space and costs.
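The "4x smaller" figure can be cross-checked against the approximate dataset sizes from the Benchmark Environment slide. The sketch below computes the uncompressed-to-lz4 size ratio for both block sizes; the function name `ratio` is illustrative.

```shell
#!/bin/sh
# Ratio of uncompressed size to lz4 size, using the sizes (GB) from the
# benchmark slide: nocomp/16KB 351, lz4/16KB 90, nocomp/4KB 353, lz4/4KB 98.
ratio() {
    LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

ratio_16k=$(ratio 351 90)
ratio_4k=$(ratio 353 98)

echo "16KB blocks: uncompressed is ${ratio_16k}x the lz4 size"
echo "4KB blocks:  uncompressed is ${ratio_4k}x the lz4 size"
```

The ratios come out at roughly 3.9x and 3.6x, so "4x" is a rounded figure that fits the 16KB case more closely than the 4KB case.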

15. Moving Read Cache to Persistent Memory
Comparison of results for the MyRocks engine with the block cache disabled (0G) and enabled (30G)
Chart setup: Percona Server 5.7.22/RocksDB; direct, no_binlog; sysbench-TPCC/5000 warehouses, 1 socket, 48 DB connections
• rocksdb_block_cache
  • Block cache enabled: the advantage of persistent memory over NVMe is up to 25%
  • Block cache disabled: persistent memory shows a slight drop in performance versus runs with the block cache, and at the same time outperforms NVMe by up to 4x
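The deck does not include the server configuration used for these runs. As a hedged illustration, the sketch below emits a `my.cnf` fragment for one benchmark variant using MyRocks system variables. Mapping the slides' "direct" label to the two direct I/O settings is an assumption, and the values shown are examples rather than the authors' settings.

```shell
#!/bin/sh
# Emit a my.cnf fragment for one benchmark variant.
#   $1: block cache size (for example 30G)
#   $2: block size in bytes (4096 or 16384)
# Assumption: "direct" on the slides means direct I/O for reads and for
# flush/compaction; the deck does not list the exact settings.
myrocks_fragment() {
    cat <<EOF
[mysqld]
rocksdb_block_cache_size=$1
rocksdb_block_size=$2
rocksdb_use_direct_reads=ON
rocksdb_use_direct_io_for_flush_and_compaction=ON
EOF
}

fragment=$(myrocks_fragment 30G 4096)
printf '%s\n' "$fragment"
```

The output could be appended to a test instance's configuration file before restarting the server. How the "0G" variant was configured is not stated in the deck.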

16. Conclusion
• Up to 5.7x better query latency using persistent memory vs. NVMe
• Up to 3x more transactions/queries per second vs. NVMe
• Persistent memory as storage saw up to 59% better CPU utilization
  • Removed NVMe %iowait overhead
• Minimal compression overhead
• Storage cost savings with compression enabled
• Reduce DRAM footprint by disabling MyRocks caching

17. Future Work

18. Call to Action
• Community driven persistent memory enablement
• Atomicity & Immediate Persistence
• Faster Reads/Writes & Logging
• Faster Replication & Consistency (RDMA)
• Data Tiering
• Higher IOPS/TPS/QPS, Lower Latency
• Many more innovations

19. Q&A

20. Resources
• Intel Cascade Lake & Optane DC Persistent Memory are available
  • In the Cloud
  • From your OEM/ODM
• Persistent Memory Home – https://pmem.io
• Docs - https://docs.pmem.io
  • Getting Started, User Guides, Tutorials, etc
• Persistent Memory Development Kit (PMDK) – https://pmem.io/pmdk
• Intel Developer Zone (Persistent Memory) - https://software.intel.com/pmem
• PMEM Google Group - https://groups.google.com/forum/#!forum/pmem

21. Please Rate Our Session