Hybrid collaborative tiered storage with Alluxio

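Applications that read data from AWS S3 or Alibaba Cloud OSS usually hit serious performance problems, since every read has to cross the remote network. Alluxio can provide a transparent data caching layer that automatically caches the remote OSS/S3 data being read. But when does Alluxio itself pull the remote data? Does it cache everything by default, or cache on demand? This deck introduces Alluxio's tiered storage concept and combines it with the ZFS file system to maximize performance and reduce application development effort.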


1 .Hybrid collaborative tiered storage with Alluxio
Thai Bui
Data Engineer @ Bazaarvoice

2 .Bazaarvoice
● Founded in 2005 in Austin, TX
● Digital marketing SaaS platforms for ratings and reviews
  ○ Display & syndicate reviews from brands to retailer websites
  ○ Reporting & analytics on consumers, reviews, products, etc.
● 2,600 client websites
● 5.4 billion product page views each month
● 900 million unique shoppers each month

3 .Reporting & analytics on S3
When you have 100s of TB of data on S3
● Just listing the files is slow
● Download speed in EC2 is limited (50-150Mb/s per node)
● No concept of cache
● No concept of data locality

4 .AWS S3 : The Need For Speed
● Add tiered storage to S3
  ○ Hot, warm, cold storage (fastest, fast, and not so fast)
  ○ Metadata cache
  ○ Data cache
● Keep data local
  ○ In the same machine, not via the Ethernet cable
● Compatible with existing services
  ○ Hadoop, Spark, Hive, Presto, etc.
● Adaptive & highly configurable
  ○ Symlink for S3

5 .Overview
● Alluxio
  ○ Distributed data storage
  ○ Hadoop compatible
  ○ By AMPLab
● ZFS
  ○ OS-level file system
  ○ Volume manager
  ○ By Sun Microsystems
● Both are open-source
[Diagram labels: App1, App2, Spark; Metastore; Cold: S3; Hot & Warm: Alluxio, ZFS]

6 .Alluxio : The tiered-storage layer
● Support for native filesystem and Hadoop filesystem
● Distributed and can be installed on every node
  ○ Provides data locality
● Mount S3, HDFS, etc. to Alluxio
  ○ Think symlink. No data movement.
● Use Hive metastore to partition data into hot/warm and cold regions
  ○ Acts as a remote tiered-storage layer

7 .ZFS : The acceleration layer
● Both a filesystem & a volume manager
  ○ Mirror write to 2 SSDs -> 2x read speed
● Works in Linux kernel space
  ○ Works with RAM to accelerate read/write
  ○ Auto promote/demote blocks between RAM and other storage
  ○ Used with local NVMe SSD if data is not in RAM
  ○ Acts as a local tiered-storage layer
● Extremely reliable
  ○ Automatic block checksum & repair

8 .ZFS + NVMe: Micro benchmark
i3.4xlarge, up to 10 Gbit network, 2 x 1.9 TB NVMe SSD
● Baseline w/ EBS
  ○ 135 MB/s write (dd if=/dev/zero of=/tmp/test1.img bs=1G count=1 oflag=dsync)
  ○ 157 MB/s read (dd if=/tmp/test1.img of=/dev/zero bs=8k)
● ZFS + 2 mirrored NVMe SSD
  ○ 820 MB/s write (dd if=/dev/zero of=/alluxio/fs/test1.img bs=1G count=1)
  ○ 1.7 GB/s read (dd if=/alluxio/fs/test1.img of=/dev/zero bs=1G count=1)
● 4x write, 10x read compared to EBS
● 10-15x compared to S3
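For readers who want to repeat a similar sequential write/read measurement without dd, the following is a minimal Python sketch. The function name `measure_throughput` and its parameters are hypothetical and not part of the original benchmark; the numbers it produces are not comparable to the dd figures on the slide, and the read pass will usually be served from the OS or ZFS cache unless caches are dropped first.

```python
import os
import tempfile
import time


def measure_throughput(directory, size_mb=64, block_kb=1024):
    """Write, then read back, a temporary file in `directory`.

    Returns (write_mb_per_s, read_mb_per_s). Hypothetical helper for
    illustration only; it is not the benchmark used in the slides.
    """
    block = b"\0" * (block_kb * 1024)
    blocks = (size_mb * 1024) // block_kb
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        # Sequential write, forced to stable storage before the timer stops.
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(blocks):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())
        write_seconds = time.perf_counter() - start

        # Sequential read; likely a cache hit because the file was just written.
        start = time.perf_counter()
        total_bytes = 0
        with open(path, "rb") as f:
            while True:
                chunk = f.read(len(block))
                if not chunk:
                    break
                total_bytes += len(chunk)
        read_seconds = time.perf_counter() - start
    finally:
        os.remove(path)

    written_mb = blocks * block_kb / 1024
    read_mb = total_bytes / (1024 * 1024)
    return written_mb / write_seconds, read_mb / read_seconds
```

Pointing `directory` at the ZFS mount (`/alluxio/fs` in the slide) and then at an EBS-backed path such as `/tmp` would give a rough side-by-side comparison under the same caveats.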

9 .With ZFS
[Diagram: Native/Hadoop Filesystem API -> Alluxio (user space) -> ZFS (kernel space); Hot tier: RAM, Warm tier: NVMe SSD, with promote/demote between the two]
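The promote/demote idea between a hot RAM tier and a warm NVMe tier can be illustrated with a toy two-tier LRU cache. This is only a sketch under stated assumptions: the class `TwoTierCache` and its methods are hypothetical, and the policy shown (plain LRU in each tier) is a deliberate simplification, not the algorithm ZFS actually uses.

```python
from collections import OrderedDict


class TwoTierCache:
    """Toy hot/warm cache: LRU per tier, promote on read, demote on eviction.

    Hypothetical illustration only; this is not ZFS's caching algorithm.
    """

    def __init__(self, hot_capacity, warm_capacity):
        self.hot = OrderedDict()   # stands in for RAM
        self.warm = OrderedDict()  # stands in for NVMe SSD
        self.hot_capacity = hot_capacity
        self.warm_capacity = warm_capacity

    def _put_warm(self, key, value):
        self.warm[key] = value
        self.warm.move_to_end(key)
        # Blocks that fall out of the warm tier are dropped entirely.
        while len(self.warm) > self.warm_capacity:
            self.warm.popitem(last=False)

    def _put_hot(self, key, value):
        self.hot[key] = value
        self.hot.move_to_end(key)
        # Demote the least recently used hot blocks to the warm tier.
        while len(self.hot) > self.hot_capacity:
            old_key, old_value = self.hot.popitem(last=False)
            self._put_warm(old_key, old_value)

    def read(self, key, fetch):
        """Return (value, tier), where tier is 'hot', 'warm', or 'miss'."""
        if key in self.hot:
            self.hot.move_to_end(key)
            return self.hot[key], "hot"
        if key in self.warm:
            # Promote a warm block back into the hot tier.
            value = self.warm.pop(key)
            self._put_hot(key, value)
            return value, "warm"
        # Not cached locally: fetch from the slow backing store.
        value = fetch(key)
        self._put_hot(key, value)
        return value, "miss"
```

Here `fetch` stands in for whatever reads the block from the slower layer below, for example S3.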

10 .With Hive
[Diagram: Hive Metastore points the last 30 days of data at Alluxio (Hot & Warm) and data older than 30 days at S3 (Cold)]
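One way to picture the 30-day split is a small helper that chooses a partition location by age and emits the corresponding Hive `ALTER TABLE ... PARTITION ... SET LOCATION` statement. This is a sketch, not the deck's actual tooling: the table name, the `dt` partition column, the bucket, the Alluxio master host, and both function names are assumptions made up for illustration.

```python
from datetime import date

HOT_DAYS = 30  # "last 30 days" threshold taken from the slide

# Hypothetical storage roots; host, bucket, and paths are placeholders.
ALLUXIO_ROOT = "alluxio://alluxio-master:19998/warehouse/reviews"
S3_ROOT = "s3a://example-bucket/warehouse/reviews"


def partition_location(day, today):
    """Return the storage URI for a daily partition based on its age."""
    age_days = (today - day).days
    root = ALLUXIO_ROOT if age_days <= HOT_DAYS else S3_ROOT
    return f"{root}/dt={day.isoformat()}"


def set_location_ddl(table, day, today):
    """Build the Hive DDL that repoints one partition at its tier."""
    location = partition_location(day, today)
    return (
        f"ALTER TABLE {table} PARTITION (dt='{day.isoformat()}') "
        f"SET LOCATION '{location}'"
    )
```

Running such a helper on a schedule would move partitions from the Alluxio-backed location to the S3 location in the metastore as they age past the threshold; queries would keep using the same table name throughout.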

11 .CPU/IO Monitoring

12 .Tiered storage Monitoring

13 .Alluxio Monitoring

14 .Hive Monitoring & Performance
Scanning 5G of data in tiered storage, 350M rows, fewer projections
Scanning 200G of data in tiered storage, 500M rows, select *

15 .Scanning 35G of data in S3, 1.6B rows, count distinct
Metadata/split calculation ops 60s, majority of the time spent on scanning S3

16 .Result
● 5-10X read improvement in Hive
  ○ Worker can short-circuit and read directly from ZFS instead of S3
  ○ Move compute to the data
● Easy to debug, with feedback loop, collaborative
  ○ Data publishers + data analysts/scientists
● Good for iterating over the same data set multiple times
  ○ Machine learning
  ○ Exploratory analysis
● Gives us control over S3
  ○ More recent data should be faster to access

17 .Questions?
