Alluxio 2.0 Outlook

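Dr. Bin Fan will share the features Alluxio 2.0 focuses on and the challenges it faces, and will present the goals, design, and progress of the major features the developer community is working on: upgrading the RPC system, full support for asynchronous writes, data replica management, and a self-managed HA mode (with no dependency on ZooKeeper or HDFS). As a core developer of the Alluxio open source project, he will also share lessons learned and engineering best practices for distributed systems development that the Alluxio team has accumulated over the past several years.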


1.Alluxio 2.0 in a Nutshell Bin Fan binfan@alluxio.com

2.About Me Bin Fan Alluxio Founding Member CS PhD @ CMU Previously worked at Google Twitter: binfan Email: binfan@alluxio.com

3.Company Overview Founded Feb. 2015 – Haoyuan Li PhD research project “Tachyon” at UC Berkeley AMPLab Venture Backed Andreessen Horowitz etc. Open Source Business Model Project site: www.alluxio.org Open Sourced in Dec. 2012 Open source v1.0 released Feb. 2016 Latest stable version v1.8.1 in Sept. 2018 Office in San Mateo, CA Team: Google, Palantir, VMware, AMD, Cisco…

4.Fast Growing Open Source Project in Data Eco-System Fastest Growing open-source project in the data ecosystem Running in the world’s largest production clusters 800+ Contributors from 100+ organizations 10/27/18

5.Agenda 1 Overview 2 Case Study 3 Architecture 4 What’s new in 2.0 5 Lessons

6.Emerging Data Ecosystem: Big Data + ML Many Compute Frameworks Many Storage Systems Most not co-located

7.Moving to Cloud A turnkey solution to self-managed data platforms on IaaS Pros: Cheaper, Scalable, Easier to maintain Cons: Vendor-dependent, Many apps are not cloud-native, Data locality

8.Problems in Data Ecosystem Complexity: Costly to integrate new compute or storage; Hard to keep data sources plug-and-play; Complicated to create data pipelines Efficiency: Slow and expensive to access remote data repeatedly; Data locality remains questionable; Potential performance penalty and semantics mismatch

9.How to Address the Challenges A unified data access layer

10.Alluxio as a New Data Access Layer Local Application: VFS, OS Buffer Cache, Disk Device Distributed Application: VDFS (Alluxio), Persistent Storage

11.Alluxio in Data Ecosystem Apps only talk to Alluxio Simple Add/Remove No App Changes Highest performance in Memory No Lock-in

12.Technology Overview

13.Alluxio Innovations Storage Unification: Bring all files into a single interface Common Data API: Interact with any data using one API Intelligent Cache: Accelerate slow data transparently

14.Alluxio Innovation: Storage Unification Enables effective data management across different storages

15.Alluxio Innovation: Common Data API Convert from client-side interface to native storage Client-side interfaces: Big Data Filesystem API, POSIX Filesystem API Connectors: HDFS Connector, S3A Connector, Swift Connector, Google Cloud Connector
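
The connector idea on this slide can be illustrated with a toy mount table that routes one client-side path namespace to different storage backends. This is a minimal sketch, not Alluxio's actual API: `UnderStore`, `InMemoryStore`, and `UnifiedNamespace` are hypothetical names, and in-memory dictionaries stand in for real systems such as HDFS or S3.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class UnderStore(ABC):
    """Hypothetical connector interface; one implementation per storage system."""

    @abstractmethod
    def read(self, path: str) -> bytes: ...

    @abstractmethod
    def write(self, path: str, data: bytes) -> None: ...


class InMemoryStore(UnderStore):
    """Stand-in for a real backend such as HDFS or S3 (assumption for the demo)."""

    def __init__(self) -> None:
        self.files: dict[str, bytes] = {}

    def read(self, path: str) -> bytes:
        return self.files[path]

    def write(self, path: str, data: bytes) -> None:
        self.files[path] = data


class UnifiedNamespace:
    """Routes a single path namespace to mounted stores by longest prefix."""

    def __init__(self) -> None:
        self.mounts: dict[str, UnderStore] = {}

    def mount(self, prefix: str, store: UnderStore) -> None:
        self.mounts[prefix.rstrip("/")] = store

    def _resolve(self, path: str) -> tuple[UnderStore, str]:
        # Try the longest mount point first so nested mounts win.
        for prefix in sorted(self.mounts, key=len, reverse=True):
            if path == prefix or path.startswith(prefix + "/"):
                return self.mounts[prefix], path[len(prefix):] or "/"
        raise FileNotFoundError(path)

    def read(self, path: str) -> bytes:
        store, rel = self._resolve(path)
        return store.read(rel)

    def write(self, path: str, data: bytes) -> None:
        store, rel = self._resolve(path)
        store.write(rel, data)


# The application uses one API; the mount table decides where data lives.
ns = UnifiedNamespace()
hdfs_like, s3_like = InMemoryStore(), InMemoryStore()
ns.mount("/warehouse", hdfs_like)
ns.mount("/warehouse/archive", s3_like)
ns.write("/warehouse/a.csv", b"hot")
ns.write("/warehouse/archive/b.csv", b"cold")
```

Adding or removing a backend in this sketch only changes the mount table, which mirrors the "no app changes" point made earlier in the deck.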

16.Alluxio Innovation: Intelligent Cache Local performance from remote data using multi-tier storage Tiers: RAM (hot), SSD (warm), HDD (cold) Read & Write Buffering Transparent to App Policies for pinning, promotion/demotion, TTL
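
The pinning and promotion/demotion policies can be sketched with a toy tiered cache. This is an illustration under stated assumptions, not Alluxio's implementation: `TieredCache` is a hypothetical class, capacities are counted in blocks rather than bytes, eviction is plain LRU, and TTL is left out for brevity.

```python
from collections import OrderedDict


class TieredCache:
    """Toy multi-tier cache: tier 0 is fastest (RAM), the last is slowest (HDD)."""

    def __init__(self, capacities, names=("RAM", "SSD", "HDD")):
        self.names = list(names)
        self.capacities = list(capacities)
        # One LRU-ordered map per tier; the oldest entry comes first.
        self.tiers = [OrderedDict() for _ in self.capacities]
        self.pinned = set()

    def pin(self, key):
        """Pinned blocks are never chosen as demotion victims."""
        self.pinned.add(key)

    def tier_of(self, key):
        for i, tier in enumerate(self.tiers):
            if key in tier:
                return self.names[i]
        return None

    def _remove(self, key):
        for tier in self.tiers:
            if key in tier:
                return tier.pop(key)
        return None

    def _insert(self, level, key, value):
        if level >= len(self.tiers):
            return  # Fell off the coldest tier: evicted from the cache.
        tier = self.tiers[level]
        tier[key] = value
        while len(tier) > self.capacities[level]:
            # Demote the least recently used unpinned block one tier down.
            victim = next((k for k in tier if k not in self.pinned), None)
            if victim is None:
                break  # Everything here is pinned; let the tier overflow.
            self._insert(level + 1, victim, tier.pop(victim))

    def put(self, key, value):
        self._remove(key)
        self._insert(0, key, value)

    def get(self, key):
        value = self._remove(key)
        if value is not None:
            self._insert(0, key, value)  # Promote to the top tier on access.
        return value


cache = TieredCache(capacities=[1, 1, 1])
for name in ("a", "b", "c"):
    cache.put(name, name.encode())
placement_before = {k: cache.tier_of(k) for k in "abc"}
cache.get("a")  # Reading a cold block promotes it and demotes the others.
placement_after = {k: cache.tier_of(k) for k in "abc"}
```

With one block per tier, the three writes leave `c` in RAM, `b` on SSD, and `a` on HDD; reading `a` then moves it to RAM and pushes `c` and `b` down one tier each.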

17.Case Study

18.100+ Known Production Deployments AND MORE!

19.Machine Learning Case Study – A Top Hedge Fund Challenge – Slow training of models for algorithmic trading in a $46B data-driven hedge fund Data access was slow, costing them $$ in compute cost and lower modeler productivity Solution – With Alluxio, data access is 10-30X faster Impact – Increased efficiency of ML training, lowered compute cost and increased modeler productivity, resulting in a 14-day ROI for Alluxio Diagram: Spark, HDFS, Mesos, Public Internet Confidential © Alluxio, Inc. All Rights Reserved.

20.Big Data Case Study – Challenge – Gain end-to-end view of business with large volume of data Queries were slow / not interactive, resulting in operational inefficiency Solution – ETL data from Teradata to Alluxio Impact – Faster time to market – “Now we don’t have to work Sundays” Use case: http://bit.ly/2oMx95W Diagram: Spark, Teradata

21.Machine Learning Case Study – Challenge – Large training dataset on Azure blob store, not accessible from TensorFlow directly Repeated data access, no caching Solution – Alluxio POSIX API to serve TensorFlow Impact – Enabler for Deep Learning workloads in their environment Diagram: TensorFlow, Azure Blob Store Read more at https://blogs.msdn.microsoft.com/cloudai/2018/05/01/tensorflow-on-azure-enabling-blob-storage-via-alluxio/

22.HPC/Machine Learning Partnership – Alluxio maximizes GPU investment: Self-serve data access for data scientists Rapid integration of new data sources Improved memory management & performance Learn more at https://www.slideshare.net/Alluxio/flexible-and-fast-storage-for-deep-learning-with-alluxio

23.

24.Architecture: A Distributed Storage System Under the Hood

25.Alluxio Architecture Components: Alluxio Master, Standby Master, ZooKeeper, Alluxio Workers (each with RAM / SSD / HDD), Under Store Legend: Control Path, Data Path

26.Read Data not Cached in Alluxio + Caching Diagram: Application, Alluxio Client, Alluxio Worker (RAM / SSD / HDD), Under Store

27.Read Cached Data in Alluxio Diagram: Application, Alluxio Client, Alluxio Worker (RAM / SSD / HDD)
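
The two read paths (a miss that fetches from the under store and caches the result, then a hit served from the worker) can be sketched as a read-through cache. This is a simplified model under assumptions, not Alluxio code: `CountingUnderStore` and `CachingWorker` are hypothetical names, whole files are cached instead of blocks, and the client and master are omitted.

```python
class CountingUnderStore:
    """Stand-in for remote persistent storage; counts reads to show the effect."""

    def __init__(self, files):
        self.files = dict(files)
        self.reads = 0

    def read(self, path):
        self.reads += 1
        return self.files[path]


class CachingWorker:
    """Serves reads from local storage, falling back to the under store."""

    def __init__(self, under_store):
        self.under_store = under_store
        self.cache = {}

    def read(self, path):
        if path in self.cache:
            # Cached data: no trip to the under store.
            return self.cache[path], "hit"
        # Not cached: fetch remotely, then keep a local copy for next time.
        data = self.under_store.read(path)
        self.cache[path] = data
        return data, "miss"


under = CountingUnderStore({"/data/train.csv": b"1,2,3"})
worker = CachingWorker(under)
first = worker.read("/data/train.csv")
second = worker.read("/data/train.csv")
```

In this sketch the second read returns the same bytes while the under store is contacted only once.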

28.Write Data Only to Alluxio Diagram: Application, Alluxio Client, Alluxio Worker (RAM / SSD / HDD)

29.Write to Alluxio and Under Store Synchronously Diagram: Application, Alluxio Client, Alluxio Worker (RAM / SSD / HDD), Under Store
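
The difference between writing only to Alluxio and writing synchronously to both Alluxio and the under store can be sketched as follows. This is a toy model with hypothetical names (`WriteMode`, `WritingWorker`); the mode names are chosen for this illustration and are not taken from the slides, and dictionaries stand in for worker storage and the under store.

```python
from enum import Enum


class WriteMode(Enum):
    ALLUXIO_ONLY = "alluxio_only"  # Fast, but data is not persisted.
    SYNC_THROUGH = "sync_through"  # Persisted before the write returns.


class WritingWorker:
    def __init__(self):
        self.cache = {}        # Worker-local RAM / SSD / HDD.
        self.under_store = {}  # Remote persistent storage.

    def write(self, path, data, mode):
        self.cache[path] = data
        if mode is WriteMode.SYNC_THROUGH:
            # The call returns only after the under store also has the data.
            self.under_store[path] = data

    def lose_cache(self):
        """Simulate a worker restart that drops locally stored data."""
        self.cache.clear()


w = WritingWorker()
w.write("/tmp/scratch", b"temp", WriteMode.ALLUXIO_ONLY)
w.write("/out/result", b"final", WriteMode.SYNC_THROUGH)
w.lose_cache()
survivors = sorted(w.under_store)
```

After the simulated restart only the synchronously written file remains, which shows the trade-off between write speed and durability in this model.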