What(’s in) the Cloud?


1. CS 525 Advanced Distributed Systems, Spring 2017 • Indranil Gupta (Indy) • Lecture 2: What(’s in) the Cloud? • January 19, 2017 • All slides © IG

2.

3. The Hype! • Gartner (2016) – Cloud shift will affect $1 Trillion in IT by 2020 – http://www.gartner.com/newsroom/id/3384720 • Public Cloud Market $208 B by 2016 end – http://www.informationweek.com/cloud/public-cloud-market-worth-$208-billion-by-end-of-2016/d/d-id/1326923 • Hadoop market size to reach $40.6 B by 2021 – http://www.marketsandmarkets.com/PressReleases/hadoop.asp • Everyone is using public clouds: startups, large companies, non-profits, governments

4. Many Cloud Providers • AWS: Amazon Web Services – EC2: Elastic Compute Cloud – S3: Simple Storage Service – EBS: Elastic Block Storage • Microsoft Azure • Google Cloud/Compute Engine • Rightscale, Salesforce, EMC, Gigaspaces, 10gen, Datastax, Oracle, VMware, Yahoo, Cloudera • And many many more!

5. Two Categories of Clouds • Can be either a (i) public cloud, or (ii) private cloud • Private clouds are accessible only to company employees • Public clouds provide service to any paying customer: – Amazon S3 (Simple Storage Service): store arbitrary datasets, pay per GB-month stored – Amazon EC2 (Elastic Compute Cloud): upload and run arbitrary OS images, pay per CPU hour used – Google AppEngine/Compute Engine: develop applications within their AppEngine framework, upload data that will be imported into their format, and run

6. Customers Save Time and $$$ • (Anecdotes from around 2012) • Dave Powers, Associate Information Consultant at Eli Lilly and Company: “With AWS, Powers said, a new server can be up and running in three minutes (it used to take Eli Lilly seven and a half weeks to deploy a server internally) and a 64-node Linux cluster can be online in five minutes (compared with three months internally). … It's just shy of instantaneous.” • Ingo Elfering, Vice President of Information Technology Strategy, GlaxoSmithKline: “With Online Services, we are able to reduce our IT operational costs by roughly 30% of what we’re spending” • Jim Swartz, CIO, Sybase: “At Sybase, a private cloud of virtual servers inside its datacenter has saved nearly $US2 million annually since 2006, Swartz says, because the company can share computing power and storage resources across servers.” • 100s of startups in Silicon Valley can harness large computing resources without buying their own machines.

7. But what exactly IS a cloud?

8. What is a Cloud? • It’s a cluster! • It’s a supercomputer! • It’s a datastore! • It’s superman! • None of the above • All of the above • Cloud = Lots of storage + compute cycles nearby

9. What is a Cloud? • A single-site cloud (aka “Datacenter”) consists of – Compute nodes (grouped into racks) – Switches, connecting the racks – A network topology, e.g., hierarchical – Storage (backend) nodes connected to the network – Front-end for submitting jobs and receiving client requests – (Often called 3-tier architecture) – Software Services • A geographically distributed cloud consists of – Multiple such sites – Each site perhaps with a different structure and services

10. A Sample Cloud Topology • So then, what is a cluster?

11. “A Cloudy History of Time” (timeline figure, axis 1940 to 2012) • The first datacenters! • Timesharing Companies & Data Processing Industry • Clusters • Grids • PCs (not distributed!) • Peer to peer systems • Clouds and datacenters

12. “A Cloudy History of Time” (timeline figure, axis 1940 to 2012) • First large datacenters: ENIAC, ORDVAC, ILLIAC – Many used vacuum tubes and mechanical relays • Data Processing Industry – 1968: $70 M. 1978: $3.15 Billion • Timesharing Industry (1975): – Market Share: Honeywell 34%, IBM 15%, Xerox 10%, CDC 10%, DEC 10%, UNIVAC 10% – Honeywell 6000 & 635, IBM 370/168, Xerox 940 & Sigma 9, DEC PDP-10, UNIVAC 1108 • Berkeley NOW Project • Supercomputers • Server Farms (e.g., Oceano) • Grids (1980s-2000s): – GriPhyN (1970s-80s) – Open Science Grid and Lambda Rail (2000s) – Globus & other standards (1990s-2000s) • P2P Systems (90s-00s) – Many Millions of users – Many GB per day • Clouds

13. Trends: Technology • Doubling Periods – storage: 12 mos, bandwidth: 9 mos, and (what law is this?) CPU compute capacity: 18 mos • Then and Now – Bandwidth • 1985: mostly 56Kbps links nationwide • 2017: Tbps links widespread – Disk capacity • Today’s PCs have TBs, far more than a 1990 supercomputer
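To get a feel for what these doubling periods imply, here is a minimal Python sketch that compounds each one over a fixed horizon. The function names and the 10-year horizon are illustrative assumptions, not part of the lecture; only the three doubling periods come from the slide.

```python
# Doubling periods from the slide, in months.
DOUBLING_MONTHS = {"storage": 12, "bandwidth": 9, "cpu": 18}


def growth_factor(doubling_months: float, years: float) -> float:
    """Return how many times capacity multiplies over `years`,
    assuming it doubles every `doubling_months` months."""
    return 2 ** (years * 12 / doubling_months)


def growth_over(years: float) -> dict:
    """Growth factor for every resource in DOUBLING_MONTHS (hypothetical helper)."""
    return {name: growth_factor(m, years) for name, m in DOUBLING_MONTHS.items()}


if __name__ == "__main__":
    for name, factor in growth_over(10).items():
        print(f"{name}: about {factor:,.0f}x over 10 years")
```

Under these assumptions bandwidth outgrows storage, which in turn outgrows CPU capacity, which is one reason the bottleneck in a datacenter shifts over time.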

14. Trends: Users • Then and Now Biologists: – 1990: were running small single-molecule simulations – 2017: CERN’s Large Hadron Collider producing 30 PB/year • https://home.cern/about/computing

15. Prophecies • In 1965, MIT's Fernando Corbató and the other designers of the Multics operating system envisioned a computer facility operating “like a power company or water company”. • Plug your thin client into the computing Utility and Play your favorite Intensive Compute & Communicate Application – Have today’s clouds brought us closer to this reality? Think about it.

16. Four Features New in Today’s Clouds • I. Massive scale. • II. On-demand access: Pay-as-you-go, no upfront commitment. – And anyone can access it • III. Data-intensive Nature: What was MBs has now become TBs, PBs and EBs. – Daily logs, forensics, Web data, etc. – Humans have data numbness: Wikipedia (large) compressed is only about 10 GB! • IV. New Cloud Programming Paradigms: MapReduce/Hadoop, NoSQL/Cassandra/MongoDB and many others. – High in accessibility and ease of programmability – Lots of open-source • Combination of one or more of these gives rise to novel and unsolved distributed computing problems in cloud computing.

17. I. Massive Scale • Facebook [GigaOm, 2012] – 30K in 2009 -> 60K in 2010 -> 180K in 2012 • Microsoft [NYTimes, 2008] – 150K machines – Growth rate of 10K per month – 80K total running Bing • Yahoo! [2009]: – 100K – Split into clusters of 4000 • AWS EC2 [Randy Bias, 2009] – 40K machines – 8 cores/machine • eBay [2012]: 50K machines • HP [2012]: 380K in 180 DCs • Google: A lot

18. What does a datacenter look like from inside? • A virtual walk through a datacenter • Reference: http://gigaom.com/cleantech/a-rare-look-inside-facebooks-oregon-data-center-photos-video/

19. Servers • Front • Back • In • Some highly secure (e.g., financial info)

20. Power • Off-site • On-site • WUE = Annual Water Usage / IT Equipment Energy (L/kWh) – low is good • PUE = Total facility Power / IT Equipment Power – low is good (e.g., Google ~1.1)
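The two efficiency ratios on this slide can be computed directly. The following Python sketch is only an illustration: the function names, argument names, and sample figures are assumptions made for the example, not measurements from any real facility.

```python
def pue(total_facility_power_kw: float, it_equipment_power_kw: float) -> float:
    """Power Usage Effectiveness = total facility power / IT equipment power.
    1.0 would mean every watt reaches IT equipment; lower is better."""
    if it_equipment_power_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_power_kw / it_equipment_power_kw


def wue(annual_water_liters: float, it_equipment_energy_kwh: float) -> float:
    """Water Usage Effectiveness = annual water usage / IT equipment energy,
    in liters per kWh; lower is better."""
    if it_equipment_energy_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return annual_water_liters / it_equipment_energy_kwh


if __name__ == "__main__":
    # Hypothetical facility: 1,100 kW drawn in total, 1,000 kW of it by IT gear.
    print(f"PUE = {pue(1100, 1000):.2f}")
    # Hypothetical: 2 million liters of water per year for 8.76 million kWh of IT energy.
    print(f"WUE = {wue(2_000_000, 8_760_000):.3f} L/kWh")
```

A PUE of 1.1, as in the hypothetical first call, means only about 10% of the facility's power goes to overheads such as cooling and power distribution.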

21. Cooling • Air sucked in from top (also, Bugzappers) • Water purified • Water sprayed into air • 15 motors per server bank

22. Extra - Fun Videos to Watch • Microsoft GFS Datacenter Tour (YouTube) – http://www.youtube.com/watch?v=hOxA1l1pQIw • Timelapse of a Datacenter Construction on the Inside (Fortune 500 company) – http://www.youtube.com/watch?v=ujO-xNvXj3g

23. II. On-demand access: *aaS Classification • On-demand: renting a cab vs. (previously) renting a car, or buying one. E.g.: – AWS Elastic Compute Cloud (EC2): a few cents to a few $ per CPU hour – AWS Simple Storage Service (S3): a few cents per GB-month • HaaS: Hardware as a Service – You get access to barebones hardware machines, do whatever you want with them, Ex: Your own cluster – Not always a good idea because of security risks • IaaS: Infrastructure as a Service – You get access to flexible computing and storage infrastructure. Virtualization is one way of achieving this (other ways: cgroups, Kubernetes, Docker, VMs, …). – Ex: Amazon Web Services (AWS: EC2 and S3), Eucalyptus, Rightscale, Microsoft Azure, Google Cloud.

24. II. On-demand access: *aaS Classification • PaaS: Platform as a Service – You get access to flexible computing and storage infrastructure, coupled with a software platform (often tightly coupled) – Ex: Google’s AppEngine (Python, Java, Go) • SaaS: Software as a Service – You get access to software services, when you need them. Often said to subsume SOA (Service Oriented Architectures). – Ex: Google Docs, MS Office on demand

25. III. Data-intensive Computing • Computation-Intensive Computing – Example areas: MPI-based, High-performance computing, Grids – Typically run on supercomputers (e.g., NCSA Blue Waters) • Data-Intensive – Typically store data at datacenters – Use compute nodes nearby – Compute nodes run computation services • In data-intensive computing, the focus shifts from computation to the data: CPU utilization is no longer the most important resource metric, instead I/O is (disk and/or network) – Hadoop clusters in some companies typically have CPU utilization rates of around 20% (while I/O, either network or disk, is being maxed out)

26. IV. New Cloud Programming Paradigms • Easy to write and run highly parallel programs in new cloud programming paradigms: – Google: MapReduce and Sawzall – Amazon: Elastic MapReduce service (pay-as-you-go) – Google (MapReduce) • Indexing: a chain of 24 MapReduce jobs • ~200K jobs processing 50PB/month (in 2006) – Yahoo! (Hadoop + Pig) • WebMap: a chain of several MapReduce jobs • 300 TB of data, 10K cores, many tens of hours – Facebook (Hadoop + Hive) • ~300TB total, adding 2TB/day (in 2008) • 3K jobs processing 55TB/day – Many other frameworks: Storm/Flink/Samza, Spark, TensorFlow, … – NoSQL: MySQL is an industry standard, but Cassandra is 2400 times faster!
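To make the MapReduce paradigm concrete, here is a single-process Python sketch of a word count. It is not the Hadoop or Google MapReduce API; `map_words`, `reduce_counts`, and `run_mapreduce` are hypothetical names chosen for this illustration, and a real framework would run the map and reduce phases in parallel across many machines.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def map_words(line: str) -> Iterator[Tuple[str, int]]:
    """Map phase: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def reduce_counts(word: str, counts: List[int]) -> Tuple[str, int]:
    """Reduce phase: sum all the values emitted for one key."""
    return word, sum(counts)


def run_mapreduce(
    records: Iterable[str],
    mapper: Callable[[str], Iterator[Tuple[str, int]]],
    reducer: Callable[[str, List[int]], Tuple[str, int]],
) -> Dict[str, int]:
    """Run map, then group values by key (the shuffle), then reduce."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in groups.items())


if __name__ == "__main__":
    lines = ["the cloud is a cluster", "the cloud is a datastore"]
    print(run_mapreduce(lines, map_words, reduce_counts))
```

The programmer writes only the two small functions; grouping by key and, in a real system, distribution and fault tolerance are handled by the framework, which is what makes the paradigm easy to use.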

27. Two Categories of Clouds • Can be either a (i) public cloud, or (ii) private cloud • Private clouds are accessible only to company employees • Public clouds provide service to any paying customer • You’re starting a new service/company: should you use a public cloud or purchase your own private cloud?

28. Single site Cloud: to Outsource or Own? • Medium-sized organization: wishes to run a service for M months – Service requires 128 servers (1024 cores) and 524 TB – Same as UIUC CCT (Cloud Computing Testbed) cloud site – All costs circa 2009 • Outsource (e.g., via AWS): monthly cost – S3 costs: $0.12 per GB-month. EC2 costs: $0.10 per CPU hour (costs from 2009) – Storage = $0.12 × 524 × 1000 ~ $62 K – Total = Storage + CPUs = $62 K + $0.10 × 1024 × 24 × 30 ~ $136 K • Own: monthly cost – Storage ~ $349 K / M – Total ~ $1555 K / M + $7.5 K (includes 1 sysadmin / 100 nodes) • using 0.45:0.4:0.15 split for hardware:power:network and 3 year lifetime of hardware
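The outsourcing arithmetic on this slide can be reproduced with a short Python sketch. The unit prices and workload sizes are the slide's 2009 figures; the function and constant names are assumptions made for the example.

```python
S3_DOLLARS_PER_GB_MONTH = 0.12   # 2009 S3 price from the slide
EC2_DOLLARS_PER_CPU_HOUR = 0.10  # 2009 EC2 price from the slide


def outsource_monthly_cost(storage_tb: float = 524, cores: int = 1024,
                           hours_per_day: int = 24, days_per_month: int = 30):
    """Return (storage, compute, total) monthly cost in dollars,
    using 1 TB = 1000 GB as the slide does."""
    storage = S3_DOLLARS_PER_GB_MONTH * storage_tb * 1000
    compute = EC2_DOLLARS_PER_CPU_HOUR * cores * hours_per_day * days_per_month
    return storage, compute, storage + compute


if __name__ == "__main__":
    storage, compute, total = outsource_monthly_cost()
    print(f"storage: ${storage / 1000:.1f} K per month")
    print(f"compute: ${compute / 1000:.1f} K per month")
    print(f"total:   ${total / 1000:.1f} K per month")
```

Before rounding, storage comes to about $62.9 K and the total to about $136.6 K per month, which the slide abbreviates as $62 K and $136 K.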

29. Single site Cloud: to Outsource or Own? • Breakeven analysis: preferable to own if: – $349 K / M < $62 K (storage) – $1555 K / M + $7.5 K < $136 K (overall) • Breakeven points – M > 5.55 months (storage) – M > 12 months (overall) • As a result – Startups use clouds a lot – Cloud providers benefit monetarily most from storage
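The breakeven points follow from solving each inequality for M. The Python sketch below uses a hypothetical helper, `breakeven_months`, and plugs in the unrounded outsourcing costs implied by the 2009 unit prices ($0.12 × 524 × 1000 = $62.88 K for storage, plus $0.10 × 1024 × 24 × 30 = $73.728 K for compute), which is what reproduces the 5.55-month figure.

```python
def breakeven_months(own_fixed_k: float, outsource_monthly_k: float,
                     own_recurring_k: float = 0.0) -> float:
    """Solve own_fixed_k / M + own_recurring_k = outsource_monthly_k for M.
    All amounts are in thousands of dollars; owning is cheaper for larger M."""
    margin = outsource_monthly_k - own_recurring_k
    if margin <= 0:
        raise ValueError("owning never breaks even: recurring cost >= outsourcing cost")
    return own_fixed_k / margin


if __name__ == "__main__":
    storage_m = breakeven_months(349, 62.88)
    overall_m = breakeven_months(1555, 62.88 + 73.728, own_recurring_k=7.5)
    print(f"storage breakeven: M > {storage_m:.2f} months")
    print(f"overall breakeven: M > {overall_m:.2f} months")
```

Storage alone pays for itself in under six months while the whole service takes about a year, which is consistent with the slide's conclusion that storage is where cloud providers benefit most.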