Introduction to Distributed Systems

A distributed system is a collection of entities, each of which is autonomous, programmable, asynchronous and failure-prone, and which communicate through an unreliable communication medium.
1. CS525 Advanced Distributed Systems, Spring 2017 • Indranil Gupta (Indy) • Lecture 1, January 17, 2017 • https://courses.engr.illinois.edu/cs525 • All Slides © IG

2. What is a Distributed System? (examples) • The Internet • Gnutella peer-to-peer system • A sensor network • Datacenter/Cloud

3. Can you name some examples of Operating Systems?

4. Can you name some examples of Operating Systems? … Linux, Windows, Unix, FreeBSD, macOS, 2K, Aegis, Scout, Hydra, Mach, SPIN, OS/2, Express, Flux, Hope, Spring, AntaresOS, EOS, LOS, SQOS, LittleOS, TINOS, PalmOS, WinCE, TinyOS, iOS, …

5. What is an Operating System?

6. What is an Operating System? • User interface to hardware (device driver) • Provides abstractions (processes, file system) • Resource manager (scheduler) • Means of communication (networking) • …

7. FOLDOC definition • The low-level software which handles the interface to peripheral hardware, schedules tasks, allocates storage, and presents a default interface to the user when no application program is running. • The OS may be split into a kernel which is always present and various system programs which use facilities provided by the kernel to perform higher-level house-keeping tasks, often acting as servers in a client-server relationship. • Some would include a graphical user interface and window system as part of the OS, others would not. The operating system loader, BIOS, or other firmware required at boot time or when installing the operating system would generally not be considered part of the operating system, though this distinction is unclear in the case of a roamable operating system such as RISC OS. • The facilities an operating system provides and its general design philosophy exert an extremely strong influence on programming style and on the technical cultures that grow up around the machines on which it runs.

8. Can you name some examples of Distributed Systems?

9. Can you name some examples of Distributed Systems? • Client-server (e.g., NFS) • The Internet • The Web • A sensor network • DNS • BitTorrent (peer-to-peer overlay) • Datacenters • Hadoop

10. What is a Distributed System?

11. FOLDOC definition • A collection of (probably heterogeneous) automata whose distribution is transparent to the user so that the system appears as one local machine. This is in contrast to a network, where the user is aware that there are several machines, and their location, storage replication, load balancing and functionality is not transparent. Distributed systems usually use some kind of client-server organization.

12. Textbook definitions • A distributed system is a collection of independent computers that appear to the users of the system as a single computer. [Andrew Tanenbaum] • A distributed system is several computers doing something together. Thus, a distributed system has three primary characteristics: multiple computers, interconnections, and shared state. [Michael Schroeder]

13. Unsatisfactory • Why are these definitions short? • Why do these definitions look inadequate to us? • Because we are interested in the insides of a distributed system: – algorithmics – design and implementation – maintenance – study

14. “I shall not today attempt further to define the kinds of material I understand to be embraced within that shorthand description; and perhaps I could never succeed in intelligibly doing so. But I know it when I see it…” [Potter Stewart, Associate Justice, US Supreme Court (talking about his interpretation of a technical term laid down in the law, case Jacobellis versus Ohio, 1964)]

15. A working definition for us • A distributed system is a collection of entities, each of which is autonomous, programmable, asynchronous and failure-prone, and which communicate through an unreliable communication medium. • Our interest in distributed systems involves: – algorithmics, design and implementation, maintenance, study • Entity = a process on a device (PC, PDA, mote) • Communication medium = wired or wireless network

16. A range of interesting problems for Distributed System designers • P2P systems [Gnutella, Kazaa, BitTorrent] • Cloud Infrastructures [AWS, Azure, Google Cloud] • Cloud Storage [Key-value stores, NoSQL, BigTable] • Cloud Programming [MapReduce, Pig, Hive, Storm, Pregel] • Coordination [Paxos] • Routing [Sensor Networks, Internet]

17. A range of challenges • Failures: no longer the exception, but rather the norm • Scalability: 1000s of machines, terabytes of data • Asynchrony: clock skew and clock drift • Security: of data, users, computations, etc.

18. Multicast

19. Multicast • Node with a piece of information to be communicated to everyone • Distributed group of “nodes” = processes at Internet-based hosts

20. Fault-tolerance and Scalability • Multicast sender • Multicast protocol • Nodes may crash • Packets may be dropped • 1000s of nodes

21. Centralized • Simplest implementation • UDP/TCP packets • Problems?
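
To make the "Problems?" prompt concrete, here is a minimal sketch of the centralized approach, assuming nodes are numbered 0..n-1 and node 0 is the sender. The function name `centralized_multicast` and the `drop_prob` parameter are hypothetical, introduced only for illustration; real UDP/TCP sockets are replaced by a simple loop.

```python
import random


def centralized_multicast(n, drop_prob=0.0, seed=0):
    """Node 0 unicasts the message directly to every other node.

    Returns (delivered, sends): the set of nodes that hold the message and
    the number of unicast sends performed by the sender alone.
    Assumption: each packet is lost independently with probability drop_prob
    and is never retransmitted.
    """
    rng = random.Random(seed)
    delivered = {0}          # the sender already has the message
    sends = 0
    for node in range(1, n):
        sends += 1           # all the sending work falls on node 0
        if rng.random() < drop_prob:
            continue         # packet dropped in the network
        delivered.add(node)
    return delivered, sends


if __name__ == "__main__":
    delivered, sends = centralized_multicast(1000)
    print(f"sender performed {sends} sends, {len(delivered)} nodes hold the message")
```

In this sketch the sender's load grows linearly with the group size, and if the sender crashes partway through the loop, the remaining nodes never receive the message.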

22. Tree-Based • e.g., IPmulticast, SRM, RMTP, TRAM, TMTP • Lower load per node • Tree setup and maintenance • UDP/TCP packets • Problems?
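
The trade-off on this slide can be sketched with a toy model. This is not any of the named protocols; it assumes a static, complete tree rooted at node 0 in which node i forwards to children fanout*i + 1 through fanout*i + fanout. The name `tree_multicast` and its parameters are hypothetical.

```python
def tree_multicast(n, fanout=2, crashed=frozenset()):
    """Forward a message down a complete fanout-ary tree rooted at node 0.

    Returns (delivered, sends): the set of nodes that received the message
    and a list with the number of sends performed by each node.
    Assumption: a crashed node neither receives nor forwards, and the tree
    is not repaired.
    """
    delivered = set()
    sends = [0] * n
    frontier = [0]
    while frontier:
        node = frontier.pop()
        if node in crashed:
            continue                      # the whole subtree below is cut off
        delivered.add(node)
        for k in range(1, fanout + 1):
            child = fanout * node + k
            if child < n:
                sends[node] += 1          # load per node is at most fanout
                frontier.append(child)
    return delivered, sends


if __name__ == "__main__":
    ok, sends = tree_multicast(15)
    print(len(ok), "nodes reached, max sends per node:", max(sends))
    cut, _ = tree_multicast(15, crashed={1})
    print(len(cut), "nodes reached when node 1 has crashed")
```

Per-node load is bounded by the fanout, but a single crashed interior node silences its entire subtree until the tree is rebuilt, which is the setup and maintenance cost the slide points at.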

23. A Third Approach • Multicast sender

24. Periodically, transmit to random targets • Gossip messages (UDP)

25. Other nodes do the same after receiving the multicast • Gossip messages (UDP)

27. “Epidemic” Multicast (or “Gossip”) • Protocol rounds (local clock) • b random targets per round • Gossip message (UDP) • Nodes are either infected or uninfected
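
A minimal simulation sketch of the gossip protocol follows. It makes simplifying assumptions: rounds are globally synchronized (the slide says each node uses its local clock), targets are drawn uniformly from all n nodes, and no messages are lost. The function name `gossip_rounds` and its parameters are hypothetical.

```python
import random


def gossip_rounds(n, b=2, seed=0, max_rounds=1000):
    """Push gossip: in every round each infected node sends to b random targets.

    Returns a list whose entry r is the number of infected nodes after
    round r (entry 0 is the initial state, with only the sender infected).
    """
    rng = random.Random(seed)
    infected = {0}                       # node 0 is the multicast sender
    history = [1]
    while len(infected) < n and len(history) <= max_rounds:
        newly = set()
        for _ in range(b * len(infected)):
            # A target may already be infected; that message is redundant.
            newly.add(rng.randrange(n))
        infected |= newly
        history.append(len(infected))
    return history


if __name__ == "__main__":
    history = gossip_rounds(1000, b=2, seed=1)
    print("infected per round:", history)
    print("rounds to reach everyone:", len(history) - 1)
```

Each node sends only b messages per round regardless of group size, so the per-node load stays constant while the infected set grows round by round.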

28. Properties • Claim that this simple protocol: • Is lightweight in large groups • Spreads a multicast quickly • Is highly fault-tolerant
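
One way to probe these claims informally is to inject faults into a toy simulation. The sketch below assumes synchronized rounds, a sender (node 0) that never crashes, a fixed fraction of other nodes crashed from the start, and independent packet loss. The name `gossip_with_faults` and all of its parameters are hypothetical, and a simulation like this illustrates the claims rather than proving them.

```python
import random


def gossip_with_faults(n, b=2, crash_frac=0.0, drop_prob=0.0, seed=0,
                       max_rounds=1000):
    """Push gossip where some nodes are crashed and some packets are dropped.

    Returns (rounds, infected, alive, messages): rounds executed, number of
    infected nodes, number of non-crashed nodes, and total messages sent.
    """
    rng = random.Random(seed)
    # Assumption: node 0 is the sender and never crashes.
    crashed = set(rng.sample(range(1, n), int(crash_frac * (n - 1))))
    alive = n - len(crashed)
    infected = {0}
    rounds = 0
    messages = 0
    while len(infected) < alive and rounds < max_rounds:
        rounds += 1
        newly = set()
        for _ in range(b * len(infected)):    # b sends per infected node
            target = rng.randrange(n)
            messages += 1
            if target in crashed or rng.random() < drop_prob:
                continue                      # wasted or lost message
            newly.add(target)
        infected |= newly
    return rounds, len(infected), alive, messages


if __name__ == "__main__":
    for crash, drop in [(0.0, 0.0), (0.5, 0.0), (0.5, 0.5)]:
        rounds, infected, alive, _ = gossip_with_faults(
            1000, crash_frac=crash, drop_prob=drop, seed=3)
        print(f"crash={crash} drop={drop}: {infected}/{alive} alive nodes "
              f"reached in {rounds} rounds")
```

Because every infected node keeps choosing fresh random targets, crashes and drops slow the spread in this model but do not partition the group the way a crashed interior node does in a tree.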

29. Analysis • From old mathematical branch of Epidemiology [Bailey 75] • Population of $(n+1)$ individuals mixing homogeneously • Contact rate between any individual pair is $\beta$ • At any time, each individual is either uninfected (numbering $x$) or infected (numbering $y$) • Then, $x_0 = n$, $y_0 = 1$, and at all times $x + y = n + 1$ • Infected–uninfected contact turns latter infected, and it stays infected
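
The setup above leads to the standard deterministic treatment of a simple epidemic, sketched here as a worked equation. It writes $\beta$ for the pairwise contact rate and treats $x$ and $y$ as continuous functions of time; both are assumptions of this sketch, since the slide stops at the model definition.

```latex
\begin{align*}
\frac{dx}{dt} &= -\beta x y, \qquad x + y = n + 1, \quad x(0) = n,\ y(0) = 1 \\
x(t) &= \frac{n(n+1)}{n + e^{\beta (n+1) t}} \\
y(t) &= (n + 1) - x(t) = \frac{n+1}{1 + n e^{-\beta (n+1) t}}
\end{align*}
```

At $t = 0$ the solution gives $x = n$ and $y = 1$, matching the initial conditions, and as $t \to \infty$ the infected count $y$ approaches $n + 1$, so under this model everyone eventually receives the multicast.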