Virtualization

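This deck introduces virtualization: what it is and what benefits it brings. Virtualization uses virtualization technology to present one computer as multiple logical computers, and it is one of the outcomes of extensibility research. It can be used to multiplex multiple virtual machines on one hardware machine, as in cloud computing and data center virtualization. The deck then covers how virtualization is done.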

1. Advanced Operating Systems (CS 202): Virtualization

2. Virtualization
• One of the natural consequences of the extensibility research we discussed
• What is virtualization and what are the benefits?

3. Virtualization motivation
• Cost: multiplex multiple virtual machines on one hardware machine
  – Cloud computing, data center virtualization
  – Why not processes?
  – Why not containers?
• Heterogeneity:
  – Allow one machine to support multiple OS’s
  – Maintaining compatibility
• Other: security, migration, energy optimization, customization, …

4. How do we virtualize?
• Create an operating system to multiplex resources among operating systems!
  – Exports a virtual machine to the operating systems
  – Called a hypervisor or Virtual Machine Monitor (VMM)

5. VIRTUALIZATION MODELS

6. Two types of hypervisors
• Type 1: Native (bare metal)
  – Hypervisor runs on top of the bare metal machine
  – e.g., KVM
• Type 2: Hosted
  – Hypervisor is an emulator
  – e.g., VMware, VirtualBox, QEMU

7. Hybrid organizations
• Some hybrids exist, e.g., Xen
  – Mostly bare metal
  – VM0/Dom0 to keep device drivers out of VMM

8. Stepping back – some history
• IBM VM/370 (1970s)
• Microkernels (late 80s/90s)
• Extensibility (90s)
• SimOS (late 90s)
  – Eventually became VMware (2000)
• Xen, VMware, others (2000s)
• Ubiquitous use, cloud computing, data centers, …
  – Makes computing a utility

9. Full virtualization
• Idea: run guest operating systems unmodified
• However, the hypervisor is the real privileged software
• When the OS executes a privileged instruction, it traps to the hypervisor, which executes it on behalf of the OS
• This can be very expensive
• Also, subject to quirks of the architecture
  – Example: x86 fails silently if some privileged instructions execute without privilege
  – E.g., popf

10. Example: Disable Interrupts
• Guest OS tries to disable interrupts
  – The instruction is trapped by the VMM, which makes a note that interrupts are disabled for that virtual machine
• Interrupts arrive for that machine
  – Buffered at the VMM layer until the guest OS enables interrupts
• Other interrupts are directed to VMs that have not disabled them
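
The bookkeeping described on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions: the class `VirtualCPU` and the methods `trap_cli`, `trap_sti`, and `raise_interrupt` are hypothetical names, not part of any real VMM.

```python
from collections import deque


class VirtualCPU:
    """Per-VM interrupt state kept by a hypothetical VMM."""

    def __init__(self, name):
        self.name = name
        self.interrupts_enabled = True
        self.pending = deque()   # interrupts buffered while disabled
        self.delivered = []      # interrupts the guest has actually seen

    def trap_cli(self):
        # Guest executed "disable interrupts": the VMM only records it.
        self.interrupts_enabled = False

    def trap_sti(self):
        # Guest re-enabled interrupts: flush everything that was buffered.
        self.interrupts_enabled = True
        while self.pending:
            self.delivered.append(self.pending.popleft())

    def raise_interrupt(self, irq):
        if self.interrupts_enabled:
            self.delivered.append(irq)
        else:
            self.pending.append(irq)


vm_a, vm_b = VirtualCPU("A"), VirtualCPU("B")
vm_a.trap_cli()
vm_a.raise_interrupt("timer")   # buffered: A has interrupts off
vm_b.raise_interrupt("disk")    # delivered at once: B never disabled them
buffered_before = list(vm_a.pending)
vm_a.trap_sti()                 # A now receives the buffered timer interrupt
```

The physical interrupt flag is never touched on behalf of one guest; only the per-VM record changes, which is why other VMs keep receiving their interrupts.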

11. Binary translation: making full virtualization practical
• Use binary translation to rewrite the OS instructions that would fail silently
• More aggressive translation can be used
  – Translate OS mode instructions to equivalent VMM instructions
• Some operations still expensive
• Cache translations for future use
• Used by VMware ESXi and Microsoft Virtual Server
• Performance on x86 typically ~80-95% of native

12. Binary Translation Example

Guest OS Assembly            Translated Assembly
do_atomic_operation:         do_atomic_operation:
  cli                          call [vmm_disable_interrupts]
  mov eax, 1                   mov eax, 1
  xchg eax, [lock_addr]        xchg eax, [lock_addr]
  test eax, eax                test eax, eax
  jnz spinlock                 jnz spinlock
  …                            …
  …                            …
  mov [lock_addr], 0           mov [lock_addr], 0
  sti                          call [vmm_enable_interrupts]
  ret                          ret
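
A toy version of the rewrite in this example, together with the translation cache mentioned on the previous slide, might look like the following Python sketch. It works on instruction strings rather than machine code, and `REWRITES`, `translation_cache`, and `translate_block` are hypothetical names chosen for illustration.

```python
# Hypothetical mapping from sensitive instructions to VMM entry points,
# mirroring the cli/sti rewrite in the slide's example.
REWRITES = {
    "cli": "call [vmm_disable_interrupts]",
    "sti": "call [vmm_enable_interrupts]",
}

translation_cache = {}  # block label -> translated instruction list


def translate_block(label, instructions):
    """Translate one guest code block, reusing a cached translation if present."""
    if label in translation_cache:
        return translation_cache[label]
    translated = [REWRITES.get(insn.strip(), insn.strip()) for insn in instructions]
    translation_cache[label] = translated
    return translated


guest_block = [
    "cli",
    "mov eax, 1",
    "xchg eax, [lock_addr]",
    "mov [lock_addr], 0",
    "sti",
    "ret",
]
first = translate_block("do_atomic_operation", guest_block)
second = translate_block("do_atomic_operation", guest_block)  # cache hit
```

Only the sensitive instructions change; everything else is copied through, which is why most translated code runs at close to native speed.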

13. Paravirtualization
• Modify the OS to make it aware of the hypervisor
  – Can avoid the tricky features
  – Aware of the fact it is virtualized
• Can implement optimizations
• Comparison to binary translation?
• Amount of code change?
  – 1.36% of Linux, 0.04% for Windows

14. Hardware supported virtualization (Intel VT-x, AMD-V)
• Hardware support for virtualization
• Makes implementing VMMs much simpler
• Streamlines communication between VM and OS
• Removes the need for paravirtualization/binary translation
• EPT: hardware support for nested page tables, removing the need for software shadow page tables
• More later…

15. NUTS AND BOLTS

16. What needs to be done?
• Virtualize hardware
  – Memory hierarchy
  – CPUs
  – Devices
• Implement data and control transfer between guests and hypervisor
• We’ll cover this by example – Xen paper
  – Slides modified from presentation by Jianmin Chen

17. Xen
• Design principles:
  – Unmodified applications: essential
  – Full-blown multi-task OSes: essential
  – Paravirtualization: necessary for performance and isolation

18. Xen

19. Implementation summary

20. Xen VM interface: Memory
• Memory management
  – Guest cannot install highest privilege level segment descriptors; top end of linear address space is not accessible
  – Guest has direct (not trapped) read access to hardware page tables; writes are trapped and handled by the VMM
  – Physical memory presented to guest is not necessarily contiguous

21. Two Layers of Virtual Memory
[Figure: three views of RAM, each holding Pages 0 to 3 in a different order. Guest App’s View of RAM spans 0x00 to 0xFF, Guest OS’s View of RAM spans 0x0000 to 0xFFFF, and Host OS’s View of RAM spans 0x00000000 to 0xFFFFFFFF. The virtual address → physical address mapping is known to the guest OS; the physical address → machine address mapping is unknown to the guest OS.]

22. Guest’s Page Tables Are Invalid
• Guest OS page tables map virtual page numbers (VPNs) to physical frame numbers (PFNs)
• Problem: the guest is virtualized, so it doesn’t actually know the true PFNs
  – The true location is the machine frame number (MFN)
  – MFNs are known to the VMM and the host OS
• Guest page tables cannot be installed in cr3
  – They map VPNs to PFNs, but the PFNs are incorrect
• How can the MMU translate addresses used by the guest (VPNs) to MFNs?

23. Shadow Page Tables
• Solution: VMM creates shadow page tables that map VPN → MFN (as opposed to VPN → PFN)

Guest Page Table (maintained by the guest OS; invalid for the MMU)
VPN      PFN
00 (0)   01 (1)
01 (1)   10 (2)
10 (2)   11 (3)
11 (3)   00 (0)

Shadow Page Table (maintained by the VMM; valid for the MMU)
VPN      MFN
00 (0)   10 (2)
01 (1)   11 (3)
10 (2)   00 (0)
11 (3)   01 (1)

[Figure: Virtual Memory, Physical Memory, and Machine Memory, each drawn as Pages 0 to 3 at offsets 0, 16, 32, and 48, ending at 64.]
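
One way to see where the shadow entries come from: the VMM composes the guest's VPN → PFN table with its own PFN → MFN map. The Python sketch below uses the VPN → PFN and VPN → MFN values shown on this slide; the PFN → MFN map `p2m` is not shown on the slide and is inferred here from those two tables, and `build_shadow` is a hypothetical helper name.

```python
def build_shadow(guest_pt, p2m):
    """Compose VPN -> PFN (guest) with PFN -> MFN (VMM) into VPN -> MFN."""
    return {vpn: p2m[pfn] for vpn, pfn in guest_pt.items()}


# VPN -> PFN, as maintained by the guest OS on the slide.
guest_pt = {0: 1, 1: 2, 2: 3, 3: 0}

# PFN -> MFN, known only to the VMM (inferred, not shown on the slide).
p2m = {0: 1, 1: 2, 2: 3, 3: 0}

shadow_pt = build_shadow(guest_pt, p2m)
```

The result `shadow_pt` is the table the MMU can actually use, because every entry names a real machine frame.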

24. Building Shadow Tables
• Problem: how can the VMM maintain consistent shadow page tables?
  – The guest OS may modify its page tables at any time
  – Modifying the tables is a simple memory write, not a privileged instruction
• Thus, no helpful CPU exceptions :(
• Solution: mark the hardware pages containing the guest’s tables as read-only
  – If the guest updates a table, an exception is generated
  – VMM catches the exception, examines the faulting write, updates the shadow table
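
The write-protect scheme above can be modelled as follows. This is a simplified Python sketch, not how any particular VMM is written: the fault is simulated by routing every guest page-table write through `guest_write_pte`, and the class name, method names, and example frame numbers are all hypothetical.

```python
class ShadowingVMM:
    """Keeps a shadow table in sync by trapping guest page-table writes."""

    def __init__(self, guest_pt, p2m):
        self.guest_pt = dict(guest_pt)   # VPN -> PFN, owned by the guest
        self.p2m = dict(p2m)             # PFN -> MFN, owned by the VMM
        self.shadow_pt = {v: self.p2m[p] for v, p in self.guest_pt.items()}
        self.faults = 0

    def guest_write_pte(self, vpn, pfn):
        # The page holding the guest table is read-only, so every guest
        # write faults into the VMM instead of completing directly.
        self.faults += 1
        self._handle_write_fault(vpn, pfn)

    def _handle_write_fault(self, vpn, pfn):
        if pfn not in self.p2m:
            raise PermissionError(f"guest does not own PFN {pfn}")
        self.guest_pt[vpn] = pfn             # emulate the faulting write
        self.shadow_pt[vpn] = self.p2m[pfn]  # keep the shadow consistent


vmm = ShadowingVMM(guest_pt={0: 1, 1: 2}, p2m={0: 5, 1: 7, 2: 9})
vmm.guest_write_pte(0, 2)   # guest remaps VPN 0 to PFN 2
```

Because the VMM sees every write, it can also reject mappings to frames the guest does not own, which is where the isolation guarantee comes from.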

25. More VMM Tricks
• The VMM can play tricks with virtual memory just like an OS can
• Ballooning:
  – The VMM can page parts of a guest, or even an entire guest, to disk
  – A guest can be written to disk and brought back online on a different machine!
• Deduplication:
  – The VMM can share read-only pages between guests
  – Example: two guests both running Windows XP
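
Deduplication can be sketched as below. The slide does not say how identical pages are found; content hashing with a byte-for-byte check is assumed here, and `deduplicate` and the sample page contents are made up for illustration.

```python
import hashlib


def deduplicate(guest_pages):
    """Map each (guest, page number) to a machine frame, sharing equal content."""
    frames = []            # machine frames actually stored
    index_by_digest = {}   # content digest -> index into frames
    mapping = {}           # (guest, page number) -> index into frames
    for key, content in guest_pages.items():
        digest = hashlib.sha256(content).digest()
        idx = index_by_digest.get(digest)
        # Compare the bytes as well, so a digest collision never merges pages.
        if idx is None or frames[idx] != content:
            frames.append(content)
            idx = len(frames) - 1
            index_by_digest[digest] = idx
        mapping[key] = idx
    return frames, mapping


pages = {
    ("vm1", 0): b"kernel" * 4,   # same read-only page in both guests
    ("vm2", 0): b"kernel" * 4,
    ("vm1", 1): b"vm1 data",
    ("vm2", 1): b"vm2 data",
}
frames, mapping = deduplicate(pages)
```

Sharing is safe here only because the pages are read-only; a writable shared page would need to be copied on the first write.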

26. Xen VM interface: CPU
• CPU
  – Guest runs at lower privilege than VMM
  – Exception handlers must be registered with VMM
  – Fast system call handler can be serviced without trapping to VMM
  – Hardware interrupts replaced by lightweight event notification system
  – Timer interface: both real and virtual time

27. Details: CPU
• Frequent exceptions:
  – Software interrupts for system calls
  – Page faults
• Allow the guest to register a “fast” exception handler for system calls that the CPU can invoke directly in ring 1, without switching to ring 0/Xen
  – The handler is validated before it is installed in the hardware exception table, to make sure nothing executes with ring 0 privilege
  – Doesn’t work for page faults

28. Xen VM interface: I/O
• I/O
  – Virtual devices exposed as asynchronous I/O rings to guests
  – Event notification replaces interrupts
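
An asynchronous I/O ring is essentially a bounded producer/consumer buffer in shared memory. The Python sketch below shows only the request direction and is not Xen's actual ring layout; `IORing`, `req_prod`, `req_cons`, and the request strings are illustrative assumptions.

```python
class IORing:
    """Fixed-size ring shared by a guest (producer) and the VMM (consumer)."""

    def __init__(self, size):
        self.slots = [None] * size
        self.size = size
        self.req_prod = 0   # advanced by the guest when it queues a request
        self.req_cons = 0   # advanced by the VMM when it takes a request

    def put_request(self, request):
        if self.req_prod - self.req_cons == self.size:
            return False                    # ring full, guest must retry later
        self.slots[self.req_prod % self.size] = request
        self.req_prod += 1
        return True

    def get_request(self):
        if self.req_cons == self.req_prod:
            return None                     # nothing pending
        request = self.slots[self.req_cons % self.size]
        self.req_cons += 1
        return request


ring = IORing(size=2)
accepted = [ring.put_request(r) for r in ("read blk 7", "write blk 9", "read blk 1")]
served = [ring.get_request(), ring.get_request(), ring.get_request()]
```

Each side writes only its own index, so the guest can queue several requests without waiting for the VMM, which is what makes the interface asynchronous.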

29. Details: I/O 1
• Xen does not emulate hardware devices
  – Exposes device abstractions for simplicity and performance
  – I/O data transferred to/from guest via Xen using shared-memory buffers
  – Virtualized interrupts: lightweight event delivery mechanism from Xen to the guest
    • Update a bitmap in shared memory
    • Optional call-back handlers registered by the OS
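
The event delivery mechanism (a pending bitmap plus optional callbacks) can be sketched in Python as follows. The integer `pending` stands in for the shared-memory bitmap, and `EventChannel`, `register`, `notify`, and `dispatch` are hypothetical names rather than Xen's interface.

```python
class EventChannel:
    """Pending-event bitmap in (simulated) shared memory plus guest callbacks."""

    def __init__(self):
        self.pending = 0     # one bit per event port
        self.handlers = {}   # port -> callback registered by the guest OS

    def register(self, port, handler):
        self.handlers[port] = handler

    def notify(self, port):
        # VMM side: delivering an event is just setting a bit.
        self.pending |= 1 << port

    def dispatch(self):
        # Guest side: run the callback for every set bit, then clear it.
        port = 0
        while self.pending >> port:
            if self.pending & (1 << port):
                self.pending &= ~(1 << port)
                handler = self.handlers.get(port)
                if handler is not None:
                    handler(port)
            port += 1


log = []
chan = EventChannel()
chan.register(0, lambda port: log.append(("net", port)))
chan.register(2, lambda port: log.append(("disk", port)))
chan.notify(2)
chan.notify(0)
chan.notify(2)   # a repeated event on a pending port collapses into one bit
chan.dispatch()
```

Because a bit is either set or not, repeated notifications coalesce; the handler is expected to check the associated ring for however much work is waiting.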