Lei Zhang: Containerd ShimV2 + KataContainers as Kubernetes Runtime
1 .Containerd ShimV2 + KataContainers as Kubernetes Runtime Lei Zhang, Kubernetes Community
4 . Kubernetes Control Plane
[Diagram: api-server (pod and node list) backed by Etcd; Workloads, Scheduling, and Orchestration components; scheduling binds pods to nodes; three Nodes, each running a kubelet and several containers (C).]
5 . Kubernetes + containerd
[Diagram: kubelet -> containerd -> runC -> clone(), setns(), pivot_root() -> Linux Kernel; containers (C) run on the Node.]
6 . Linux Container
• Container Runtime
  • The dynamic view and boundary of your running process
  • Namespace + Cgroups
• Container Image
  • The static view of your program, data, dependencies, files and directories
  • rootfs
[Diagram: a Dockerfile (FROM busybox, ADD temp.txt /, VOLUME /data, CMD ["echo hello"]) mapped onto the container root filesystem (/bin, /dev, /etc, /home, /lib, /lib64, /media, /mnt, /opt, /proc, /root, /run, /sbin, /sys, /tmp, /usr, /var, /data, /temp.txt). The read-only layer holds the image content, the init layer holds /etc/hosts, /etc/hostname and /etc/resolv.conf, and the read-write layer plus /data hold what the running "echo hello" process writes.]
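The layering on this slide can be sketched in a few lines of Python. This is only a toy model, assuming each layer can be treated as a path-to-content dictionary; the file contents and the `read`/`write` helpers are hypothetical, and real runtimes use a union filesystem such as overlayfs rather than anything like this.

```python
from collections import ChainMap

# Hypothetical in-memory stand-ins for the three layers on the slide.
read_only_layer = {            # image content: FROM busybox, ADD temp.txt /
    "/bin/sh": "busybox shell",
    "/temp.txt": "from image",
    "/etc/hosts": "image default",
}
init_layer = {                 # per-container files injected by the runtime
    "/etc/hosts": "127.0.0.1 localhost",
    "/etc/hostname": "demo",
    "/etc/resolv.conf": "nameserver 10.0.0.1",
}
read_write_layer = {}          # starts empty; receives every container write

# Lookups walk the maps left to right, so upper layers shadow lower ones.
rootfs = ChainMap(read_write_layer, init_layer, read_only_layer)


def read(path):
    """Return the content visible to the container at `path`."""
    return rootfs[path]


def write(path, data):
    """ChainMap writes always land in the first map: the read-write layer."""
    rootfs[path] = data


write("/temp.txt", "changed in container")
```

After the write, the container sees the new `/temp.txt`, while the copy in the read-only layer is untouched, which is why many containers can share one image.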
7 . KataContainers
• Container Runtime
  • Each Pod is hypervisor isolated
  • Independent guest kernel
  • Secure as VM
  • Fast as container
• Container Image
  • Same as Linux container
8 . Container Security
• Linux container
  • Dropping Linux capabilities
  • Read-only mount points
  • Mandatory access controls (MAC)
    • SELinux & AppArmor
  • Dropping syscalls
    • SECCOMP
  • In 99.99% cases
    • wrap containers in VMs
• KataContainers
  • Hardware virtualization
  • Independent Linux instance per Pod
    • e.g. run Linux 3.16 container on a Linux 4.0 host
9 . Kubernetes + KataContainers
[Diagram: kubelet -> ??? -> KataContainers -> virtualization -> Linux Kernel; VMs run on the Node.]
10 . Container Runtime Interface (CRI)
• Describes what kubelet expects from container runtimes
• Imperative container-centric interface
  • Why not pod-centric?
    • Every container runtime implementation would need to understand the concept of a pod.
    • The interface would have to change whenever a new pod-level feature is proposed.
11 . CRI Spec
• Sandbox
  • How to isolate Pod environment?
  • Linux container: infra container + pod-level cgroups
  • Kata: lightweight VM
• Container
  • Linux container: namespace + cgroups
  • Kata: namespace containers controlled by hyperstart
12 . How CRI Works
[Diagram: the api-server (pod and node list, backed by Etcd) and the management components (Workloads, Scheduling, Orchestration) bind a pod to a node. In kubelet, SyncLoop -> SyncPod -> GenericRuntime issues CRI gRPC calls, either through the client API to dockershim -> docker, or through remote (no-op) to a CRI shim -> Container Runtime.]
• CRI Spec
  • Sandbox: Create, Delete, List
  • Container: Create, Start, Exec
  • Image: Pull, List
13 . How CRI shim works
$ kubectl run foo …
1. RunPodSandbox(foo), which also triggers CNI add()
2. CreateContainer(A)
3. StartContainer(A)
4. CreateContainer(B)
5. StartContainer(B)
[Diagram: on the NODE, a docker runtime runs Pod foo as containers A and B, while a VM runtime runs foo as a VM holding A and B.]
Container state: null -> CreateContainer() -> Created -> StartContainer() -> Running -> StopContainer() -> Exited -> RemoveContainer() -> null
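The call sequence and the container state machine above can be sketched as a small Python model. `ToyShim` is a hypothetical in-memory stand-in, not a real CRI client: it starts no sandbox or container, and it only allows the linear path drawn on the slide, which is likely stricter than what real runtimes accept.

```python
from enum import Enum


class State(Enum):
    NULL = "null"
    CREATED = "Created"
    RUNNING = "Running"
    EXITED = "Exited"


# Allowed transitions, keyed by (CRI call, current state), as drawn on the slide.
TRANSITIONS = {
    ("CreateContainer", State.NULL): State.CREATED,
    ("StartContainer", State.CREATED): State.RUNNING,
    ("StopContainer", State.RUNNING): State.EXITED,
    ("RemoveContainer", State.EXITED): State.NULL,
}


class ToyShim:
    """Hypothetical in-memory CRI shim used only to trace the call order."""

    def __init__(self):
        self.sandboxes = {}   # sandbox name -> {container name: State}
        self.log = []         # calls in the order they were received

    def run_pod_sandbox(self, pod):
        self.sandboxes[pod] = {}
        self.log.append(f"RunPodSandbox({pod})")

    def call(self, op, pod, container):
        containers = self.sandboxes[pod]
        current = containers.get(container, State.NULL)
        if (op, current) not in TRANSITIONS:
            raise ValueError(f"{op} is not valid in state {current.value}")
        containers[container] = TRANSITIONS[(op, current)]
        self.log.append(f"{op}({container})")
        return containers[container]


# What kubelet does for `kubectl run foo` with containers A and B.
shim = ToyShim()
shim.run_pod_sandbox("foo")
for name in ("A", "B"):
    shim.call("CreateContainer", "foo", name)
    shim.call("StartContainer", "foo", name)
```

The point of the sketch is that kubelet drives the same calls regardless of runtime; only what `run_pod_sandbox` does behind the interface (an infra container or a lightweight VM) differs.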
14 . Wrap Up
[Diagram: kubelet -> CRI -> CRI shim ("Do your work here!") -> KataContainers -> syscall -> Linux Kernel; containers (C) run on the Node.]
15 .Use containerd/cri as CRI shim
16 . But …
• Too many containerd-shim processes, large resource footprint
• CRI is a well-defined interface for Kubernetes to consume, not for runtimes
• gVisor/KataContainers/VM
  • Do not match the existing CRI shims
• Maintenance "nightmare"
  • e.g. cri-o vs. cri-containerd + gVisor/KataContainers/VM-based runtimes, oh my …
17 . Containerd ShimV2
• A "standard interface" between CRI shim and container runtime!
  • CRI -> containerd -> OCI runtime
  • CRI -> containerd -> shimV2 -> OCI runtime
18 . What's the difference?
• Previous:
  • Call `containerd-shim`
  • This will start a shim process per container
• Now:
  • Call `containerd-shim start`
  • Implement "start" operation as you wish:
    • Start containerd-shim when creating sandbox
    • Reuse existing containerd-shim when creating container
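The effect on the number of shim processes can be illustrated with a short Python sketch. `ShimManager`, its `start` method, and the pod layout are hypothetical names chosen for illustration; this is not containerd's API, only a count of how many shims each policy would end up with.

```python
class ShimManager:
    """Toy model of how many shim processes a node ends up running."""

    def __init__(self, reuse_per_sandbox):
        # False: previous behaviour, one shim per container.
        # True: a shimV2-style "start" that reuses the sandbox's shim.
        self.reuse_per_sandbox = reuse_per_sandbox
        self.shims = {}   # shim key -> task ids served by that shim

    def start(self, sandbox_id, task_id):
        # The "start" operation decides whether to spawn or reuse a shim.
        key = sandbox_id if self.reuse_per_sandbox else task_id
        self.shims.setdefault(key, []).append(task_id)
        return key


# Hypothetical workload: two pods, each with a sandbox task plus app containers.
pods = {
    "foo": ["foo-sandbox", "A", "B"],
    "bar": ["bar-sandbox", "C"],
}

previous = ShimManager(reuse_per_sandbox=False)
shim_v2 = ShimManager(reuse_per_sandbox=True)
for sandbox_id, tasks in pods.items():
    for task_id in tasks:
        previous.start(sandbox_id, task_id)
        shim_v2.start(sandbox_id, task_id)

print(len(previous.shims), "shims before,", len(shim_v2.shims), "shims with reuse")
```

In this toy workload the per-container policy needs five shims, while reusing the shim per sandbox needs two, one for each pod.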
19 .Containerd + ShimV2 + KataContainers
20 . Containerd ShimV2
[Diagram: kubelet -> cri-containerd -> kata-containerd-shimv2 ("Do your work here!") -> KataContainers -> virtualization -> Linux Kernel; VMs run on the Node.]
22 . Live Demo
• Kubernetes + containerd + shimV2 + KataContainers
1. kubeadm installed, 3-node cluster on GCE, nested virtualization
2. Pod lifecycle
3. Independent kernel
   1. No kernel sharing with host
4. Strong isolation
   1. e.g. forkbomb
5. High density, small footprint
   1. 100 KataContainers in one GCE Node in 2 minutes
23 . Real Case
• 1.5 Engineers + 1 GSoC student
• Pull Request
  • https://github.com/kata-containers/runtime/pull/572
  • Expected to be merged in the next 2 weeks
24 . Read Our Story GSoC 18: Kata Containers support for containerd
25 .Thank You! Lei Zhang