An Adventure in Conformance Testing

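What is Kubernetes, really? Documentation cannot cover everything, but running code is what counts. So our community defines Kubernetes through conformance tests, which exercise the functionality expected of any given Kubernetes cluster. Last year, the CNCF introduced the Certified Kubernetes Conformance Program, which uses these tests to verify that a given Kubernetes distribution or platform behaves as expected. More than 60 vendors have certified their Kubernetes offerings. What have we learned? This talk gives an overview of the progress made over the past year in improving the fidelity of conformance tests, and of how these tests are being integrated into upstream and downstream development. Attendees should walk away knowing what guarantees conformance tests provide, how they can use those guarantees, and how they can help.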

1.Adventures in Conformance Aaron Crickenberger, Google @spiffxp

2.Note When I say “We” I mean “We the Kubernetes Community”

3. Cross the Chasm
Primordial soup of container orchestration implementations
Achieve ubiquity to become a de-facto standard
Prioritize moving fast, intentionally take on technical debt
- One giant repo, development practices that aren’t friendly to newcomers
- Tie directly to implementations, e.g. etcd, docker
- Avoid clean interfaces, e.g. cloud provider code

4. Make it Boring
Define extension points, extract functionality
- The “core”/“in-tree” functionality should use the same interfaces as everyone else
- We now have CNI, CRI, CSI
- We are extracting cloud provider functionality out
Focus on stability and reliability
Focus on portability of workloads

5. Keep it Boring
Cluster Operators: Is my cluster a Certified Kubernetes cluster?
Application Developers: Will my workload actually be portable?
Kubernetes Developers: Is this feature ready to be considered GA?
CNCF Kubernetes Certification Program
Source: https://github.com/cncf/k8s-conformance

6.Keep it Boring Source: CNCF

7.What is Kubernetes Source: https://github.com/kubernetes/community/tree/master/icons

8.What is Kubernetes Source: https://github.com/kubernetes/community/tree/master/icons

9. What is Kubernetes
It turns out we don’t have a spec or written standard
We care about behaviors visible to the end user
That work on any given conformant Kubernetes cluster

10.What is Kubernetes Source: https://github.com/kubernetes/community/blob/master/contributors/devel/architectural-roadmap.md

11.What is Kubernetes Source: https://github.com/kubernetes/community/blob/master/contributors/devel/architectural-roadmap.md

12. Improving Conformance
“OK, we use it, but it barely covers any functionality”
- What does that mean? Is that true?
Let’s add tests to exercise more functionality
- Focus on “Pod” functionality
- Ensure the definition of conformance is upstream
Let’s measure proxies for “functionality” coverage
- What API endpoints are covered when the tests are run?
- What code is covered when the tests are run?

13. Add Tests
Write an e2e test
Document the e2e test
Demonstrate that it meets these requirements*
- Tests only GA, non-optional features or APIs
- Works for all providers
- Is non-privileged
- Works without public internet access
- Binaries used are required for Linux kernel or kubelet to run
- Images used support all architectures for which Kubernetes releases are built
- Passes against versions of Kubernetes consistent with version skew policy
- Provides consistent results without flakes
Propose to SIG Architecture that the e2e test be promoted to Conformance

14.Add Tests Source: github.com/spiffxp/adventures-in-k8s-conformance

15. Add Tests - 1.9 to 1.10
[sig-api-machinery] Garbage collector should delete RS created by deployment when not orphaning [Conformance]
[sig-api-machinery] Garbage collector should delete pods created by rc when not orphaning [Conformance]
[sig-api-machinery] Garbage collector should keep the rc around until all its pods are deleted if the deleteOptions says so [Conformance]
[sig-api-machinery] Garbage collector should not be blocked by dependency circle [Conformance]
[sig-api-machinery] Garbage collector should not delete dependents that have both valid owner and owner that's waiting for dependents to be deleted [Conformance]
[sig-api-machinery] Garbage collector should orphan RS created by deployment when deleteOptions.PropagationPolicy is Orphan [Conformance]
[sig-api-machinery] Garbage collector should orphan pods created by rc if delete options say so [Conformance]
[sig-apps] Daemon set [Serial] should retry creating failed daemon pods [Conformance]
[sig-apps] Daemon set [Serial] should rollback without unnecessary restarts [Conformance]
[sig-apps] Daemon set [Serial] should run and stop complex daemon [Conformance]
[sig-apps] Daemon set [Serial] should run and stop simple daemon [Conformance]
[sig-apps] Daemon set [Serial] should update pod when spec was updated and update strategy is RollingUpdate [Conformance]
[sig-apps] StatefulSet [k8s.io] Basic StatefulSet functionality [StatefulSetBasic] Burst scaling should run to completion even with unhealthy pods [Conformance]
[sig-apps] StatefulSet [k8s.io] Basic StatefulSet functionality [StatefulSetBasic] Scaling should happen in predictable order and halt if any stateful pod is unhealthy [Conformance]
[sig-apps] StatefulSet [k8s.io] Basic StatefulSet functionality [StatefulSetBasic] Should recreate evicted statefulset [Conformance]
[sig-apps] StatefulSet [k8s.io] Basic StatefulSet functionality [StatefulSetBasic] should perform canary updates and phased rolling updates of template modifications [Conformance]
[sig-apps] StatefulSet [k8s.io] Basic StatefulSet functionality [StatefulSetBasic] should perform rolling updates and roll backs of template modifications [Conformance]

16. Add Tests - 1.10 to 1.11
[sig-api-machinery] Watchers should be able to restart watching from the last resource version observed by the previous watch [Conformance]
[sig-api-machinery] Watchers should be able to start watching from a specific resource version [Conformance]
[sig-api-machinery] Watchers should observe add, update, and delete watch notifications on configmaps [Conformance]
[sig-api-machinery] Watchers should observe an object deletion if it stops meeting the requirements of the selector [Conformance]

17. Add Tests - 1.11 to 1.12
[k8s.io] Container Lifecycle Hook when create a pod with lifecycle hook should execute poststart exec hook properly [NodeConformance] [Conformance]
[k8s.io] Container Lifecycle Hook when create a pod with lifecycle hook should execute poststart http hook properly [NodeConformance] [Conformance]
[k8s.io] Container Lifecycle Hook when create a pod with lifecycle hook should execute prestop exec hook properly [NodeConformance] [Conformance]
[k8s.io] Container Lifecycle Hook when create a pod with lifecycle hook should execute prestop http hook properly [NodeConformance] [Conformance]
[k8s.io] InitContainer [NodeConformance] should invoke init containers on a RestartAlways pod [Conformance]
[k8s.io] InitContainer [NodeConformance] should invoke init containers on a RestartNever pod [Conformance]
[k8s.io] InitContainer [NodeConformance] should not start app containers and fail the pod if init containers fail on a RestartNever pod [Conformance]
[k8s.io] InitContainer [NodeConformance] should not start app containers if init containers fail on a RestartAlways pod [Conformance]
[sig-api-machinery] Namespaces [Serial] should ensure that all pods are removed when a namespace is deleted [Conformance]
[sig-api-machinery] Namespaces [Serial] should ensure that all services are removed when a namespace is deleted [Conformance]
[sig-apps] Deployment RecreateDeployment should delete old pods and create new ones [Conformance]
[sig-apps] Deployment RollingUpdateDeployment should delete old pods and create new ones [Conformance]
[sig-apps] Deployment deployment should delete old replica sets [Conformance]
[sig-apps] Deployment deployment should support proportional scaling [Conformance]
[sig-apps] Deployment deployment should support rollover [Conformance]
[sig-storage] ConfigMap binary data should be reflected in volume [NodeConformance] [Conformance]
[sig-storage] Secrets should be able to mount in a volume regardless of a different secret existing with same name in different namespace [NodeConformance] [Conformance]
[sig-storage] Subpath Atomic writer volumes should support subpaths with configmap pod [Conformance]
[sig-storage] Subpath Atomic writer volumes should support subpaths with configmap pod with mountPath of existing file [Conformance]
[sig-storage] Subpath Atomic writer volumes should support subpaths with downward pod [Conformance]
[sig-storage] Subpath Atomic writer volumes should support subpaths with projected pod [Conformance]
[sig-storage] Subpath Atomic writer volumes should support subpaths with secret pod [Conformance]

18. Add Tests - 1.12 to 1.13
[k8s.io] Container Runtime blackbox test when starting a container that exits should run with the expected status [NodeConformance] [Conformance]
[k8s.io] Kubelet when scheduling a busybox Pod with hostAliases should write entries to /etc/hosts [NodeConformance] [Conformance]
[k8s.io] Kubelet when scheduling a busybox command in a pod should print the output to logs [NodeConformance] [Conformance]
[k8s.io] Kubelet when scheduling a busybox command that always fails in a pod should be possible to delete [NodeConformance] [Conformance]
[k8s.io] Kubelet when scheduling a busybox command that always fails in a pod should have an terminated reason [NodeConformance] [Conformance]
[k8s.io] Kubelet when scheduling a read only busybox container should not write to root filesystem [NodeConformance] [Conformance]
[k8s.io] Pods should support remote command execution over websockets [NodeConformance] [Conformance]
[k8s.io] Pods should support retrieving logs from the container over websockets [NodeConformance] [Conformance]
[sig-apps] ReplicaSet should adopt matching pods on creation and release no longer matching pods [Conformance]
[sig-apps] ReplicationController should adopt matching pods on creation [Conformance]
[sig-apps] ReplicationController should release no longer matching pods [Conformance]
[sig-cli] Kubectl client [k8s.io] Guestbook application should create and stop a working application [Conformance]
[sig-cli] Kubectl client [k8s.io] Kubectl api-versions should check if v1 is in available api versions [Conformance]
[sig-cli] Kubectl client [k8s.io] Kubectl cluster-info should check if Kubernetes master services is included in cluster-info [Conformance]
[sig-cli] Kubectl client [k8s.io] Kubectl describe should check if kubectl describe prints relevant information for rc and pods [Conformance]
[sig-cli] Kubectl client [k8s.io] Kubectl expose should create services for rc [Conformance]
[sig-cli] Kubectl client [k8s.io] Kubectl label should update the label on a resource [Conformance]
[sig-cli] Kubectl client [k8s.io] Kubectl logs should be able to retrieve and filter logs [Conformance]
[sig-cli] Kubectl client [k8s.io] Kubectl patch should add annotations for pods in rc [Conformance]
[sig-cli] Kubectl client [k8s.io] Kubectl replace should update a single-container pod's image [Conformance]
[sig-cli] Kubectl client [k8s.io] Kubectl rolling-update should support rolling-update to same image [Conformance]
[sig-cli] Kubectl client [k8s.io] Kubectl run --rm job should create a job from an image, then delete the job [Conformance]
[sig-cli] Kubectl client [k8s.io] Kubectl run default should create an rc or deployment from an image [Conformance]
[sig-cli] Kubectl client [k8s.io] Kubectl run deployment should create a deployment from an image [Conformance]
[sig-cli] Kubectl client [k8s.io] Kubectl run job should create a job from an image when restart is OnFailure [Conformance]
[sig-cli] Kubectl client [k8s.io] Kubectl run pod should create a pod from an image when restart is Never [Conformance]
[sig-cli] Kubectl client [k8s.io] Kubectl run rc should create an rc from an image [Conformance]
[sig-cli] Kubectl client [k8s.io] Kubectl version should check is all data is printed [Conformance]
[sig-cli] Kubectl client [k8s.io] Proxy server should support --unix-socket=/path [Conformance]
[sig-cli] Kubectl client [k8s.io] Proxy server should support proxy with --port 0 [Conformance]
[sig-cli] Kubectl client [k8s.io] Update Demo should create and stop a replication controller [Conformance]
[sig-cli] Kubectl client [k8s.io] Update Demo should do a rolling update of a replication controller [Conformance]
[sig-cli] Kubectl client [k8s.io] Update Demo should scale a replication controller [Conformance]
[sig-storage] EmptyDir wrapper volumes should not cause race condition when used for configmaps [Serial] [Slow] [Conformance]
[sig-storage] EmptyDir wrapper volumes should not conflict [Conformance]
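The per-release lists in slides 15 to 18 amount to a set difference over test names. The sketch below is one possible way to compute such a list, assuming you already have the full test names for two releases as plain strings. The helper names `conformance_tests` and `added_between` are hypothetical, as is the un-promoted test name in the sample data; this is not the tooling used to produce the slides.

```python
def conformance_tests(names):
    """Keep only names tagged [Conformance], with surrounding whitespace removed."""
    return {n.strip() for n in names if "[Conformance]" in n}


def added_between(old_names, new_names):
    """Return conformance tests present in the newer release but not the older one."""
    return sorted(conformance_tests(new_names) - conformance_tests(old_names))


# Sample data: two real names from the slides plus one hypothetical,
# un-promoted test that should be ignored because it lacks [Conformance].
older = [
    "[sig-apps] Daemon set [Serial] should run and stop simple daemon [Conformance]",
]
newer = older + [
    "[sig-api-machinery] Watchers should be able to start watching from a specific resource version [Conformance]",
    "[sig-storage] hypothetical test that has not been promoted",
]

added = added_between(older, newer)
for name in added:
    print(name)
```

Comparing full names means a renamed test shows up as one removal and one addition, so a real comparison would likely need to account for renames.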

19.Add Tests Source: github.com/spiffxp/adventures-in-k8s-conformance

20. Add Tests - Next
Write more tests
Focus on Pod functionality
- Probes
- Storage
- Connectivity
- Pod lifecycle
- etc.
Workload APIs exercise this indirectly; we need more direct tests

21. Requirements - Next
Identify whether existing tests fit these requirements
- Tag functional requirements, e.g. [Privileged]
- Tag behavioral problems, e.g. [Flaky]
Identify further constraints or requirements
- Multiple nodes
- Mixed node clusters
Develop Profiles
- OS-specific functionality
- Optional features such as cloud provider
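Because requirements are expressed as bracketed tags in the test name, checking a candidate test against them can be sketched as simple string parsing. The example below is a minimal illustration, assuming tags always appear as `[Tag]` in the name. The `BLOCKING_TAGS` set and the helper names are assumptions made for this example, not an official list or tool, and the second sample name is hypothetical.

```python
import re

# Matches the text inside each pair of square brackets.
TAG_RE = re.compile(r"\[([^\]]+)\]")

# Assumed, illustrative set of tags that would argue against promotion;
# the slide names [Privileged] and [Flaky] as examples only.
BLOCKING_TAGS = {"Flaky", "Privileged"}


def tags(test_name):
    """Return every bracketed tag in the test name, in order of appearance."""
    return TAG_RE.findall(test_name)


def blocking_tags(test_name):
    """Return the sorted blocking tags found in the test name."""
    return sorted(set(tags(test_name)) & BLOCKING_TAGS)


promoted = ("[sig-storage] EmptyDir wrapper volumes should not cause race "
            "condition when used for configmaps [Serial] [Slow] [Conformance]")
candidate = "[sig-node] hypothetical test name [Flaky] [Privileged]"

print(tags(promoted))
print(blocking_tags(candidate))
```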

22.Getting Data Source: https://github.com/cncf/k8s-conformance/tree/master/v1.12/kube-up-gce

23.Getting Data Source: https://testgrid.k8s.io/conformance-all

24.Getting Data Source: https://testgrid.k8s.io/conformance-all#GCE,%20v1.13%20(dev)

25.Getting Data Source: https://gubernator.k8s.io/build/kubernetes-jenkins/logs/ci-kubernetes-gce-conformance-latest-1-13/27

26.Getting Data Source: http://gcsweb.k8s.io/gcs/kubernetes-jenkins/logs/ci-kubernetes-gce-conformance-latest-1-13/27/artifacts/

27. API Coverage
Use cncf/apisnoop
- Parse Kubernetes’ OpenAPI spec to find possible endpoints
- Endpoint = VERB path
- e.g. POST /api/v1/namespaces/{namespace}/pods
- Parse Kubernetes audit log to find endpoint hits
Questions we can answer:
- What are the total endpoints that can be hit?
- What are the actual endpoints that were hit?
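The approach on this slide can be sketched in a few lines. The example below is not apisnoop's implementation; it is a minimal illustration that assumes the OpenAPI spec is already loaded as a dict with a `paths` key, and that each audit event is a dict with `verb` and `requestURI` fields. The mapping from audit verbs to HTTP methods and all function names are assumptions made for this sketch.

```python
import re

# Assumed mapping from Kubernetes audit-log verbs to HTTP methods.
AUDIT_VERB_TO_METHOD = {
    "get": "GET", "list": "GET", "watch": "GET",
    "create": "POST", "update": "PUT", "patch": "PATCH",
    "delete": "DELETE", "deletecollection": "DELETE",
}
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options"}


def possible_endpoints(openapi_spec):
    """Collect (METHOD, path template) pairs from the spec's paths."""
    endpoints = set()
    for path, item in openapi_spec.get("paths", {}).items():
        for key in item:
            # Skip non-operation keys such as "parameters".
            if key.lower() in HTTP_METHODS:
                endpoints.add((key.upper(), path))
    return endpoints


def path_regex(template):
    """Turn a template like /pods/{name} into a regex; {param} matches one segment."""
    parts = re.split(r"\{[^}]+\}", template)
    return re.compile("^" + "[^/]+".join(re.escape(p) for p in parts) + "$")


def hit_endpoints(endpoints, audit_events):
    """Return the subset of endpoints matched by at least one audit event."""
    matchers = [(method, path, path_regex(path)) for method, path in endpoints]
    hits = set()
    for event in audit_events:
        method = AUDIT_VERB_TO_METHOD.get(event.get("verb"))
        uri = event.get("requestURI", "").split("?", 1)[0]  # drop query string
        for m, path, rx in matchers:
            if m == method and rx.match(uri):
                hits.add((m, path))
    return hits


# Tiny hand-written sample inputs, for illustration only.
spec = {"paths": {
    "/api/v1/namespaces/{namespace}/pods": {"get": {}, "post": {}, "parameters": []},
    "/api/v1/namespaces/{namespace}/pods/{name}": {"get": {}, "delete": {}},
}}
events = [
    {"verb": "create", "requestURI": "/api/v1/namespaces/default/pods"},
    {"verb": "get", "requestURI": "/api/v1/namespaces/default/pods/web-0?pretty=true"},
]

total = possible_endpoints(spec)
hit = hit_endpoints(total, events)
coverage = len(hit) / len(total)
print(f"{len(hit)} of {len(total)} endpoints hit ({coverage:.0%})")
```

A template parameter can also match a literal segment of another path, so this simple matching may over-count hits on a full spec; a real tool would need a stricter disambiguation rule.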

28.API Coverage - by % Source: github.com/spiffxp/adventures-in-k8s-conformance

29.API Coverage - by % Source: github.com/spiffxp/adventures-in-k8s-conformance