Understanding Windows Container Networking in Kubernetes Through a Real Story

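Enterprises hold a large estate of existing Windows applications in their IT environments. Windows container technology is new, and the Windows kernel differs considerably from the Linux kernel. Running Windows applications natively in a Kubernetes Windows container cluster is very challenging, especially in the networking area. In this talk, Cindy and Dinesh dig into Windows container networking. They compare the Windows container networking mechanisms with their Linux counterparts and draw visual analogies between them. Using a real-world case study, they also walk through an example implementation of a Kubernetes Windows CNI plugin and a network troubleshooting guide.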
1.Understanding Windows container networking in Kubernetes using a real story
  Cindy Xing, Huawei
  Dinesh Kumar Govindasamy, Microsoft

2.Our 35-Minute Journey: Step by Step

3.Goals
  Understand Windows container networking inside out
  How to author a Windows container CNI plugin
  How to troubleshoot Windows container network issues

4.Windows Container Architecture (diagram labels)
  Public Clouds: Azure, GCE, AWS, Huawei, etc.
  OnPrem Environments: Docker EE, OpenShift, Rancher, DIY Kubernetes
  Components: KubeCTL/REST Interface, Kubelet/KubeProxy, Docker Shim/ContainerD, CNI Plugin, HCSSHIM, Host Compute Service, Host Network Service
  Operating System: Control Groups / Job Objects; Namespaces (Object Namespace, Process Table, Networking); Layer Capabilities (Registry, union-like filesystem extensions); Other OS Functionality

5.Container Networking Basics

6.Windows Container Networking (diagram labels)
  VMSwitch with VFP: SNAT, DNAT, LBNAT; VXLAN ENCAP, DECAP; ACLs, QoS; Forwarding
  Namespaces: POD NS1, ROOT Namespace
  Adapters: Ethernet, vEthernet
  Addresses shown: 192.168.0.5 / 00:15:aa:bb:cc:dd; 192.168.0.6 / 00:15:aa:bb:cc:a1
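The slide lists VXLAN ENCAP and DECAP among the functions VFP performs. As a rough illustration of what that step involves, the sketch below builds and parses only the 8-byte VXLAN header defined in RFC 7348. The function names `Encap` and `Decap` and the VNI value are made up for this example, and the outer Ethernet, IP, and UDP headers that a real data path adds are left out. It is not how VFP itself is implemented.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// vxlanHeaderLen is the fixed size of a VXLAN header (RFC 7348).
const vxlanHeaderLen = 8

// vxlanFlagVNI is the "I" flag; it marks the VNI field as valid.
const vxlanFlagVNI = 0x08

// Encap prepends a VXLAN header carrying vni to an inner Ethernet frame.
// Hypothetical helper: outer Ethernet/IP/UDP headers are omitted.
func Encap(vni uint32, inner []byte) ([]byte, error) {
	if vni > 0xFFFFFF {
		return nil, errors.New("VNI must fit in 24 bits")
	}
	out := make([]byte, vxlanHeaderLen+len(inner))
	out[0] = vxlanFlagVNI
	// Bytes 4..6 hold the 24-bit VNI; byte 7 is reserved and stays zero.
	binary.BigEndian.PutUint32(out[4:8], vni<<8)
	copy(out[vxlanHeaderLen:], inner)
	return out, nil
}

// Decap validates the VXLAN header and returns the VNI and the inner frame.
func Decap(pkt []byte) (uint32, []byte, error) {
	if len(pkt) < vxlanHeaderLen {
		return 0, nil, errors.New("packet shorter than VXLAN header")
	}
	if pkt[0]&vxlanFlagVNI == 0 {
		return 0, nil, errors.New("VNI flag not set")
	}
	vni := binary.BigEndian.Uint32(pkt[4:8]) >> 8
	return vni, pkt[vxlanHeaderLen:], nil
}

func main() {
	// First bytes of an inner frame; the value is illustrative only.
	inner := []byte{0x00, 0x15, 0xaa, 0xbb, 0xcc, 0xdd}
	pkt, err := Encap(4096, inner) // 4096 is an arbitrary example VNI
	if err != nil {
		panic(err)
	}
	vni, payload, err := Decap(pkt)
	if err != nil {
		panic(err)
	}
	fmt.Printf("vni=%d header=% x payload=% x\n", vni, pkt[:vxlanHeaderLen], payload)
}
```

The VNI is what keeps overlapping pod address ranges on different virtual networks apart once the traffic shares one underlay.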

7.Virtual Filtering Platform - Overview
  Diagram labels: Container, Metering, VNET, SLB, ACLs; Inbound (Egress), Outbound (Ingress)
  Sample VFP state:
    LAYER : SLB_NAT_LAYER
      Friendly name : SLB_NAT_LAYER
      Flags : 0x9 Default Allow , Update flows on address change
      Priority : 50
      GROUP : SLB_GROUP_NAT_IPv4_OUT
        Friendly name : SLB_GROUP_NAT_IPv4_OUT
        Priority : 300
        Direction : OUT
        Type : IPv4
        Conditions: <none>
        Match type : Priority-based match
        RULE :
          Friendly name : SNAT_TCP_OUTBOUNDNAT_7e101_192.168.0.143
          Priority : 100
          Flags : 3 terminating stateful
          Type : dynnat
          Conditions :
            Protocols : 6
            Source IP : 172.31.1.36
          Flow TTL: 240
          MSS delta: 50,0
          Type : Source NAT; NAT pool : SLB_NATPOOL_IPV4
          FlagsEx : 0

9.Huawei Windows Container Cluster (diagram labels)
  Huawei K8s Cluster on Huawei SDN
  K8s Master Node (Linux): Linux OS, IPTABLES/IPVS, Ican CNI & IPAM, Container Runtime, Kubelet, Kube-proxy, CRI, Storage Plugins (EVS, OBS), CNI, CSI, ICAgent
  K8s Worker Node (Windows): Windows OS, Host Compute Service (HCS), Host Network Service (HNS), Container Runtime, Kubelet, Kube-proxy, CRI, Storage Plugins (EVS, OBS), CNI, CSI, ICAgent
  Also labeled: Ican CNI & IPAM, VSWITCH, VFP

10.K8s + CRI + CNI (diagram labels)
  Kubelet: GRPC Client, CNINetworkPluginMgr
  DockerShim: CRI Shim, GRPC Server
  Docker Engine, CNIPlugin
  Pod: Pause, C1, C2
  Network, EP

11.CNI and CNIPlugin
  What is CNI
  What is CNIPlugin
  What CNIPlugin does
  How to implement CNIPlugin
  How to run CNIPlugin
  Diagram labels: VSwitch with VFP (SNAT, DNAT, LBNAT; VXLAN ENCAP, DECAP; ACLs, QoS; Forwarding); POD1 NS1; ROOT Namespace; Ethernet, vEthernet

12.CNIPlugin Execution Flow: Add

13.Host Networking Service APIs
  All snippets use: import "github.com/Microsoft/hcsshim/hcn"

  Create Network:
    hcnNetworkConfig = &hcn.HostComputeNetwork{
        Name: info.Name,
        Type: hcn.NetworkType(info.Type),
        Ipams: []hcn.Ipam{
            hcn.Ipam{
                Type:    "Static",
                Subnets: subnets,
            },
        },
        Dns: hcn.Dns{
            Suffix:     info.DNS.Suffix,
            ServerList: info.DNS.Servers,
        },
        SchemaVersion: hcn.SchemaVersion{
            Major: 2,
            Minor: 0,
        },
        Policies: hcnPolicies,
    }

  Delete Network:
    hcnNetwork, err := hcn.GetNetworkByID(networkID)
    if err != nil {
        return err
    }
    _, err = hcnNetwork.Delete()

  Create Namespace:
    func NewNetNS() (*NetNS, error) {
        temp := hcn.HostComputeNamespace{}
        hcnNamespace, err := temp.Create()
        if err != nil {
            return nil, err
        }
        // The namespace ID doubles as the NetNS "path".
        return &NetNS{path: string(hcnNamespace.Id)}, nil
    }

  Delete Namespace:
    func (n *NetNS) Remove() error {
        n.Lock()
        defer n.Unlock()
        if !n.closed {
            hcnNamespace, err := hcn.GetNamespaceByID(n.path)
            if err == nil {
                hcnNamespace.Delete()
                n.closed = true
            }
        }
        if n.restored {
            n.restored = false
        }
        return nil
    }

  Create Endpoint:
    func AddHcnEndpoint(epName string, expectedNetworkId string, namespace string,
        makeEndpoint HcnEndpointMakerFunc) (*hcn.HostComputeEndpoint, error) {
        // Reuse an existing endpoint only if it is on the expected network.
        createEndpoint := true
        hcnEndpoint, err := hcn.GetEndpointByName(epName)
        if hcnEndpoint != nil && hcnEndpoint.HostComputeNetwork == expectedNetworkId {
            createEndpoint = false
        }
        if createEndpoint {
            if hcnEndpoint, err = makeEndpoint(); err != nil {
                return nil, err
            }
            if hcnEndpoint, err = hcnEndpoint.Create(); err != nil {
                return nil, err
            }
        }
        // hcn.AddNamespaceEndpoint takes the namespace ID first, then the endpoint ID.
        if err = hcn.AddNamespaceEndpoint(namespace, hcnEndpoint.Id); err != nil {
            return nil, err
        }
        return hcnEndpoint, nil
    }

  Delete Endpoint:
    func RemoveHcnEndpoint(epName string) error {
        hcnEndpoint, err := hcn.GetEndpointByName(epName)
        if err != nil {
            return err
        }
        if hcnEndpoint != nil {
            hcnEndpoint, err = hcnEndpoint.Delete()
            if err != nil {
                return err
            }
        }
        return nil
    }

  Create LoadBalancer:
    func (loadBalancer *HostComputeLoadBalancer) Create() (*HostComputeLoadBalancer, error) {
        // HNS takes the load balancer settings as a JSON document.
        jsonString, err := json.Marshal(loadBalancer)
        if err != nil {
            return nil, err
        }
        loadBalancer, hcnErr := createLoadBalancer(string(jsonString))
        if hcnErr != nil {
            return nil, hcnErr
        }
        return loadBalancer, nil
    }

  Delete LoadBalancer:
    func (loadBalancer *HostComputeLoadBalancer) Delete() (*HostComputeLoadBalancer, error) {
        if err := deleteLoadBalancer(loadBalancer.Id); err != nil {
            return nil, err
        }
        return nil, nil
    }

14.Huawei Windows Container Network
  L2Bridge mode together with L3 overlay routing
  The cbr0 endpoint is used as the default gateway for each pod, with forwarding enabled
  https://github.com/containernetworking/plugins/tree/master/pkg/hns
  https://github.com/containernetworking/plugins/tree/master/plugins/main/windows

15.Demo
  Create a Windows cluster
  Check the CNI plugin
  Set up Docker and pull the customer application (IIS + ASP.NET) image
  Deploy the app
  Collect tracing
  Walk through the trace

16.Debugging Windows Container Networking Issues
  Look at the Kubelet and Kube-proxy logs
  Look at the CNI spec and logs
  Look at the HNS state using the collect-logs script
  Look at the VFP state to check that all policies are programmed correctly
  Get a packet capture and trace the packet drops
  Example: a container curls POD IP:Port
    HOST 1
      Packet reaches the container port, no NAT
      Packet gets forwarded to cbr0
      TCPIP route lookup happens and the packet gets forwarded to the external port
      Packet gets forwarded to the other host
    HOST 2
      TCPIP stack drops the packet due to no endpoint
      Packet gets forwarded to the container port, no stateful NAT
      Packets reach the external port

17.Contact Details
  Cindy Xing, Senior Cloud Architect @ Huawei
    Email: li.xing1@Huawei.com
    LinkedIn: cindyxing
  Dinesh Govindasamy, Principal Engineering Manager @ Microsoft
    Email: dingov@Microsoft.com
    LinkedIn: dineshgovindasamy

19.Windows Network Drivers for Containers
  NAT: MAC and IP rewritten
  Overlay: encapsulated with an outer header
  L2 Bridge: MAC rewritten, IP visible on the underlay network
  L2 Tunnel: MAC rewritten, IP visible on the underlay network
  Nuances between L2 bridge and "tunnel"
    L2 Bridge
      Container-to-container traffic is bridged inside the container host
      Network policies are not applied to intra-host traffic between containers
    "L2 Tunnel" (Public Cloud mode)
      All traffic is forwarded to the external router or (Azure) fabric host
      Network policies for each individual endpoint are enforced on the physical host