Running Multi-process DPDK App on Kubernetes with Operator SDK

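This talk discusses a way to run a DPDK multi-process application on Kubernetes by using Operator SDK. We have developed Soft Patch Panel (SPP), a DPDK-based tool for Service Function Chaining in NFV environments; it connects DPDK applications running on hosts, virtual machines, and containers. DPDK applications can be run on Kubernetes with Multus, but the supported network interface types are still limited. SPP has several types of PMD, such as physical, vhost, and ring. We have achieved zero-copy packet forwarding between DPDK container applications on Kubernetes by using Operator SDK, a toolkit for managing Kubernetes-native applications. An Operator can manage complex stateful applications on Kubernetes and is well suited to managing multi-process applications. For SPP, we defined a custom resource manager through which users can organize the processes via the Kubernetes CLI. On the implementation side, Operator SDK is a set of tools for scaffolding and code generation that quickly bootstraps a new project, so that the application can be deployed quickly.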


1. Running Multi-process DPDK App on Kubernetes with Operator SDK • Yasufumi Ogawa, NTT

2.Agenda • Background • SPP (Soft Patch Panel) • Requirements • Container CNIs for DPDK • Operator SDK • Workflow • Definitions for Custom Resources and Logics • Implementation • Demo

3.Background • Service Function Chaining for Cloud Native • Flexibility, Maintainability and High-Performance • Integration with cloud management system (OpenStack, Kubernetes) [Figure: a variety of service apps on VMs and containers; labels: Load Balancer, Monitoring, Security, Web Service, L2 Switch, L3 Router, MPLS, Audio, Video, Firewall, DPI]

4.SPP (Soft Patch Panel) • A network appliance consuming DPDK • Simple and light-weight forwarding mechanism • Implemented as a multi-process application [Figure: SPP between VMs and containers; labels: Virtual Ports, Physical Ports]

5.Architecture • DPDK multi-process application • primary: resource manager • secondary: worker process for packet forwarding, classifying, or capturing • SPP Controller consists of CLI and backend REST-API server for managing DPDK processes • Boot order: 1. controller (spp-ctl) 2. primary (spp_primary, the resource manager) 3. secondaries (forwarder processes such as spp_nfv) 4. client apps [Figure: a host with spp-ctl, the SPP CLI, spp_primary, and several spp_nfv processes; client apps run on the host, in a VM, and in a container, attached through ports]

6.Container Chaining Performance • High performance packet forwarding between DPDK containers via shared memory (zero-copy) [Figure: throughput in Mpps (axis 0.00 to 16.00) versus number of containers (2 to 10); pktgen on host2 is connected through ports to the container chain on host1] • Test environment: Supermicro Mini Tower Intel Xeon D-1587 • CPU: Intel Xeon D-1587 (1.7 GHz, 32 cores) • Memory: 32GB • SSD: Intel SSDSC2BB240G6 • OS: Linux (Ubuntu 16.04 LTS) • Traffic: 64byte / 10GB • DPDK v18.02 • pktgen-dpdk v3.4.9 • SPP v18.02

7.Integration • OpenStack • Neutron ML2 plugin • Kubernetes • Kubernetes CNIs (SR-IOV, virtio) • Multus

8.OpenStack Plugin • ML2 plugin networking-spp for managing SPP’s network • Flat and VLAN network types are available [Figure: networking-spp architecture; labels: Controller Node, Neutron Server, Core Neutron REST API, Neutron ML2 Plugin, Type Drivers, Mechanism Drivers, SppMechanism Driver, etcd, Compute Nodes, SppAgent, SppVf, SppMirror, SPP Network]
$ openstack port create p2 --network net2
$ openstack server add port server1 p2

9.Kubernetes CNIs for DPDK • Multus plugin • Several CNIs such as SR-IOV or virtio [Figure: a Pod running a DPDK app on Kubernetes; labels: eth0, veth, docker0 bridge, flannel, flanneld, CNI, Multus, vfio, SR-IOV VF, virtio, vhost, vSwitch]

10.Requirements • Define resources, such as core or memory assignment • Define network configurations • Keep the boot order of processes • Support several types of network interfaces other than tap, such as SR-IOV, vhostuser, and ring • High Availability (disaster recovery, auto-healing) • Manage versions and phases: rolling update and roll back; development phase, production phase, etc.

11. Usecase • Manage several types of VNF with Kubernetes • Clustering for High Availability • Rolling update, or roll back • Configure several IFs between containers • Keep the boot order of processes and assign resources to each app [Figure: a Node with one Pod of VNF containers (DPDK apps and a DPDK secondary) and one Pod of SPP containers (SPP primary and SPP secondaries); labels: virtio, vhost, ring]

12.Operator SDK • Enables developers to build Operators • An Operator is an application-specific controller that extends Kubernetes to create, configure, and manage instances of complex applications • Makes it easy to manage complex stateful applications on top of Kubernetes • Combination of CRD and Controller • Tools for scaffolding and code generation to quickly bootstrap a new project

13.How Operator Works? • Operators are processes that connect to the master API and watch for events, typically on a limited number of resource types • Implemented as a collection of controllers, where each controller watches a specific resource type • When a relevant event occurs on a watched resource, a reconcile cycle is started • The controller checks that the current state matches the desired state described by the watched resource [Figure: the Operator watches the Master API and issues get/create/update requests; labels: Etcd, Master API, Operator, Other system managers]
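The reconcile cycle described on this slide can be sketched in a few lines of Go. This is a minimal illustration using only the standard library, not the controller generated by Operator SDK: the function names `nextAction` and `reconcile` are hypothetical, the pod names are borrowed from the `kubectl get all` output later in the deck, and marking a pod as running stands in for creating it through the Kubernetes API.

```go
package main

import "fmt"

// bootOrder is the desired state: pods listed in the order SPP processes
// must start (controller, then primary, then secondaries).
// The names are taken from the deck's kubectl output; the list itself is
// an assumption made for this sketch.
var bootOrder = []string{"spp-ctl-pod", "spp-primary-pod", "spp-nfv1-pod"}

// nextAction compares the observed state with the desired state and returns
// the first pod in bootOrder that is not running, or "" if nothing is missing.
func nextAction(running map[string]bool) string {
	for _, name := range bootOrder {
		if !running[name] {
			return name
		}
	}
	return ""
}

// reconcile repeats reconcile cycles until the observed state matches the
// desired state, and returns the pods it "created" in order.
func reconcile(running map[string]bool) []string {
	var created []string
	for {
		name := nextAction(running)
		if name == "" {
			return created
		}
		// Stand-in for creating the pod and waiting until it is Running.
		running[name] = true
		created = append(created, name)
	}
}

func main() {
	// Only the controller is running; primary and secondary are missing.
	running := map[string]bool{"spp-ctl-pod": true}
	fmt.Println(reconcile(running)) // prints [spp-primary-pod spp-nfv1-pod]
}
```

Handling one missing process per cycle is one simple way to preserve the boot order, because a secondary is never considered before the primary is up.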

14.OperatorHub • Many Operators are released on OperatorHub for apps whose specific requirements Kubernetes itself cannot support

15.Operator SDK Workflow The SDK provides the following workflow to develop a new operator 1. Create a new Operator project using the SDK command line interface 2. Define new resource APIs by adding Custom Resource Definitions (CRDs) 3. Specify resources to watch using the SDK API 4. Define the Operator reconciling logic in a designated handler and use the SDK API to interact with resources 5. Use the SDK CLI to build and generate the Operator deployment manifests

16.Scaffolding
(1) Create a new operator project
$ operator-sdk new spp-operator
INFO[0000] Creating new Go operator 'spp-operator'.
INFO[0000] Created go.mod
INFO[0000] Created tools.go
INFO[0000] Created cmd/manager/main.go
INFO[0000] Created build/Dockerfile
INFO[0000] Created build/bin/entrypoint
INFO[0000] Created build/bin/user_setup
INFO[0000] Created deploy/service_account.yaml
INFO[0000] Created deploy/role.yaml
INFO[0000] Created deploy/role_binding.yaml
INFO[0000] Created deploy/operator.yaml
INFO[0000] Created pkg/apis/apis.go
INFO[0000] Created pkg/controller/controller.go
INFO[0000] Created version/version.go
INFO[0000] Created .gitignore
INFO[0000] Validating project
go: finding master ....

17.Project Scaffolding Layout

18.Project Scaffolding Layout
cmd • Contains manager/main.go which is the main program of the operator • Manager registers all custom resource definitions under pkg/apis/... and starts all controllers under pkg/controller/...
pkg/apis • Defines the APIs of the Custom Resource Definitions (CRD) • Users edit the pkg/apis/<group>/<version>/<kind>_types.go files to define the API for each resource • Controllers watch for these resource types
pkg/controller • Contains the controller implementations • Users edit pkg/controller/<kind>/<kind>_controller.go to define the controller's reconcile logic for handling a resource type of the specified kind
build • Contains the Dockerfile and build scripts used to build the operator
deploy • Contains various YAML manifests for registering CRDs, setting up RBAC, and deploying the operator as a Deployment
Gopkg.toml, Gopkg.lock • The Go Dep manifests that describe the external dependencies of this operator
vendor • The golang vendor folder that contains the local copies of the external dependencies that satisfy the imports of this project • Go Dep manages the vendor directly

19. Add Custom Resource Definition
(2) Define new resource APIs by adding Custom Resource Definitions
$ operator-sdk add api \
    --kind=Spp
INFO[0000] Generating api version for kind Spp.
INFO[0000] Created pkg/apis/spp/group.go
INFO[0000] Created pkg/apis/spp/v1/spp_types.go
INFO[0000] Created pkg/apis/addtoscheme_spp_v1.go
INFO[0000] Created pkg/apis/spp/v1/register.go
INFO[0000] Created pkg/apis/spp/v1/doc.go
INFO[0000] Created deploy/crds/spp.yasufum.github.com_v1_spp_cr.yaml
INFO[0000] Created deploy/crds/spp.yasufum.github.com_spps_crd.yaml
INFO[0000] Running deepcopy code-generation for Custom Resource group versions: [spp:[v1], ]
INFO[0008] Code-generation complete.
INFO[0008] Running OpenAPI code-generation for Custom Resource group versions: [spp:[v1], ]
INFO[0017] Created deploy/crds/spp.yasufum.github.com_spps_crd.yaml
INFO[0017] Code-generation complete.
INFO[0017] API generation complete.

20.Add Custom Resource Definition • pkg/apis/<kind>/<kind>_types.go • Add definitions of resources as JSON schema
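As a rough illustration of what such resource definitions might look like, the sketch below declares Go structs with JSON tags and decodes a spec from JSON. All type and field names (`AppSpec`, `SppSpec`, `SppStatus`, `ctl`, `primary`, `nfvs`) are assumptions made for this example and are not taken from the slides. In a generated `<kind>_types.go`, the types would also embed Kubernetes metadata, which is omitted here so the example stays standard-library only.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// AppSpec describes one SPP process to launch (hypothetical fields).
type AppSpec struct {
	Name  string `json:"name"`
	Image string `json:"image"`
}

// SppSpec is a hypothetical desired state for the Spp custom resource:
// one controller, one primary, and any number of secondaries.
type SppSpec struct {
	Ctl     AppSpec   `json:"ctl"`
	Primary AppSpec   `json:"primary"`
	Nfvs    []AppSpec `json:"nfvs,omitempty"`
}

// SppStatus is the observed state a controller could report back.
type SppStatus struct {
	Phase string `json:"phase,omitempty"`
}

// parseSpec decodes the spec section of a custom resource from JSON.
func parseSpec(raw []byte) (SppSpec, error) {
	var spec SppSpec
	err := json.Unmarshal(raw, &spec)
	return spec, err
}

func main() {
	// The image name appears in the operator log; the layout is assumed.
	raw := []byte(`{
	  "ctl":     {"name": "spp-ctl",     "image": "sppc/spp-ubuntu:18.04"},
	  "primary": {"name": "spp-primary", "image": "sppc/spp-ubuntu:18.04"},
	  "nfvs":   [{"name": "spp-nfv1",    "image": "sppc/spp-ubuntu:18.04"}]
	}`)
	spec, err := parseSpec(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(spec.Primary.Name, len(spec.Nfvs))
}
```

The JSON tags determine the field names users write in the `spec` section of the custom resource YAML.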

21.Add Controller
(4) Define reconciling logic in a designated handler and use the SDK API to interact with resources
$ operator-sdk add controller \
    --kind=Spp
INFO[0000] Generating controller version for kind Spp.
INFO[0000] Created pkg/controller/spp/spp_controller.go
INFO[0000] Created pkg/controller/add_spp.go
Generated controller: pkg/controller/<kind>/<kind>_controller.go

22. Add Controller • Users modify the generated controller, pkg/controller/<kind>/<kind>_controller.go, with their own business logic and models • Add models and business logic to the template [Figure: controller source code with two callouts: "Model for defining Pods" and "Functions for managing Pods and Apps"]

23.YAML Definitions • Define resources and configurations in spec [Figure: YAML manifest with two callouts: "Controller and SPP primary" and "SPP Secondary"]

24.Activate Operator and Custom Resources
(5) Use the SDK CLI to build and generate the Operator deployment manifests
# Setup CRD
$ kubectl apply -f deploy/crds/spp.example.com_sppc_crd.yaml
# Setup Service Account
$ kubectl create -f deploy/service_account.yaml
# Setup RBAC
$ kubectl create -f deploy/role.yaml
$ kubectl create -f deploy/role_binding.yaml
# Deploy the app-operator
$ kubectl create -f deploy/operator.yaml
# Deploy custom resources
$ kubectl apply -f deploy/crds/spp.example.com_sppc_cr.yaml

25.Pods and Services on Kubernetes
[Callout: the Operator is also deployed on Kubernetes with the DPDK apps]
$ kubectl get all
NAME                                         READY   STATUS    RESTARTS   AGE
pod/spp-ctl-pod                              1/1     Running   0          66s
pod/spp-l2fwd-pod                            1/1     Running   0          37s
pod/spp-nfv1-pod                             1/1     Running   0          57s
pod/spp-operator-prototype-cc77d6f64-cnljf   1/1     Running   0          18m
pod/spp-primary-pod                          1/1     Running   0          61s
pod/spp-testpmd-pod                          1/1     Running   0          6s

NAME                             TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)
service/kubernetes               ClusterIP                <none>        443/TCP
service/spp-ctl-service          ClusterIP                <none>        5555/TCP,6666/TCP,7777/TCP
service/spp-operator-prototype   ClusterIP                <none>        8383/TCP

NAME                                     READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/spp-operator-prototype   1/1     1            1           18m

NAME                                               DESIRED   CURRENT   READY   AGE
replicaset.apps/spp-operator-prototype-cc77d6f64   1         1         1       18m

26.Inspect Operator’s Log
$ kubectl logs spp-operator-prototype-cc77d6f64-cnljf
[Callout: start operator’s controller and service]
EventSource","controller":"spp-controller","source":"kind source: /, Kind="}
{"level":"info","ts":1573179618.0237935,"logger":"kubebuilder.controller","msg":"Starting EventSource","controller":"spp-controller","source":"kind source: /, Kind="}
...
{"level":"info","ts":1573179704.4574835,"logger":"controller_spp","msg":"Creating a new Service","Service Name":"spp-ctl-service","Service Spec":{"ports":[{"name":"primary","protocol":"TCP","port":5555,"targetPort":5555},{"name":"secondary","protocol":"TCP","port":6666,"targetPort":6666},{"name":"cli","protocol":"TCP","port":7777,"targetPort":7777}],"selector":{"spp":"ctl"}}}
[Callout: launch DPDK container app with options]
...
Spec":{"volumes":[{"name":"hugepages","hostPath":{"path":"/dev/hugepages"}},{"name":"dpdk","hostPath":{"path":"/var/run"}},{"name":"tmp","hostPath":{"path":"/tmp"}},{"name":"nic","hostPath":{"path":"/sys/devices"}},{"name":"lib","hostPath":{"path":"/lib/sys/systemd-coredump"}}],"containers":[{"name":"container","image":"sppc/spp-ubuntu:18.04","command":["/bin/bash","-c"],"args":["spp_vf -l 2,10,12,14,16 -n 4 1024,0 --proc-type secondary -- --client-id 2 -s --vhost-client"],"resources":{"limits":{"hugepages-1Gi":"1Gi","memory":"1Gi"}},"volumeMounts":[{"name":"hugepages","mountPath":"/dev/hugepages"},{"name":"dpdk","mountPath":"/var/run"},{"name":"tmp","mountPath":"/tmp"},{"name":"nic","mountPath":"/sys/devices"},{"name":"lib","mountPath":"/lib/sys/systemd-coredump"}],"securityContext":{"privileged":true}}],"hostPID":true}}
...
[Callout: confirm Primary is running]
{"level":"info","ts":1573179717.802315,"logger":"controller_spp","msg":"Primary Pod Status","Request.Namespace":"default","Request.Name":"spp-demo","Status":"Active"}
[Callout: confirm DPDK container apps are running]
{"level":"info","ts":1573179717.8023398,"logger":"controller_spp","msg":"Current Running Nfv Pods","Request.Namespace":"default","Request.Name":"spp-demo","Nfv Pods":0}
{"level":"info","ts":1573179717.8024166,"logger":"controller_spp","msg":"Current Running Vf Pods","Request.Namespace":"default","Request.Name":"spp-demo","Vf Pods":1}
...

27.Inspect Operator’s Log
[Callout: reconcile if status is updated by user with kubectl]
....
=============================== Reconcile ==================================
type: mirror secId: 1 URL:
CR Components: [{6 mir1-1 [{ring:1 { 0 0}}] [{ring:2 { 0 0}} {ring:3 { 0 0}}] mirror} {8 mir1-2 [{ring:3 { 0 0}}] [{phy:1 { 0 0}} {phy:2 { 0 0}}] mirror}]
CR ClassifierTable: []
Response from REST API: {0 [] [] []}
**********************Reconcile Components Delete Phase*******************************
**********************Reconcile Components Add Phase*******************************
HTTP Request failed : POST &map[name:mir1-1 core:6 type:mirror]
HTTP Request succeeded : POST &map[type:mirror name:mir1-1 core:6]
HTTP Request succeeded : PUT &map[action:attach dir:rx port:ring:1]
...
Spec":{"volumes":[{"name":"hugepages","hostPath":{"path":"/dev/hugepages"}},{"name":"dpdk","hostPath":{"path":"/var/run"}},{"name":"tmp","hostPath":{"path":"/tmp"}}],"containers":[{"name":"container","image":"pipe-test:18.04","command":["/bin/bash","-c"],"args":["pipe_test --lcores 4 -n 4 --file-prefix=test-vhost-1 --base-virtaddr 0x100000000 --single-file-segments --vdev virtio_user1,path=/tmp/sock1,server=1 -- virtio_user1"],"resources":{"limits":{"hugepages-1Gi":"1Gi","memory":"1Gi"}},"volumeMounts":[{"name":"hugepages","mountPath":"/dev/hugepages"},{"name":"dpdk","mountPath":"/var/run"},{"name":"tmp","mountPath":"/tmp"}],"securityContext":{"privileged":true}}]}}
{"level":"info","ts":1573179730.2634828,"logger":"controller_spp","msg":"Reconciling Spp","Request.Namespace":"default","Request.Name":"spp-demo"}
{"level":"info","ts":1573179730.2671142,"logger":"controller_spp","msg":"CTL and CTL ...
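The log above shows a delete phase followed by an add phase when the components in the custom resource differ from what the SPP REST API reports. One way such a comparison could work is sketched below. This is an assumption about the general technique, not the operator's actual code: the `Component` struct and `diffComponents` function are hypothetical, and only the name, core, and type values are borrowed from the log.

```go
package main

import "fmt"

// Component is a hypothetical, simplified view of an SPP component;
// the fields mirror the name/core/type values visible in the log.
type Component struct {
	Name string
	Core int
	Type string
}

// diffComponents returns the components to remove (present in current but
// absent from or different in desired) and the components to create
// (present in desired but absent from or different in current).
func diffComponents(desired, current []Component) (toDelete, toAdd []Component) {
	want := make(map[string]Component, len(desired))
	for _, c := range desired {
		want[c.Name] = c
	}
	have := make(map[string]Component, len(current))
	for _, c := range current {
		have[c.Name] = c
		if w, ok := want[c.Name]; !ok || w != c {
			toDelete = append(toDelete, c) // delete phase
		}
	}
	for _, c := range desired {
		if h, ok := have[c.Name]; !ok || h != c {
			toAdd = append(toAdd, c) // add phase
		}
	}
	return toDelete, toAdd
}

func main() {
	// Desired state from the custom resource; the REST API reports nothing yet.
	desired := []Component{
		{Name: "mir1-1", Core: 6, Type: "mirror"},
		{Name: "mir1-2", Core: 8, Type: "mirror"},
	}
	toDelete, toAdd := diffComponents(desired, nil)
	fmt.Println(len(toDelete), len(toAdd)) // prints 0 2
}
```

Running the delete phase before the add phase means a component whose settings changed is removed first and then recreated with the new values.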

28.Demo

29.Summary • Requirements for flexible and high-performance networking in the Cloud Native era • SPP is an effort to implement flexible and easy-to-use SFC with DPDK technologies • Managing a DPDK multi-process app from Kubernetes • Operator SDK is a framework for managing complex stateful applications on top of Kubernetes • Multi-process apps can be managed from Kubernetes without extra effort by using Operator SDK