An Operator Framework for Managing Production-Grade Workloads at eBay
1 .Operator Framework to manage Stateful Workloads in eBay
2 .AGENDA • Background • Introduction • Features • Conclusion
3 .BACKGROUND Data Platform: eBay's data platform and shared data ecosystem provide the offline and real-time data that power eBay's vision of buyer experience, seller insights, and an agile, data-driven business, serving eBay employees all around the world. Sites shown: San Jose, Las Vegas, Phoenix. 190+ countries, 600 PB of data.
4 .BACKGROUND • High IOPS • Local Persistent Volume • Disk failure
5 .BACKGROUND • Example: Tiered Kafka Architecture • Example: Active-Active MySQL Cluster
6 .BACKGROUND
Simplify Management: • Multiple Kubernetes Clusters • Internal Dependency • Complex Data Operation (Scaling, Rolling Restart)
Improve Reliability: • Always Available • Resiliency • Highly Scalable • High Performance
Security: • Integrate with existing enterprise security policies
7 .BACKGROUND
StatefulSet: 1. Cross-K8s-cluster deployment 2. Auto recovery from disk failure 3. Defined order for rolling restart
Helm Charts: 1. On-demand configuration changes for related components 2. Customize the Docker images
8 .INTRODUCTION Operator Pattern + Workflow
9 .INTRODUCTION Operator Framework: One Stack Management for Stateful Workloads
10 .FEATURES – Deploy Pattern
1. Declarative interface and free combination
2. Model data applications with K8s resources • Pure Pods + Local Volume • Deployment • DeploymentSets • StatefulSet
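A declarative model that freely combines these resource kinds could look roughly like the sketch below. It is only an illustration: the class names, field names, and the `kafka-demo` application are hypothetical and are not taken from eBay's framework.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class WorkloadKind(Enum):
    # The four modeling options named on the slide.
    PODS_WITH_LOCAL_VOLUME = "PodsWithLocalVolume"
    DEPLOYMENT = "Deployment"
    DEPLOYMENT_SET = "DeploymentSet"
    STATEFUL_SET = "StatefulSet"


@dataclass
class Component:
    # One building block of a data application (hypothetical schema).
    name: str
    kind: WorkloadKind
    replicas: int


@dataclass
class ApplicationSpec:
    # Desired state of a whole application: a free combination of components.
    name: str
    components: List[Component] = field(default_factory=list)

    def total_replicas(self) -> int:
        return sum(c.replicas for c in self.components)


# Hypothetical application that mixes two resource kinds.
spec = ApplicationSpec(
    name="kafka-demo",
    components=[
        Component("broker", WorkloadKind.PODS_WITH_LOCAL_VOLUME, replicas=3),
        Component("rest-proxy", WorkloadKind.DEPLOYMENT, replicas=2),
    ],
)
```

The spec only describes what should exist; turning it into actual Kubernetes objects would be the operator's job and is out of scope for this sketch.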
11 .FEATURES – Deploy Pattern 3. Cross Kubernetes Clusters Deployment and Management
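One simple way to think about cross-cluster deployment is as a placement step that splits an application's replicas across several Kubernetes clusters. The sketch below assumes an even spread; the function name and cluster names are hypothetical, and the slides do not say which placement policy the framework actually uses.

```python
from typing import Dict, List


def spread_replicas(replicas: int, clusters: List[str]) -> Dict[str, int]:
    """Split replicas as evenly as possible across clusters (assumed policy)."""
    if not clusters:
        raise ValueError("at least one cluster is required")
    base, extra = divmod(replicas, len(clusters))
    # The first `extra` clusters each take one additional replica.
    return {
        cluster: base + (1 if index < extra else 0)
        for index, cluster in enumerate(clusters)
    }


placement = spread_replicas(7, ["cluster-a", "cluster-b", "cluster-c"])
```

With seven replicas and three clusters, the first cluster receives three replicas and the other two receive two each.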
12 .FEATURES – Lifecycle Management
1. On-demand management functions to reduce maintenance effort
• Provision / Decommission: REST API and Kubernetes-native API to create or delete a cluster in one step
• Auto Remediation: if a node goes down, a flow is triggered automatically to bring back the missing node
• Scaling: the cluster can flex up and flex down on demand
• Rolling Restart / Upgrade: upgrade the cluster to a new version
• Configuration Management: modify the application configuration parameters and apply the update to the cluster
• Replacement: replace a bad or low-performance node
WISB Management Workflow
• Declarative, with simple YAML syntax that is easy to modify and maintain
• Decomposes complex logic into idempotent, reusable, and retryable tasks
• Designed for failure
• Reusable and parallel subflows
• Special design for group operations
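The idea of decomposing a flow into idempotent, retryable tasks can be illustrated with a minimal runner. This is a sketch under stated assumptions, not the framework's workflow engine: `run_flow`, the task names, and the simulated transient failure are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Task = Callable[[Dict[str, str]], None]


def run_flow(tasks: List[Tuple[str, Task]], state: Dict[str, str],
             max_attempts: int = 3) -> List[str]:
    """Run tasks in order, retrying each failed task up to max_attempts times."""
    log: List[str] = []
    for name, task in tasks:
        for attempt in range(1, max_attempts + 1):
            try:
                task(state)
                log.append(f"{name}: ok (attempt {attempt})")
                break
            except RuntimeError as err:
                log.append(f"{name}: failed (attempt {attempt}): {err}")
        else:
            # No attempt succeeded, so the whole flow stops here.
            raise RuntimeError(f"flow aborted: {name} exhausted {max_attempts} attempts")
    return log


def create_volume(state: Dict[str, str]) -> None:
    # Idempotent: running it again does not create a second volume.
    state.setdefault("volume", "local-pv-0")


_calls = {"start_pod": 0}


def start_pod(state: Dict[str, str]) -> None:
    # Simulates a transient failure on the first call only.
    _calls["start_pod"] += 1
    if _calls["start_pod"] == 1:
        raise RuntimeError("node not ready")
    state["pod"] = "running"


state: Dict[str, str] = {}
log = run_flow([("create_volume", create_volume), ("start_pod", start_pod)], state)
```

Because every task is safe to re-run, a retry after a partial failure cannot leave duplicate resources behind, which is what makes "design for failure" practical.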
13 .FEATURES – Lifecycle Management 2. Reusable common flows and customized WISB flows
14 .FEATURES – Reliability
Health Check: • Scheduled HealthCheck Flow checks application status • Sends a notification by email or Slack if anything unhealthy is detected • Alerting (disk health / disk usage)
Sidecar Agent: • Proxies admin actions and enforces permission control • Views logs and collects metrics • Secret rotation
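A scheduled health-check pass over a cluster might be sketched as follows. The `NodeStatus` fields, the 85% disk-usage limit, and the node names are assumptions made for illustration; the slides do not give the real thresholds or data model, and delivering the alerts to a notification channel is left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NodeStatus:
    name: str
    app_healthy: bool
    disk_healthy: bool
    disk_usage: float  # fraction of capacity in use, 0.0 to 1.0


def health_check(nodes: List[NodeStatus], disk_usage_limit: float = 0.85) -> List[str]:
    """Return one alert message per problem found (hypothetical rules)."""
    alerts: List[str] = []
    for node in nodes:
        if not node.app_healthy:
            alerts.append(f"{node.name}: application unhealthy")
        if not node.disk_healthy:
            alerts.append(f"{node.name}: disk failure")
        if node.disk_usage > disk_usage_limit:
            alerts.append(
                f"{node.name}: disk usage {node.disk_usage:.0%} "
                f"exceeds {disk_usage_limit:.0%}"
            )
    return alerts


alerts = health_check([
    NodeStatus("broker-0", app_healthy=True, disk_healthy=True, disk_usage=0.40),
    NodeStatus("broker-1", app_healthy=True, disk_healthy=True, disk_usage=0.91),
    NodeStatus("broker-2", app_healthy=False, disk_healthy=False, disk_usage=0.10),
])
```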
15 .FEATURES – Security
• Authentication: Keystone authentication and integration with LDAP
• RBAC: grant the CRU permission on a specific namespace; only users who hold the CRU permission can manage the cluster
• Non-root user: security context / setcap
• Standard approval process: sudo action / human approval / trace system
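The namespace-scoped permission check can be sketched as a small lookup table. This assumes that CRU stands for create, read, and update, which the slides do not spell out; the class, method, user, and namespace names are hypothetical.

```python
from typing import Dict, Set, Tuple

# Assumption: "CRU" means create, read, update (delete is not included).
CRU = frozenset({"create", "read", "update"})


class NamespaceRBAC:
    """Tracks which verbs each user may perform in each namespace."""

    def __init__(self) -> None:
        self._grants: Dict[Tuple[str, str], Set[str]] = {}

    def grant_cru(self, user: str, namespace: str) -> None:
        self._grants.setdefault((user, namespace), set()).update(CRU)

    def is_allowed(self, user: str, namespace: str, verb: str) -> bool:
        return verb in self._grants.get((user, namespace), set())


rbac = NamespaceRBAC()
rbac.grant_cru("alice", "kafka-prod")
```

In this sketch a grant applies to one namespace only, so `alice` can update clusters in `kafka-prod` but has no rights in any other namespace.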
16 .CONCLUSION Operator Framework: We use the operator pattern together with a workflow engine to build a framework that makes it easier and more agile to manage multi-component, geo-distributed stateful workloads.
17 .Thanks!