Serverless Computing for E-commerce Applications on JD’s Kubernetes Platform

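Fibonacci is an enterprise-grade serverless computing service on JD.com’s e-commerce platform. It is built on top of OpenFaaS and adds significant enhancements and new features that make it better suited to large-scale production applications. Specifically, Fibonacci integrates seamlessly with JD’s internal container cloud platform, which runs one of the world’s largest production Kubernetes clusters. New features include FaaS-optimized elastic scheduling, a network control panel, and GPU support. Fibonacci has been used to power applications on JD Mall, such as intelligent product image processing, automated IT operations, and chatbots, greatly simplifying application development and deployment and improving resource efficiency. We will present the design and implementation of Fibonacci and share the experience and lessons learned from developing and applying serverless computing on a large-scale e-commerce platform.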

1.Fibonacci: JD’s Function as a Service Platform Yuan Chen, Xin Tong, Hui Tu, Dongdong Dai, Junyuan Zeng, Fuze Sun JD.com

2.About JD.com China’s largest retailer, online or offline • The third largest Internet company by revenue • Over 300 million active users A Fortune Global 200 company Largest nationwide e-commerce logistics infrastructure in China • Covering 99% of the population • Able to deliver 90% of orders same- or next-day Strategic partnerships

3.JD Technological Infrastructure Provide and manage containerized infrastructure and platform for JD retail, finance and logistics businesses • Everything in containers • One of the largest Kubernetes clusters in the world – Kubernetes since January 2016 – 30,000 physical servers, 500,000 containers – Multiple clusters across geo-distributed data centers, max cluster with 9000 nodes • CNCF Platinum Member

4.JD Container Platform [Architecture diagram labels: Business (Thailand, Indonesia, 7FRESH, TOPLIFE); Application (Online Services, AI, IoT, Big Data); Service & Platform (Middleware, Storage Service, Elastic Compute Service, Platform Service); Intelligent Operations Management; OS: JDOS (Jingdong Datacenter OS); Infrastructure: Containerized Infrastructure, Geographically Distributed Datacenters, Retail Stores]

5.Overview 1. […] 2. Function as a Service (FaaS) 3. Fibonacci FaaS Platform 4. Use Cases 5. Conclusions

6.Function as a Service

7.Function as a Service (FaaS) • A form of serverless computing • Function (small unit of work) – Development, deployment, maintenance, operation, monitoring, resource scaling • Event-driven execution

8.FaaS at JD: Why? Diverse use case demands: simplified development and automated deployment & management • Web Backend • AI and Big Data • IoT • Scheduled Jobs • Data Processing

9.FaaS at JD: Why? • Containerized Platform and Ecosystem • Fine-grained Elasticity for Resource Efficiency [Figure labels: function, Containerized, Server]

10.FaaS at JD: Why? Service: Productivity • Simplification • Automation Infrastructure: Efficiency • Fine-grained resource management • Elastic scaling [Figure labels: Complexity, Utilization]

11.Fibonacci: JD’s FaaS Platform

12.Fibonacci: From PaaS To FaaS JD’s enterprise-grade FaaS platform • Usability: Enhanced functionality and features • Simplicity: Customized templates and ecosystem integration for multiple use cases • Efficiency: Optimized elastic scheduling [Diagram labels: Developer (Code, Testing, Upload); Function Platform on the JDOS Platform; User (API trigger, Event trigger); Function Trigger; Elastic Execution. Callouts: No need to manage servers; Event-triggered execution; Elastic resource scheduling; Easy and flexible development; Multi-language support]


14.Function Triggers • Request dispatching • Runtime data collection • Access control • Extensibility: DNS, log

15.Function Execution • Request to input, output to response conversions • Function call • Process and thread invocations for different cases [Diagram labels, shown twice: Watcher, /function/handler.py]
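The request-to-input and output-to-response conversion around a function call can be sketched as below. This is a minimal illustration, not Fibonacci's actual Watcher: the names `handle` and `watch`, the JSON payload, and the dict-shaped response are all assumptions; only the `/function/handler.py` layout comes from the slide.

```python
import json
from typing import Any, Callable, Dict


def handle(event: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical user function, as it might live in /function/handler.py."""
    name = event.get("name", "world")
    return {"greeting": f"hello, {name}"}


def watch(raw_body: bytes, handler: Callable[[Dict[str, Any]], Any]) -> Dict[str, Any]:
    """Convert a raw request into handler input and the handler output into a response."""
    try:
        event = json.loads(raw_body or b"{}")       # request -> input
    except json.JSONDecodeError as exc:
        return {"status": 400, "body": json.dumps({"error": str(exc)})}
    try:
        result = handler(event)                      # function call
    except Exception as exc:                         # a failing function must not crash the watcher
        return {"status": 500, "body": json.dumps({"error": str(exc)})}
    return {"status": 200, "body": json.dumps(result)}  # output -> response


response = watch(b'{"name": "JD"}', handle)
bad_response = watch(b"not json", handle)
```

Here the handler runs in the watcher's own process; running it in a separate process or thread, as the slide mentions for different cases, would replace only the `handler(event)` line.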

16.Function Watcher • Invocation logic • Health check: monitoring data collection, heartbeat, status, etc. • Hot update: container volumes, environment variables • Update methods – Container configuration & environment variables – Directory, volume, external storage [Diagram labels: Function invocation, Heartbeat detection, Data collection, Status, Hot replacement]
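Heartbeat-based health checking of function instances could look roughly like the sketch below. The class `HeartbeatMonitor`, the three status strings, and the timeout value are assumptions made for illustration; the slide only states that the Watcher collects heartbeat and status data.

```python
from typing import Dict


class HeartbeatMonitor:
    """Track the last heartbeat per instance and derive a coarse status."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s
        self._last_seen: Dict[str, float] = {}

    def beat(self, instance: str, now: float) -> None:
        # Record the time at which the instance last reported in.
        self._last_seen[instance] = now

    def status(self, instance: str, now: float) -> str:
        last = self._last_seen.get(instance)
        if last is None:
            return "unknown"          # never reported a heartbeat
        return "healthy" if now - last <= self.timeout_s else "unhealthy"


monitor = HeartbeatMonitor(timeout_s=5.0)
monitor.beat("f1-0", now=100.0)
status_soon = monitor.status("f1-0", now=103.0)
status_late = monitor.status("f1-0", now=106.0)
```

Timestamps are passed in explicitly so the logic stays deterministic; a real monitor would read them from a monotonic clock.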

17.Function Container Multiple methods for deployment and update • Function templates – Efficient • Directory, volume & external storage – Smaller images – Update without image rebuilding • Container configuration & environment variables – No image rebuilding – No re-deployment – Rapid update
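Updating a function without rebuilding its image can be illustrated by reloading handler code from a mounted directory whenever the file changes. This is only a sketch under stated assumptions: `HotHandler`, `write_version`, and the `handle` entry point are hypothetical names, and a temporary directory stands in for the mounted volume.

```python
import os
import tempfile
from typing import Any, Callable, Optional


class HotHandler:
    """Reload handler source from a mounted path when its modification time changes."""

    def __init__(self, path: str) -> None:
        self.path = path
        self._mtime_ns: Optional[int] = None
        self._handle: Optional[Callable[[Any], Any]] = None

    def _load(self) -> Callable[[Any], Any]:
        # Compile the source directly so no stale bytecode cache is involved.
        with open(self.path, "r", encoding="utf-8") as f:
            source = f.read()
        namespace: dict = {}
        exec(compile(source, self.path, "exec"), namespace)
        return namespace["handle"]

    def call(self, event: Any) -> Any:
        mtime_ns = os.stat(self.path).st_mtime_ns
        if mtime_ns != self._mtime_ns:      # file was replaced on the volume
            self._handle = self._load()
            self._mtime_ns = mtime_ns
        return self._handle(event)


def write_version(path: str, body: str, mtime_s: int) -> None:
    """Demo helper: write handler source and pin its mtime so the change is detectable."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(body)
    os.utime(path, (mtime_s, mtime_s))


workdir = tempfile.mkdtemp()                 # stands in for a mounted volume
handler_path = os.path.join(workdir, "handler.py")

write_version(handler_path, "def handle(event):\n    return 'v1:' + event\n", 1_000)
hot = HotHandler(handler_path)
first = hot.call("x")

write_version(handler_path, "def handle(event):\n    return 'v2:' + event\n", 2_000)
second = hot.call("x")
```

The container and its image stay untouched between the two calls; only the file on the volume changes.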

18.Elastic Scheduling • Fine-granularity monitoring & logging • Demand-based elastic scaling [Diagram labels: Gateway (QPS, Timeout, Failure); Function (QPS, Exec time, CPU, Memory, Replicas); Prometheus (Pull; Rule A, Rule B, Rule C); Push: Grouping, Inhibition, Silences Mgmt; Function Scale Mgmt; Set Replicas] e.g.: sum by (function_name) (rate(function_invoke_total{code="200",rate="50"}[10s])) / 50 / 0.85 > sum(function_service_count) by (function_name)
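The example rule can be read as "observed QPS, divided by a per-replica rate of 50 and a factor of 0.85, exceeds the current replica count". The sketch below encodes that reading. Interpreting 50 as per-replica QPS capacity and 0.85 as a target utilization is an assumption, since the slide does not label them; `desired_replicas` and the min/max bounds are likewise hypothetical.

```python
import math


def should_scale_up(qps: float, current_replicas: int,
                    per_replica_qps: float = 50.0,
                    target_utilization: float = 0.85) -> bool:
    """Mirror the alert condition: qps / 50 / 0.85 > current replica count."""
    return qps / per_replica_qps / target_utilization > current_replicas


def desired_replicas(qps: float,
                     per_replica_qps: float = 50.0,
                     target_utilization: float = 0.85,
                     min_replicas: int = 1,
                     max_replicas: int = 100) -> int:
    """Replica count that would clear the alert, clamped to assumed bounds."""
    needed = math.ceil(qps / per_replica_qps / target_utilization)
    return max(min_replicas, min(max_replicas, needed))


# 400 QPS on 8 replicas: 400 / 50 / 0.85 is about 9.41, which exceeds 8.
scale_needed = should_scale_up(400, current_replicas=8)
target = desired_replicas(400)
```

With these defaults, 400 QPS triggers the rule at 8 replicas and the target becomes 10.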

19.Scheduling Optimization • Multiple pod pools (min and max sizes) • Round-robin scheduling • Hot deployment for latency-sensitive functions • Cold deployment for latency-tolerant functions • Hot deployment: configuration update and function push • Update label to link/remove a function to/from K8S service • Ready status control
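A pod pool with minimum and maximum sizes, hot deployment onto an already-running pod, and round-robin routing might be modeled as follows. This is an in-memory sketch only: `PodPool` and its methods are hypothetical, and the step of binding a pod to a function stands in for the label update that links a pod to a K8S service.

```python
import itertools
from collections import deque
from typing import Deque, Dict, Iterator, List


class PodPool:
    """Warm pool of pre-started pods waiting to be bound to a function."""

    def __init__(self, min_size: int, max_size: int) -> None:
        if not 0 <= min_size <= max_size:
            raise ValueError("need 0 <= min_size <= max_size")
        self.min_size, self.max_size = min_size, max_size
        self._next_id = 0
        self.idle: Deque[str] = deque()
        self.bound: Dict[str, List[str]] = {}      # function name -> pods bound to it
        self._cursors: Dict[str, Iterator[str]] = {}
        self._refill()

    def _total(self) -> int:
        return len(self.idle) + sum(len(pods) for pods in self.bound.values())

    def _refill(self) -> None:
        # Keep min_size idle pods ready, never exceeding max_size pods overall.
        while len(self.idle) < self.min_size and self._total() < self.max_size:
            self.idle.append(f"pod-{self._next_id}")
            self._next_id += 1

    def hot_deploy(self, function: str) -> str:
        """Bind a running idle pod to the function, avoiding a cold start."""
        if not self.idle:
            raise RuntimeError("pool exhausted")
        pod = self.idle.popleft()
        self.bound.setdefault(function, []).append(pod)
        # Restart the rotation so the new pod is included.
        self._cursors[function] = itertools.cycle(list(self.bound[function]))
        self._refill()
        return pod

    def route(self, function: str) -> str:
        """Round-robin over the pods currently bound to the function."""
        return next(self._cursors[function])


pool = PodPool(min_size=2, max_size=4)
pool.hot_deploy("f1")
pool.hot_deploy("f1")
routed = [pool.route("f1") for _ in range(3)]
```

A cold deployment would instead create a pod on demand, trading startup latency for not holding idle capacity.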

20.Function Template Customization [Diagram labels: Dynamic Dependency, Static Dependency, Minimum Runtime, Watcher, Function Service, Core APIs, Configuration, Startup]

21.Security • Function Code Scanning – Java, Python • Real time monitoring and alarming – FIM, port scanning • Access Control – Token-based authentication & authorization • Network Traffic Monitor – DDoS attack – Network traffic analysis [Diagram labels: Developer, Code Deployment (Publish f1, Publish f2), Source Code Scanner, Fibonacci; Users (Call f1, Call f2), Access Control; Function Instances (f1, f2), Security Agent, Network Traffic Monitor, Alarm Service]
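Token-based authentication and per-function authorization can be sketched with an HMAC-signed token that is scoped to one function. The slide does not describe the platform's actual token scheme, so the token layout, the `SECRET` value, and the functions `issue_token` and `authorize` are all assumptions.

```python
import hashlib
import hmac

SECRET = b"demo-secret"  # hypothetical; a real deployment would load this from a secret store


def _sign(user: str, function: str) -> str:
    message = f"{user}:{function}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def issue_token(user: str, function: str) -> str:
    """Issue a token that lets `user` call exactly one function."""
    return f"{user}:{function}:{_sign(user, function)}"


def authorize(token: str, function: str) -> bool:
    """Authenticate the token (valid signature) and authorize it (scope matches)."""
    try:
        user, scope, signature = token.split(":")
    except ValueError:
        return False                      # malformed token
    expected = _sign(user, scope)
    # Constant-time comparison on bytes avoids timing leaks and non-ASCII errors.
    valid = hmac.compare_digest(signature.encode(), expected.encode())
    return valid and scope == function


token = issue_token("alice", "f1")
allowed = authorize(token, "f1")
denied = authorize(token, "f2")
```

A token issued for `f1` is rejected for `f2` even though its signature is valid, which separates authentication from authorization.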

22.Fibonacci Platform Summary Complete FaaS capabilities: development, deployment, operation, and maintenance Enhanced features and functionalities • Function lifecycle management • Customized function templates • Multi-language support with extensibility: Python, Java, Node.js, Golang • Multi-dimension monitoring and visualization Seamless integration with JD’s container platform • JDOS – JD’s Kubernetes • Triggers: http, JSF gateway, timer, JMQ/Kafka • Storage: JIMDB, CFS, Vitess database Performance optimization and elastic scheduling Security

23.Use Cases

24.Use Case 1: JD Intelligent Speaker Developer Platform • Third-party application development, deployment and management [Diagram labels: user, voice, loudspeaker box, natural language, text, function; developer: choose template, produce, test, deploy template; Development]

25.Use Case 2 • Machine learning templates • Storage integration

26.Use Case 3: JD Mobile Content Services • Monolithic applications to FaaS-based lightweight services • Elastic scheduling [Chart labels: QPS, Fibonacci, APP Server]

27.Fibonacci: JD’s enterprise-grade FaaS platform • Complete and enhanced FaaS capabilities for more efficient development, deployment and management of applications and services. • Optimized elastic resource management for improved IT resource efficiency. Challenges • Use cases: barriers to adopting FaaS • Performance and scalability • Building an ecosystem • The support of Java

28.Open Innovation Open Source Academia Industry JD

29.Thank you! Contacts: Yuan Chen (yuan.chen@jd.com, Wechat: yuan_gt) Xin Tong (tongxin5@jd.com, Wechat: shangshant)