Spark Operator: Deploy, Manage and Monitor Spark clusters on Kubernetes

Have you ever wondered how to implement the operator pattern for your own service X in Kubernetes? This session shows how, using the example of an open-source project that spawns Apache Spark clusters on Kubernetes and OpenShift following the pattern. You will leave this talk with a better understanding of how the spark-on-k8s native scheduling mechanism can be leveraged and how you can wrap your own service in the operator pattern, not only in Go but also in Java. The pod with the Spark operator and, optionally, the Spark clusters expose metrics for Prometheus, so it makes it eas...



2. Spark Operator: Deploy, Manage and Monitor Spark clusters on Kubernetes
Jiri Kremser, Red Hat
#UnifiedDataAnalytics #SparkAISummit


4. Pod, Deployment, Service, StatefulSet, ReplicationController, Job

5. Manifest Nightmares

6. Operator Pattern
• Extends Kubernetes
• Resources and Controllers
• Custom Resource Definitions (CRD)
• Reacts to events when a resource is created, read, updated or deleted (CRUD)
• Sometimes referred to as Custom Controllers

7. Operator<X> - example
Speech bubble: "I am listening on CR<X>"
Diagram: Operator, K8s API, CR<X> (CustomResource representing the desired configuration of X)

8. Operator<X> - example
Speech bubble: "OK, whatever ¯\_(ツ)_/¯"
Diagram: Operator, K8s API

9. Operator<X> - example
Speech bubble: "Hey! New resource"
Diagram: Operator, K8s API

10. Operator<X> - example
Speech bubble: "Beep! Beep! Boop! Zzzz! ⚡⚡"
Diagram: Operator, K8s API
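
The "Beep! Boop!" step is where an operator turns the desired configuration stored in CR<X> into actions. A minimal sketch of that idea follows, assuming the resource only carries a desired worker count; the class name, method name and action strings are hypothetical and are not taken from the Spark operator.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: the names below are hypothetical and not taken from any operator library.
class ReconcileSketch {

    // Compare the desired worker count (from the custom resource) with the
    // observed count and return the actions that would close the gap.
    static List<String> reconcile(String cluster, int desiredWorkers, int runningWorkers) {
        List<String> actions = new ArrayList<>();
        // Too few workers: start the missing ones.
        for (int i = runningWorkers; i < desiredWorkers; i++) {
            actions.add("start worker " + i + " for " + cluster);
        }
        // Too many workers: stop the surplus, highest index first.
        for (int i = runningWorkers - 1; i >= desiredWorkers; i--) {
            actions.add("stop worker " + i + " for " + cluster);
        }
        return actions;
    }

    public static void main(String[] args) {
        // The resource asks for 3 workers while 1 is running.
        reconcile("my-cluster", 3, 1).forEach(System.out::println);
        // The resource was scaled down to 0 workers while 1 is running.
        reconcile("my-cluster", 0, 1).forEach(System.out::println);
    }
}
```

When desired and observed state already match, the returned list is empty, which is why replaying the same event twice is harmless in this sketch.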

11. Comparison
An operator can be seen merely as a deployment mechanism, but it can do much more
• Kubernetes manifests
• Helm Chart
• Ansible
• Kustomize
• Ksonnet

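12. Operator minimal example

    # Namespace to watch; falls back to "default"
    namespace=${WATCH_NAMESPACE:-default}
    # Assumes the API is reachable through `kubectl proxy` on its default port
    base=http://localhost:8001
    ns=namespaces/$namespace
    # Stream watch events for ConfigMaps, one event per line
    curl -N -s "$base/api/v1/${ns}/configmaps?watch=true" | \
    while read -r event
    do
      # ...
    done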
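
The loop body in the minimal example is elided. As a rough illustration of what each line on the watch stream carries, the sketch below pulls the event type and the resource name out of one line. It assumes the usual watch format of one JSON object per line with a leading "type" field (ADDED, MODIFIED or DELETED) and treats the first "name" field as metadata.name; the sample line is hand-written, and a real operator would use a JSON parser or a client library instead of regular expressions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: a real operator would use a JSON parser or a Kubernetes client library.
class WatchLineSketch {

    // Matches the top-level "type" field at the start of a watch event line.
    private static final Pattern TYPE =
            Pattern.compile("^\\{\\s*\"type\"\\s*:\\s*\"([A-Z]+)\"");
    // Matches the first "name" field, assumed here to be metadata.name.
    private static final Pattern NAME =
            Pattern.compile("\"name\"\\s*:\\s*\"([^\"]+)\"");

    // Returns "<TYPE> <name>" for a recognised line, or "ignored" otherwise.
    static String describe(String line) {
        Matcher type = TYPE.matcher(line);
        Matcher name = NAME.matcher(line);
        if (!type.find() || !name.find()) {
            return "ignored";
        }
        return type.group(1) + " " + name.group(1);
    }

    public static void main(String[] args) {
        // Hand-written sample line, not captured from a real cluster.
        String line = "{\"type\":\"ADDED\",\"object\":{\"kind\":\"ConfigMap\","
                + "\"metadata\":{\"name\":\"my-cluster\"}}}";
        System.out.println(describe(line));
    }
}
```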

13. Spark Operator
• Started as a toy project
• Adopted by the AI-CoE project
• Compatible with the Spark operator from Google to avoid vendor lock-in
• Also available as a Helm chart or through an Ansible role

14. Spark Operator
Reacts to events from these custom resources:
• SparkCluster
• SparkApplication
• SparkHistoryServer

15. Spark Operator
Reacts to events from these custom resources:
• SparkCluster
• SparkApplication
• SparkHistoryServer
Full schema captured by JSON schema



19. Fabric8 Kubernetes client
• Fluent API
• Type-safety
Takes the credentials from:
• kube config file
• service account token & mounted CA cert
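
To make the two credential sources concrete, here is a small sketch that checks for them on disk. It does not use or reproduce the Fabric8 client: the class is hypothetical, and trying the kube config file before the service account is an assumption made for illustration. The service account directory is the standard location where Kubernetes mounts the token and CA certificate inside a pod.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Sketch only: hypothetical helper, not part of the Fabric8 Kubernetes client.
class CredentialSourceSketch {

    // Standard in-cluster mount point for service account credentials.
    static final Path SA_DIR = Paths.get("/var/run/secrets/kubernetes.io/serviceaccount");

    // Report which credential source is present; the order is an assumption.
    static String pick(Path kubeConfig, Path saDir) {
        if (Files.isReadable(kubeConfig)) {
            return "kubeconfig: " + kubeConfig;
        }
        if (Files.isReadable(saDir.resolve("token")) && Files.isReadable(saDir.resolve("ca.crt"))) {
            return "service account: " + saDir;
        }
        return "none";
    }

    public static void main(String[] args) {
        // Conventional location of the kube config file on a developer machine.
        Path kubeConfig = Paths.get(System.getProperty("user.home"), ".kube", "config");
        System.out.println(pick(kubeConfig, SA_DIR));
    }
}
```

On a laptop this typically reports the kube config file, while inside a pod it reports the mounted service account.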

20. Abstract Operator Library
• Automates the common tasks
• The user only has to extend the class and override a couple of methods
• Supports JSON schema as the representation of the configuration
• CRDs and ConfigMaps (CMs) supported
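
The "extend the class and override a couple of methods" idea can be sketched with a plain template-method class. Everything below is hypothetical (BaseOperator, RecordingOperator, the event type strings as dispatch keys) and is not the actual API of the abstract operator library; it only shows the shape of the pattern.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: class and method names are hypothetical, not the abstract-operator API.
class AbstractOperatorSketch {

    // Base class: owns the event dispatch, subclasses supply the reactions.
    abstract static class BaseOperator<T> {
        abstract void onAdd(T resource);

        abstract void onDelete(T resource);

        // Default: treat a modification as delete followed by add; may be overridden.
        void onModify(T resource) {
            onDelete(resource);
            onAdd(resource);
        }

        // Common task handled once in the base class: route events to the hooks.
        final void handle(String eventType, T resource) {
            switch (eventType) {
                case "ADDED":
                    onAdd(resource);
                    break;
                case "MODIFIED":
                    onModify(resource);
                    break;
                case "DELETED":
                    onDelete(resource);
                    break;
                default:
                    break; // unknown event types are ignored
            }
        }
    }

    // A concrete operator only overrides the two abstract methods.
    static class RecordingOperator extends BaseOperator<String> {
        final List<String> calls = new ArrayList<>();

        @Override
        void onAdd(String name) {
            calls.add("create " + name);
        }

        @Override
        void onDelete(String name) {
            calls.add("delete " + name);
        }
    }

    public static void main(String[] args) {
        RecordingOperator op = new RecordingOperator();
        op.handle("ADDED", "my-cluster");
        op.handle("MODIFIED", "my-cluster");
        op.handle("DELETED", "my-cluster");
        op.calls.forEach(System.out::println);
    }
}
```

Keeping the dispatch in the base class means a new operator only describes what creating and deleting its resource means.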

21. Dependencies
Diagram nodes: operator-parent-pom, spark-operator, abstract-operator, kubernetes-client
Diagram edge labels: "depends on", "has parent"

22. Tooling
• Soit: Python CLI that verifies whether a container image is "operator compliant"
• Ansible role: also supports deploying Prometheus together with the operator
• Oshinko-temaki: CLI that produces valid YAML with custom resources for the operator
All the tools are listed in the readme file

23. Metrics
• Endpoints for Prometheus
• Operator metrics (including JVM metrics)
• Metrics from deployed Spark clusters
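
For readers unfamiliar with what a Prometheus endpoint returns, the sketch below renders a single gauge in the Prometheus text exposition format (HELP line, TYPE line, then one sample with a label). The metric name and label are made up for illustration and are not the metrics the operator actually exports; label values are not escaped here, which a real exporter would need to do.

```java
// Sketch only: the metric name and label are hypothetical examples.
class MetricsSketch {

    // Render one gauge sample in the Prometheus text exposition format.
    static String gauge(String name, String help, String labelKey, String labelValue, double value) {
        return "# HELP " + name + " " + help + "\n"
                + "# TYPE " + name + " gauge\n"
                + name + "{" + labelKey + "=\"" + labelValue + "\"} " + value + "\n";
    }

    public static void main(String[] args) {
        System.out.print(gauge(
                "operator_running_clusters",
                "Spark clusters managed by the operator.",
                "namespace", "default", 2));
    }
}
```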


25. Takeaways
• Spark on K8s can be easy
• An operator can hide complexity
• Operators can be written in any language
• Hopefully in Spark: