What feature you would like to be added?
- Enable the Spark operator to watch SparkApplications and pods by labels (see the watch sketch after this list).
- Allow the leader pod to distribute the range of labels that each pod needs to watch.
- When the pods in the Deployment are updated (scale in / scale out), the leader pod can automatically redistribute the label ranges each pod is responsible for.
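Not the operator's actual wiring, just a minimal client-go sketch of the core idea: each operator pod restricts its list/watch to a label selector so it only reconciles its own shard. The label key spark-operator-label and the value set are assumptions from this proposal; SparkApplications would get the same treatment through their generated or dynamic informer.

```go
package main

import (
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	clientset := kubernetes.NewForConfigOrDie(cfg)

	// Restrict the shared informers so this operator instance only sees
	// objects whose shard label falls inside its assigned range.
	// The key "spark-operator-label" and the value list are assumptions.
	factory := informers.NewSharedInformerFactoryWithOptions(
		clientset,
		30*time.Second,
		informers.WithTweakListOptions(func(opts *metav1.ListOptions) {
			opts.LabelSelector = "spark-operator-label in (0,1,2,3,4,5,6,7,8,9,10)"
		}),
	)

	podInformer := factory.Core().V1().Pods().Informer()
	_ = podInformer // register event handlers here, then start the factory

	stop := make(chan struct{})
	factory.Start(stop)
	factory.WaitForCacheSync(stop)
}
```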
If this is needed, I can take this task and contribute some of our engineering experience to the community.
Why is this needed?
In our k8s production environment, different teams dynamically create and delete their own namespaces for Spark applications.
At peak times, there can be over 1,000 Spark applications in Kubernetes, distributed unevenly across the namespaces.
I hope to introduce a solution that uses labels to tell each operator pod which applications it is responsible for, so the pods in the Deployment can share the workload more evenly and scale more elastically.
Describe the solution you would like
- Our approach is to automatically assign a label to each SparkApplication upon creation, with the label value ranging from 0 to 100, for example spark-operator-label: 99 (a label-assignment sketch follows this list).
- After being elected, the leader node periodically checks how many live pods exist under the current Deployment and distributes, via a ConfigMap, the range of labels each worker pod should watch. For example, pod1 is responsible for labels 0–10, pod2 for 11–20, and so on (see the range-distribution sketch after this list).
- The leader node itself needs to watch for changes to the operator pods in the Deployment, so it can dynamically adjust the ConfigMap accordingly.
- Worker pods need to watch for changes to the ConfigMap and periodically check whether their assigned label range has been updated, to avoid missing any watch targets during scale-down events.
- With this mechanism, we can horizontally scale the Spark Operator Deployment: as long as the number of pods hasn't reached the upper limit of the label range, we can keep scaling up.
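For the shard label on each SparkApplication (first bullet), here is a minimal sketch of deriving a stable value from the application's namespace and name, e.g. in a mutating webhook or the submission path. The key name spark-operator-label and the value space follow the proposal; the hashing scheme itself is an assumption.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// shardLabelValue derives a stable shard value in [0, 100) from the
// application's namespace/name, so the label never has to be chosen by hand.
func shardLabelValue(namespace, name string) string {
	h := fnv.New32a()
	h.Write([]byte(namespace + "/" + name))
	return fmt.Sprintf("%d", h.Sum32()%100)
}

func main() {
	// e.g. a mutating webhook or the submission client would set:
	//   metadata.labels["spark-operator-label"] = shardLabelValue(ns, name)
	fmt.Println(shardLabelValue("team-a", "daily-etl"))
}
```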
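And for the leader-side distribution and worker-side consumption (second to fourth bullets), a hedged sketch of how the label ranges could be computed and turned into watch selectors. The ConfigMap layout (one entry per worker pod) is an assumption, not a settled design.

```go
package main

import (
	"fmt"
	"strings"
)

// splitRanges divides shard values [0, total) into one contiguous range per
// live worker pod, e.g. 3 pods over 100 values -> 0-33, 34-66, 67-99.
func splitRanges(total, pods int) [][2]int {
	ranges := make([][2]int, 0, pods)
	base, extra := total/pods, total%pods
	start := 0
	for i := 0; i < pods; i++ {
		size := base
		if i < extra {
			size++
		}
		ranges = append(ranges, [2]int{start, start + size - 1})
		start += size
	}
	return ranges
}

// selectorFor expands a range into a set-based label selector the worker can
// plug into its list/watch options, e.g. "spark-operator-label in (0,1,...)".
func selectorFor(labelKey string, r [2]int) string {
	vals := make([]string, 0, r[1]-r[0]+1)
	for v := r[0]; v <= r[1]; v++ {
		vals = append(vals, fmt.Sprintf("%d", v))
	}
	return fmt.Sprintf("%s in (%s)", labelKey, strings.Join(vals, ","))
}

func main() {
	// The leader would write these ranges into a ConfigMap keyed by pod name
	// (data["spark-operator-pod-0"] = "0-33", ...); each worker watches the
	// ConfigMap, looks up its own entry, and rebuilds its selector on change.
	for i, r := range splitRanges(100, 3) {
		fmt.Printf("pod-%d: %d-%d  selector: %s\n", i, r[0], r[1], selectorFor("spark-operator-label", r))
	}
}
```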
```mermaid
graph TD
    A[Create SparkApplication with Label] --> B{Leader Election}
    B -->|Leader Elected| C[Leader Node Checks Live Pods]
    C --> D[Update ConfigMap with Label Ranges]
    D --> E[Worker Pods Watch for Changes]
    E --> F{Label Range Updated?}
    F -->|Yes| G[Adjust Watching Labels]
    F -->|No| H[Continue Monitoring]
    C --> I[Monitor Operator Changes]
    I --> J[Adjust ConfigMap if Necessary]
    subgraph Horizontal Scaling
        K[Check Pod Count < Label Range Upper Limit]
        L[Scale Up/Down Deployment]
        K -->|Can Scale Up| L
    end
    J --> K
    G --> K
```
Describe alternatives you have considered
We also tried automatically deploying a Spark operator for each newly created namespace, but that approach wasn't easy to maintain. This is because we frequently upgrade and modify the operator ourselves, and without centralized control, we might encounter compatibility issues during upgrades.
Additional context
No response
Love this feature?
Give it a 👍 We prioritize the features with the most 👍