An extension to the Kube-Scheduler, mostly for on-premise Kubernetes clusters
Introduction
When we use affinity rules for our deployments, we often prefer `preferredDuringSchedulingIgnoredDuringExecution`. This rule is a soft requirement that attempts to spread pods across different nodes. However, when only one worker node is in the `Ready` state, all replicas can end up on that single available worker node.
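For reference, a soft anti-affinity rule of this kind typically looks like the manifest below (the deployment name, label, and image are placeholders, not taken from the article's repository):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                # placeholder name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      affinity:
        podAntiAffinity:
          # Soft requirement: the scheduler tries to avoid co-locating
          # pods with the same label, but will not refuse to schedule
          # them on one node if no alternative exists.
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  app: my-app
              topologyKey: kubernetes.io/hostname
      containers:
      - name: my-app
        image: nginx          # placeholder image
```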
This is where this CronJob helps. The script periodically checks node and deployment status and, if necessary, performs a rollout restart to redistribute the pods across nodes. This tool is useful for deployments whose pods should be spread over multiple nodes for redundancy.
Solution Architecture
The script is designed to run as a Kubernetes CronJob that periodically performs checks and initiates deployment rollouts when necessary. The solution uses the Python Kubernetes client to interact with the cluster.
To enable the script for certain deployments, we use a namespace annotation (`collectdeployments: allow`). This makes the script flexible and prevents unnecessary restarts of deployments that don't require this level of redundancy.
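Opting a namespace in might look like this (the annotation key is the one described above; the namespace name is a placeholder):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: my-critical-apps      # placeholder namespace name
  annotations:
    # Namespaces carrying this annotation are scanned by the CronJob.
    collectdeployments: "allow"
```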
The solution includes a Dockerfile for packaging the script, a Kubernetes CronJob to run the check periodically, and RBAC configurations that allow the script to interact with the Kubernetes API.
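The exact manifests live in the repository; a minimal ClusterRole covering the API calls described in this article might look roughly like the sketch below (the role name is a placeholder):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: deployment-restarter  # placeholder name
rules:
# Read access to check node readiness, annotated namespaces, and pod placement.
- apiGroups: [""]
  resources: ["nodes", "namespaces", "pods"]
  verbs: ["get", "list"]
# Patch access is needed to trigger the rollout restart on deployments.
- apiGroups: ["apps"]
  resources: ["deployments"]
  verbs: ["get", "list", "patch"]
```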
How it works
The `restart_deployments_if_needed` function is at the heart of our CronJob. This function:
- Checks the number of `Ready` worker nodes.
- If at least two nodes are ready, fetches the list of namespaces with the `collectdeployments: allow` annotation.
- Fetches the deployments in these namespaces and, for each deployment, checks whether its pods are spread across at least two nodes.
- If all replicas of a deployment are on a single node, performs a rollout restart for that deployment.
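The decision logic behind these steps can be sketched in plain Python. The function names and data shapes below are illustrative, not the repository's actual code: the real script gathers this data via the Python Kubernetes client, and the restart works the way `kubectl rollout restart` does, by bumping a pod-template annotation so the controller rolls new pods.

```python
from datetime import datetime, timezone


def deployments_to_restart(ready_nodes, deployment_pod_nodes):
    """Return the deployments whose replicas all sit on one node.

    ready_nodes: list of node names currently in Ready state.
    deployment_pod_nodes: dict mapping "namespace/deployment" to the list
    of node names its pods run on (illustrative shape; the real script
    builds this from the Kubernetes API).
    """
    # Step 1: with fewer than two Ready nodes, spreading is impossible,
    # so restarting deployments would only cause churn.
    if len(ready_nodes) < 2:
        return []

    # Steps 2-4: flag any deployment whose replicas share a single node.
    return [
        name
        for name, nodes in deployment_pod_nodes.items()
        if nodes and len(set(nodes)) == 1
    ]


def restart_patch():
    """Patch body that emulates `kubectl rollout restart`: changing a
    pod-template annotation makes the Deployment controller replace pods,
    giving the scheduler a fresh chance to honor the soft anti-affinity."""
    stamp = datetime.now(timezone.utc).isoformat()
    return {
        "spec": {
            "template": {
                "metadata": {
                    "annotations": {"kubectl.kubernetes.io/restartedAt": stamp}
                }
            }
        }
    }
```

In the actual CronJob, the output of such a check would be fed to the Kubernetes client's deployment-patch call with a body like `restart_patch()`.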
All the code files are located in this repository.
[Demo recording: a real simulation of how it works]