-
Notifications
You must be signed in to change notification settings - Fork 174
Add support for a new Pod Deletion Policy #4162
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the ✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
261ff5a to
c3a9052
Compare
|
@FireDrunk
|
c3a9052 to
06b7705
Compare
06b7705 to
1bd9977
Compare
derekbit
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
@FireDrunk Can you sign off your commits? Thanks. |
1bd9977 to
a0b61e9
Compare
|
CI failed. Waiting for the fix by @FireDrunk |
|
|
||
| remountRequestDelayDuration = 5 * time.Second | ||
|
|
||
| podDeletionAnnotation = "longhorn.io/allow-deletion" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| podDeletionAnnotation = "longhorn.io/allow-deletion" | |
| podDeletionAnnotation = "longhorn.io/allow-pod-deletion-if-node-down" |
| NodeDownPodDeletionPolicyDeleteBothStatefulsetAndDeploymentPod = NodeDownPodDeletionPolicy("delete-both-statefulset-and-deployment-pod") | ||
| NodeDownPodDeletionPolicyDeleteBothStatefulsetAndDeploymentPodAndAnnotatedPods = NodeDownPodDeletionPolicy("delete-both-statefulset-and-deployment-pod-and-annotated-pods") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we retain one? Given that the new policy is essentially a superset of the current one, delete-both-statefulset-and-deployment-pod.
Additionally, we might consider implementing this resilience for all pods managed by controllers, rather than exclusively for SS and DS workloads, as we currently do for the setting auto-delete-pod-when-volume-detached-unexpectedly handling mentioned below. @derekbit @longhorn/maintainer WDYT?
/controller/kubernetes_pod_controller.go#L634-L636
if ownerRef := metav1.GetControllerOf(pod); ownerRef == nil || kc.isControllerInBlacklist(ownerRef) {
return nil
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additionally, we might consider implementing this resilience for all pods managed by controllers, rather than exclusively for SS and DS workloads, as we currently do for the setting auto-delete-pod-when-volume-detached-unexpectedly handling mentioned below. @derekbit @longhorn/maintainer WDYT?
Agree with the proposal!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've checked, and both our CNPG Operator pods and our Strimzi PodSet pods both have that ownerReference correctly set, which makes me believe that using that GetControllerOf function would work properly.
Does that mean it's no longer useful to have the annotation for pods for which the ownerReference is not explicitly set, and only support properly owned pods?
I agree, but it's a bit of a reduction in functionality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this code change be OK?
SettingNameEnableDeletionOfControllerOwnedPods = SettingName("enable-deletion-of-controller-owned-pods")// Check if we are allowed to delete the pod regardless of its type because it's owned by a Controller.
deletePodsOwnedByControllers, err := kc.ds.GetSettingAsBool(types.SettingNameEnableDeletionOfControllerOwnedPods)
if err != nil {
return err
}
// Check if the pod has an owning Controller reference
shouldDeleteBecauseOwnedByController := false
if deletePodsOwnedByControllers {
if ownerRef := metav1.GetControllerOf(pod); ownerRef != nil && !kc.isControllerInBlacklist(ownerRef) {
shouldDeleteBecauseOwnedByController = true
}
}
// Check if the pod would be delted by a Policy
shouldDeleteBecauseOfPolicy := (deletionPolicy == types.NodeDownPodDeletionPolicyDeleteStatefulSetPod && isOwnedByStatefulSet(pod)) ||
(deletionPolicy == types.NodeDownPodDeletionPolicyDeleteDeploymentPod && isOwnedByDeployment(pod)) ||
(deletionPolicy == types.NodeDownPodDeletionPolicyDeleteBothStatefulsetAndDeploymentPod && (isOwnedByStatefulSet(pod) || isOwnedByDeployment(pod))) ||
(deletionPolicy == types.NodeDownPodDeletionPolicyDeleteBothStatefulsetAndDeploymentPodAndAnnotatedPods && (isOwnedByStatefulSet(pod) || isOwnedByDeployment(pod) || isAnnotatedWithDeletionAnnotation(pod)))
// If both checks fail to decide to delete the pod, don't do anything.
if !shouldDeleteBecauseOwnedByController && !shouldDeleteBecauseOfPolicy {
return nil
}
Which issue(s) this PR fixes:
Issue (longhorn/longhorn#11876)
What this PR does / why we need it:
This PR adds an extra PodDeletionPolicy called:
delete-both-statefulset-and-deployment-pod-and-annotated-pods(for lack of a better name...)It adds an additional option to set an Annotation on a Pod to mark it for explicit deletion.
This helps in cases where the Pod is not managed by a StatefulSet, Deployment or DaemonSet.
This is useful in situations where an Operator or Controller manages the Pod manually.
This requires Administrators to explicitly configure those annotations on the pod before longhorn engages in the deletion of the pod(s).
Special notes for your reviewer:
It's quite hard to test, we will have to setup a 3 node cluster, with a pod that has a pvc that is stuck in Terminating, and see if the longhorn-manager now correctly deletes the pod after it's grace period has expired.
Additional documentation or context
Docs PR here: longhorn/website#1210