Tolerate Machine ProviderID to support slow Infrastructure Providers #51
Conversation
/cc @elmiko

thanks @Mmduh-483, i am going to need some time to review this.
elmiko
left a comment
i'm still reviewing this, and i would like to test it out locally.
@Mmduh-483 i'm curious, have you run this with providers that take more than a minute to return a provider id?
i'm curious how well karpenter tolerates adding the provider id later in the creation cycle.
pkg/cloudprovider/cloudprovider.go
Outdated
```go
// since we have a Machine, we should be reducing the replicas and annotating the Machine for deletion.
return nil, fmt.Errorf("cannot satisfy create, unable to label Machine %q as a member: %w", machine.Name, err)
if machine.Spec.ProviderID == nil {
	return nil, fmt.Errorf("cannot satisfy create, waiting Machine %q to have ProviderID", machine.Name)
```
minor nit
```diff
- return nil, fmt.Errorf("cannot satisfy create, waiting Machine %q to have ProviderID", machine.Name)
+ return nil, fmt.Errorf("cannot satisfy create, waiting for Machine %q to have ProviderID", machine.Name)
```
The Kubernetes infrastructure is provisioned using KubeKey, which assigns the providerID to each node only after Kubernetes has been successfully installed and is running on that node. Karpenter's NodeClaim lifecycle reconciliation initiates a call to the CloudProvider's Create function. If the provisioned Machine does not yet have a providerID, Karpenter logs a transitional error indicating that it is waiting for this identifier, and since Create returned an error, Karpenter automatically re-attempts the reconciliation of the NodeClaim. To maintain the necessary binding between the control plane object and the physical infrastructure, the associated Machine's name and namespace are applied as annotations to the NodeClaim resource.
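Putting that description together, here is a minimal sketch of the flow under discussion. waitForProviderID is a hypothetical helper name, machineNamespaceAnnotation's value is an assumption, and the annotation keys match the patch as it stands at this point in the review (the prefix is revisited further down):

```go
package cloudprovider

import (
	"context"
	"fmt"

	capiv1beta1 "sigs.k8s.io/cluster-api/api/v1beta1"
	"sigs.k8s.io/controller-runtime/pkg/client"
	karpv1 "sigs.k8s.io/karpenter/pkg/apis/v1"
)

const (
	machineAnnotation          = "cluster-autoscaler.kubernetes.io/machine"
	machineNamespaceAnnotation = "cluster-autoscaler.kubernetes.io/machine-namespace" // assumed value
)

// waitForProviderID records the Machine's identity on the NodeClaim, then
// returns a transitional error while the InfrastructureProvider has not yet
// set the ProviderID. karpenter-core treats the error as a signal to retry
// the NodeClaim reconcile, so the Machine is not deleted in the meantime.
func waitForProviderID(ctx context.Context, kubeClient client.Client, nodeClaim *karpv1.NodeClaim, machine *capiv1beta1.Machine) error {
	if nodeClaim.Annotations == nil {
		nodeClaim.Annotations = map[string]string{}
	}
	// Bind the NodeClaim to the Machine so a later reconcile can find the
	// same Machine and pick up its ProviderID once it is assigned.
	nodeClaim.Annotations[machineAnnotation] = machine.Name
	nodeClaim.Annotations[machineNamespaceAnnotation] = machine.Namespace
	if err := kubeClient.Update(ctx, nodeClaim); err != nil {
		return fmt.Errorf("unable to update NodeClaim annotations %q: %w", nodeClaim.Name, err)
	}
	if machine.Spec.ProviderID == nil {
		// Transitional: KubeKey has not finished installing Kubernetes on
		// the node yet, so the ProviderID is still unset.
		return fmt.Errorf("waiting for Machine %q to have ProviderID", machine.Name)
	}
	return nil
}
```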
i appreciate the detailed explanation.
have you run this patch with the kubekey provider? how well does it work?
yes
Autoscaling up/down is working, I did a few rounds of testing.
force-pushed cc5d3ff to f6d3c2d
awesome, thanks! i want to run a local test or two, should be able to finish by end of week.
elmiko
left a comment
i think this is making sense to me. using annotations on the nodeclaim is a novel approach to solving the async provider id.
one recommendation i have: could we combine the machine and namespace annotations into a single annotation, something like cluster-api.kubernetes.io/machine-with-ns: <machine namespace>/<machine name>? would there be any issues with this design?
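for what it's worth, a quick sketch of how that combined annotation could be written and parsed (setMachineRef and machineRef are hypothetical helper names, not part of this PR):

```go
package cloudprovider

import (
	"strings"

	"k8s.io/apimachinery/pkg/types"
	capiv1beta1 "sigs.k8s.io/cluster-api/api/v1beta1"
	karpv1 "sigs.k8s.io/karpenter/pkg/apis/v1"
)

const machineWithNSAnnotation = "cluster-api.kubernetes.io/machine-with-ns"

// setMachineRef stores the Machine's identity on the NodeClaim as a single
// <machine namespace>/<machine name> annotation.
func setMachineRef(nodeClaim *karpv1.NodeClaim, machine *capiv1beta1.Machine) {
	if nodeClaim.Annotations == nil {
		nodeClaim.Annotations = map[string]string{}
	}
	nodeClaim.Annotations[machineWithNSAnnotation] = machine.Namespace + "/" + machine.Name
}

// machineRef parses the annotation back into a NamespacedName; the boolean
// reports whether a well-formed annotation was present.
func machineRef(nodeClaim *karpv1.NodeClaim) (types.NamespacedName, bool) {
	v, ok := nodeClaim.Annotations[machineWithNSAnnotation]
	if !ok {
		return types.NamespacedName{}, false
	}
	ns, name, found := strings.Cut(v, "/")
	if !found || ns == "" || name == "" {
		return types.NamespacedName{}, false
	}
	return types.NamespacedName{Namespace: ns, Name: name}, true
}
```

since Kubernetes object names cannot contain "/", the separator is unambiguous when parsing.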
maxcao13
left a comment
What happens if the cloud provider never sets a providerID on the machine? I have not tried this, but I'm assuming that the NodeClaim would never become registered and Karpenter just retries again. But I am not sure what happens to the Machine after. Does the karpenter-core code call cloudprovider.Delete() eventually if this happens? Is this what you observed?
pkg/cloudprovider/cloudprovider.go
Outdated
```go
if machine.Spec.ProviderID == nil {
	nodeClaim.Annotations[machineAnnotation] = machine.Name
	nodeClaim.Annotations[machineNamespaceAnnotation] = machine.Namespace
	if err = c.kubeClient.Update(ctx, nodeClaim); err != nil {
		return nil, nil, fmt.Errorf("cannot satisfy create, unable to update NodeClaim annotaitons %q: %w", nodeClaim.Name, err)
	}
}
```
Does this have to be conditional? Would there be any harm from always setting these annotations, even if the providerID has already been set?
No need for it to be conditional; updated to always set the annotations.
pkg/cloudprovider/cloudprovider.go
Outdated
```go
	return []cloudprovider.RepairPolicy{}
}

func (c *CloudProvider) createOrGetMachine(ctx context.Context, nodeClaim *karpv1.NodeClaim) (*capiv1beta1.MachineDeployment, *capiv1beta1.Machine, error) {
```
Nothing to do with the implementation, but I think this might benefit from either a better name or a well-worded comment above the function definition.
I think it's a little confusing to name this createOrGetMachine because it might give the impression that the function is using some sort of caching mechanism where we create a Machine if it doesn't already exist. But actually, the "get" is supposed to act as a status check for the asynchronous provisioning.
Renamed to provisionMachine with comment
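for readers skimming the thread, the doc comment on the renamed function might read roughly like this (the body is elided; only the signature comes from the diff above, the wording is a guess):

```go
// provisionMachine creates a CAPI Machine for the NodeClaim if one does not
// exist yet; on subsequent reconciles it fetches the previously created
// Machine instead. The "get" half is a status check on the asynchronous
// provisioning (has the InfrastructureProvider set the ProviderID yet?),
// not a cache lookup.
func (c *CloudProvider) provisionMachine(ctx context.Context, nodeClaim *karpv1.NodeClaim) (*capiv1beta1.MachineDeployment, *capiv1beta1.Machine, error) {
```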
pkg/cloudprovider/cloudprovider.go
Outdated
```go
nodeClaim.Annotations[machineAnnotation] = machine.Name
nodeClaim.Annotations[machineNamespaceAnnotation] = machine.Namespace
if err = c.kubeClient.Update(ctx, nodeClaim); err != nil {
	return nil, nil, fmt.Errorf("cannot satisfy create, unable to update NodeClaim annotaitons %q: %w", nodeClaim.Name, err)
```
nit
```diff
- return nil, nil, fmt.Errorf("cannot satisfy create, unable to update NodeClaim annotaitons %q: %w", nodeClaim.Name, err)
+ return nil, nil, fmt.Errorf("cannot satisfy create, unable to update NodeClaim annotations %q: %w", nodeClaim.Name, err)
```
updated
Currently, if CAPI Karpenter doesn't manage to get a Machine with a ProviderID, it will keep retrying.
force-pushed f6d3c2d to 09f5d40
elmiko
left a comment
thanks for the updates @Mmduh-483, i think we are close. i just noticed one minor change i would like, then i'm ok to approve.
pkg/cloudprovider/cloudprovider.go
Outdated
```go
	taintsKey  = "capacity.cluster-autoscaler.kubernetes.io/taints"
	maxPodsKey = "capacity.cluster-autoscaler.kubernetes.io/maxPods"

	machineAnnotation = "cluster-autoscaler.kubernetes.io/machine"
```
sorry to be a pain about this, i think we should use the cluster-api prefix for this:
| machineAnnotation = "cluster-autoscaler.kubernetes.io/machine" | |
| machineAnnotation = "cluster.x-k8s.io/machine" |
mostly because this annotation is not associated with the cluster-autoscaler. those other annotations have some history with the autoscaler as they originated there in the scale from zero enhancement for cluster-api. i kept the original prefixes on them so that we could share implementation details between karpenter and cluster-autoscaler. but for this machine annotation, we should make it clear that it is related to cluster-api.
updated
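after the rename, the constants block presumably reads along these lines (the grouping is an assumption; only the machine key's new value is shown in the suggestion above):

```go
const (
	// these keys originated in the cluster-autoscaler's scale-from-zero
	// support for cluster-api, so they keep the cluster-autoscaler prefix
	taintsKey  = "capacity.cluster-autoscaler.kubernetes.io/taints"
	maxPodsKey = "capacity.cluster-autoscaler.kubernetes.io/maxPods"

	// the machine annotation is cluster-api specific, hence the new prefix
	machineAnnotation = "cluster.x-k8s.io/machine"
)
```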
Tolerate Machine ProviderID to support slow Infrastructure Providers

This change updates the CloudProvider reconciliation logic to allow the use of a newly created CAPI Machine even if it does not immediately have the ProviderID set.

Currently, the reconciliation will revert the MachineDeployment's replica count and delete the Machine if ProviderID is not assigned within the timeout period. This is problematic for InfrastructureProviders that require significant time to set the ProviderID.

This change allows the reconciliation to proceed with the Machine's creation, preventing premature deletion and supporting providers that have a slow time-to-providerID. The system will continue to wait for the Machine to eventually obtain a ProviderID. This enhances compatibility with slow-to-provision infrastructures.

Signed-off-by: Mamduh Alassi <[email protected]>
force-pushed 09f5d40 to b64a507
elmiko
left a comment
thanks for all the updates @Mmduh-483, this is looking good to me. if @maxcao13 is also good with it, i'm happy to approve.
/lgtm
/lgtm thanks!

/approve
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: elmiko, Mmduh-483

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.