Conversation

@LiangquanLi930
Contributor

@LiangquanLi930 LiangquanLi930 commented Oct 15, 2025

✨ Implement autoscaling from zero by auto-populating AWSMachineTemplate capacity and NodeInfo

What type of PR is this?
/kind feature

What this PR does / why we need it:
This PR implements the Cluster API autoscaling from zero proposal for CAPA by adding a controller that automatically populates AWSMachineTemplate.Status.Capacity with instance type information.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

Special notes for your reviewer:

Checklist:

  • squashed commits
  • includes documentation
  • includes emoji in title
  • adds unit tests
  • adds or updates e2e tests

Release note:

Add autoscaling from zero support with auto-population of AWSMachineTemplate capacity/nodeInfo

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-priority labels Oct 15, 2025
@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Oct 15, 2025
@k8s-ci-robot
Contributor

Hi @LiangquanLi930. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@LiangquanLi930 LiangquanLi930 changed the title from "✨ Implement autoscaling from zero by auto-populating AWSMachineTemplate capacity" to "WIP ✨ Implement autoscaling from zero by auto-populating AWSMachineTemplate capacity" Oct 15, 2025
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 15, 2025
@elmiko

elmiko commented Oct 15, 2025

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Oct 15, 2025
@LiangquanLi930 LiangquanLi930 force-pushed the opt-in-autoscaling-from-zero branch 7 times, most recently from f1ee365 to 3be8f4d Compare October 20, 2025 11:21
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Oct 20, 2025
@LiangquanLi930 LiangquanLi930 force-pushed the opt-in-autoscaling-from-zero branch 2 times, most recently from 01be987 to 915f55b Compare October 20, 2025 12:57
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Oct 20, 2025
@LiangquanLi930 LiangquanLi930 force-pushed the opt-in-autoscaling-from-zero branch from 915f55b to b3850d1 Compare October 20, 2025 14:42
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Oct 20, 2025
@LiangquanLi930 LiangquanLi930 force-pushed the opt-in-autoscaling-from-zero branch 2 times, most recently from 035c7f7 to 03683f1 Compare November 13, 2025 19:15
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Nov 13, 2025
@damdo
Member

damdo commented Nov 15, 2025

/assign @nrb

…deInfo

Add AWSMachineTemplateReconciler to automatically populate capacity and node
info fields by querying AWS EC2 API. This completes the autoscaling from zero
implementation by ensuring the required metadata is available without manual
configuration.

Changes include:
- Add NodeInfo struct with Architecture and OperatingSystem fields to AWSMachineTemplate status
- Implement controller that queries EC2 API for instance type specifications
- Auto-populate CPU, memory, pods, and ephemeral storage capacity
- Auto-detect architecture (amd64/arm64) and OS (linux/windows) from AMI
- Add conversion logic for backward compatibility with v1beta1
- Enable status subresource on AWSMachineTemplate CRD
- Add comprehensive unit tests (351 lines) covering various scenarios
- Add RBAC permissions for controller operations
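As a rough illustration of the capacity auto-population listed above, here is a minimal sketch, not the controller code from this PR. The `instanceTypeInfo` type and `capacityFor` function are hypothetical names invented for the example, and the sketch assumes the vCPU count and memory size (in MiB) have already been fetched from the EC2 API. Pods and ephemeral storage are left out because the commit message does not say how they are derived.

```go
package main

import "fmt"

// instanceTypeInfo is a hypothetical, trimmed-down stand-in for what the EC2
// API reports about an instance type. It is not a CAPA or AWS SDK type.
type instanceTypeInfo struct {
	VCPUs     int64
	MemoryMiB int64
}

// capacityFor renders the instance type data as quantity-style strings keyed
// by resource name, similar in spirit to a status capacity map.
func capacityFor(info instanceTypeInfo) map[string]string {
	return map[string]string{
		"cpu":    fmt.Sprintf("%d", info.VCPUs),
		"memory": fmt.Sprintf("%dMi", info.MemoryMiB),
	}
}

func main() {
	// Illustrative figures only: 2 vCPUs and 8192 MiB of memory.
	c := capacityFor(instanceTypeInfo{VCPUs: 2, MemoryMiB: 8192})
	fmt.Println(c["cpu"], c["memory"])
}
```

A real controller would presumably use typed resource quantities rather than plain strings; strings keep the sketch free of external dependencies.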

The controller automatically populates these fields when an AWSMachineTemplate
is created or updated, eliminating the need for manual configuration and
enabling Cluster Autoscaler to make informed scaling decisions from zero nodes.
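The architecture and OS detection described in this commit message could look roughly like the sketch below. It is an assumption-laden illustration, not the PR's implementation: `nodeInfo` mirrors the two fields named above, `nodeInfoFromAMI` is a hypothetical helper, and the input strings assume that EC2 reports AMI architecture as `x86_64` or `arm64` and marks Windows images with a `windows` platform value (empty otherwise).

```go
package main

import (
	"fmt"
	"strings"
)

// nodeInfo mirrors the two fields the commit message names. The type itself
// is a local stand-in for this example, not the CAPA API type.
type nodeInfo struct {
	Architecture    string
	OperatingSystem string
}

// nodeInfoFromAMI is a hypothetical helper that maps AMI metadata to
// Kubernetes-style architecture and OS names.
// Assumption: amiArch is "x86_64" or "arm64", and amiPlatform is "windows"
// for Windows images and empty for everything else.
func nodeInfoFromAMI(amiArch, amiPlatform string) (nodeInfo, error) {
	var arch string
	switch amiArch {
	case "x86_64":
		arch = "amd64"
	case "arm64":
		arch = "arm64"
	default:
		return nodeInfo{}, fmt.Errorf("unsupported AMI architecture %q", amiArch)
	}

	osName := "linux"
	if strings.EqualFold(amiPlatform, "windows") {
		osName = "windows"
	}
	return nodeInfo{Architecture: arch, OperatingSystem: osName}, nil
}

func main() {
	info, err := nodeInfoFromAMI("arm64", "")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(info.Architecture, info.OperatingSystem)
}
```

Returning an error for unknown architectures, rather than guessing a default, avoids handing the autoscaler node information that might be wrong.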

Related: https://github.com/kubernetes-sigs/cluster-api/blob/main/docs/proposals/20210310-opt-in-autoscaling-from-zero.md

Squashed from 5 commits:
- 9a92a43 Implement autoscaling from zero by auto-populating AWSMachineTemplate capacity
- 86fe072 add AWSMachineTemplate NodeInfo
- ddaf62c Fix review comments
- 4ea52c8 Fix review comments 2
- b398ffc Fix review comments 3
@LiangquanLi930 LiangquanLi930 force-pushed the opt-in-autoscaling-from-zero branch from 03683f1 to ffdf7db Compare November 21, 2025 03:16
@LiangquanLi930
Contributor Author

rebase #5720

@LiangquanLi930
Contributor Author

/retest

@LiangquanLi930 LiangquanLi930 force-pushed the opt-in-autoscaling-from-zero branch from 8ec1d0b to 955bdda Compare November 21, 2025 06:47
@k8s-ci-robot
Contributor

k8s-ci-robot commented Nov 21, 2025

@LiangquanLi930: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-cluster-api-provider-aws-e2e | 3bf0acb | link | false | /test pull-cluster-api-provider-aws-e2e |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@damdo
Member

damdo commented Nov 21, 2025

Hey @LiangquanLi930 can you squash the rebase commit and stack your ones on top? Thanks

@LiangquanLi930
Contributor Author

> Hey @LiangquanLi930 can you squash the rebase commit and stack your ones on top? Thanks

@damdo Hi, I’m keeping the commits separate for now to make rebasing and reviewing easier. Once you think everything looks good, I’ll squash the commits.

@LiangquanLi930 LiangquanLi930 force-pushed the opt-in-autoscaling-from-zero branch from 955bdda to 6493363 Compare November 21, 2025 07:53
@damdo damdo added this to the v2.10 milestone Nov 21, 2025
Add Conditions to AWSMachineTemplateStatus and update controller for CAPI v1.11
API changes.

Squashed from 2 commits:
- ffdf7db Fix review comments 4
- 6493363 rebase kubernetes-sigs#5720
@LiangquanLi930 LiangquanLi930 force-pushed the opt-in-autoscaling-from-zero branch from 6493363 to 9867d5a Compare November 21, 2025 10:44
@chrischdi
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 21, 2025
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 2d4c64dd7e397c439296d69123b29b2e14aff263

@nrb
Contributor

nrb commented Nov 21, 2025

/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: nrb

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 21, 2025
@k8s-ci-robot k8s-ci-robot merged commit 9ea47d0 into kubernetes-sigs:main Nov 21, 2025
21 checks passed
@LiangquanLi930 LiangquanLi930 changed the title from "✨ feat: Implement autoscaling from zero by auto-populating AWSMachineTemplate capacity" to "✨ feat: Implement autoscaling from zero by auto-populating AWSMachineTemplate capacity/nodeInfo" Nov 21, 2025
@richardcase
Member

I know it's already approved, but I also had a look myself and I would've also

/approve

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.

7 participants