Which component are you using?:
/area vertical-pod-autoscaler
What version of the component are you using?:
Component version:
VPA 1.6.0
What environment is this in?:
Minikube cluster with kvm2 driver.
What did you expect to happen?:
The Updater shouldn't evict the pod during every update loop in this case.
What happened instead?:
When deploying a Kubernetes Deployment whose Pod contains two containers, enabling VPA on it, and having a LimitRange object of type Pod in the cluster, the Updater keeps evicting the Pod in every update loop (i.e. every minute), even though the recommended target remains the same as before.
How to reproduce it (as minimally and precisely as possible):
The recommender runs with --pod-recommendation-min-memory-mb=50, and the InPlaceOrRecreate mode is enabled across all VPA components.
In addition, the updater runs with the --min-replicas=1 flag.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: resource-consumer
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: resource-consumer
  template:
    metadata:
      labels:
        app: resource-consumer
    spec:
      containers:
      - name: main
        image: gcr.io/k8s-staging-e2e-test-images/resource-consumer:1.9
        resources:
          requests:
            memory: 100Mi
          limits:
            memory: 200Mi
      - name: sidecar1
        image: alpine/curl
        resources:
          requests:
            memory: 50Mi
          limits:
            memory: 100Mi
        command: ["/bin/sh", "-c", "--"]
        args:
        - |
          sleep 2
          curl --data "megabytes=100&durationSec=1200" http://localhost:8080/ConsumeMem
          while true; do
            sleep 10
          done
```

```shell
kubectl apply -f - <<EOF
apiVersion: v1
kind: LimitRange
metadata:
  name: podlimitrange
  namespace: default
spec:
  limits:
  - type: Pod
    max:
      memory: "390Mi"
    min:
      memory: "20Mi"
EOF
```

```shell
# to keep it simple, let's control only memory
kubectl apply -f - <<EOF
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: resource-consumer-vpa
  namespace: default
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: resource-consumer
  updatePolicy:
    updateMode: 'Recreate'
  resourcePolicy:
    containerPolicies:
    - containerName: "*"
      controlledResources:
      - "memory"
EOF
```

Anything else we need to know?:
Let me try to break down why this issue is happening. First of all, I kept my pod running for a while so that the recommendations (Target, UpperBound and LowerBound) are more aligned with the "real" usage of the containers. The main container keeps using 100 MB of memory, while the sidecar1 container doesn't do anything, so it just gets the default minimums. Here are the recommendations in my case:
```
Container Recommendations:
  Container Name: main
  Lower Bound:
    Memory: 126515454   # ~ 120.65 Mi
  Target:
    Memory: 126805489   # ~ 120.93 Mi
  Uncapped Target:
    Memory: 126805489
  Upper Bound:
    Memory: 272071920   # ~ 259.47 Mi
  Container Name: sidecar1
  Lower Bound:
    Memory: 25Mi
  Target:
    Memory: 25Mi
  Uncapped Target:
    Memory: 25Mi
  Upper Bound:
    Memory: 25Mi
```

For the above recommendations, this line returns the following PodPriority struct:
```
PodPriority {OutsideRecommendedRange: true, ScaleUp: false, ResourceDiff: 0}
```
Because OutsideRecommendedRange is true, the remaining checks are short-circuited and the Updater adds the Pod to the UpdatePriorityCalculator, which eventually results in the Pod being evicted.
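For illustration, here is a simplified, stand-alone sketch (in Go, not the actual updater code) of the kind of check behind that struct: a container whose current request falls outside its [LowerBound, UpperBound] range marks the whole Pod as outside the recommended range.

```go
package main

import "fmt"

// Simplified stand-in for the updater's priority result; the field names follow
// the struct quoted above, but the logic here is only an illustration.
type PodPriority struct {
	OutsideRecommendedRange bool
	ScaleUp                 bool
	ResourceDiff            float64
}

// outsideRange reports whether a request lies outside [lowerBound, upperBound].
func outsideRange(request, lowerBound, upperBound int64) bool {
	return request < lowerBound || request > upperBound
}

func main() {
	const mi = int64(1024 * 1024)

	// sidecar1 after proportional capping: request 25Mi, LowerBound 25Mi,
	// capped UpperBound ~17Mi, so the request sits above the upper bound.
	request := 25 * mi
	lower := 25 * mi
	upper := 17 * mi

	p := PodPriority{OutsideRecommendedRange: outsideRange(request, lower, upper)}
	fmt.Printf("%+v\n", p) // {OutsideRecommendedRange:true ScaleUp:false ResourceDiff:0}
}
```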
This occurs because the capProportionallyToPodLimitRange function (link) - which proportionally decreases the UpperBound, LowerBound, and Target values for each container when the constraints defined in a LimitRange object are violated - reduces the UpperBound for both the main and sidecar1 containers. Only the UpperBound values are affected in this case, because only they exceed the specified limits:
- main: 272071920 × 2 = 544143840 bytes ≈ 518.94 Mi
- sidecar1: 25 Mi × 2 = 50 Mi
The sum of the UpperBound limits is therefore 568.94 Mi (maintaining the 1:2 request-to-limit ratio taken from the Deployment spec), which exceeds the 390 Mi maximum specified in the LimitRange object. As a result, the UpperBound values are proportionally reduced to the following:
- main ≈ 177.8 Mi
- sidecar1 ≈ 17 Mi
After recalculating the limits, we get ≈ 389 Mi, derived from (177.8 × 2) + (17 × 2). This total is now below the maximum value specified in the LimitRange object.
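For clarity, here is a rough sketch (in Go) of the proportional capping arithmetic above. It only reproduces the numbers from this report and assumes a single shared scale factor; the real capProportionallyToPodLimitRange logic also handles minimums and other resources.

```go
package main

import "fmt"

func main() {
	const mi = 1024.0 * 1024.0

	// UpperBound requests before capping, with the 1:2 request-to-limit ratio
	// taken from the Deployment spec.
	mainUpper := 272071920.0  // ~259.47 Mi
	sidecarUpper := 25.0 * mi // 25 Mi
	limitRatio := 2.0

	sumLimits := (mainUpper + sidecarUpper) * limitRatio // ~568.94 Mi of limits
	podMax := 390.0 * mi                                 // LimitRange pod-level max

	// Scale every container's UpperBound by the same factor so that the
	// summed limits fit under the LimitRange maximum.
	factor := podMax / sumLimits // ~0.6855

	fmt.Printf("main    capped UpperBound: %.1f Mi\n", mainUpper*factor/mi)    // ~177.9 Mi
	fmt.Printf("sidecar capped UpperBound: %.1f Mi\n", sidecarUpper*factor/mi) // ~17.1 Mi
	fmt.Printf("capped limits sum:         %.1f Mi\n", sumLimits*factor/mi)    // ~390.0 Mi
}
```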
With the new UpperBound values (main ≈ 177.8 Mi and sidecar1 ≈ 17 Mi), this line sets OutsideRecommendedRange to true because:
```
Container Name: sidecar1
  Lower Bound:
    Memory: 25Mi
  Target:
    Memory: 25Mi
  Upper Bound:
    Memory: 17Mi
  # and the current resource request for the `sidecar1` container is 25 Mi (specified in the Pod specification)
```

Based on this, the Updater keeps evicting the Pod until the Recommender lowers the main container's UpperBound enough that, after proportional capping, sidecar1's UpperBound is no longer below its 25 Mi request, which may take some time.
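To make the loop concrete, here is a toy simulation (not VPA code): the recreated Pod always comes back with the capped Target (25 Mi) as its request while the capped UpperBound stays around 17 Mi, so the same out-of-range condition fires on every update loop.

```go
package main

import "fmt"

func main() {
	const (
		cappedTarget     = 25 // Mi, what the admission controller applies on recreation
		cappedUpperBound = 17 // Mi, proportionally reduced by the pod-level LimitRange
	)

	request := 25 // Mi, sidecar1's current request from the Pod spec
	for loop := 1; loop <= 3; loop++ {
		outside := request > cappedUpperBound
		fmt.Printf("loop %d: request=%dMi upperBound=%dMi outsideRange=%v -> evict\n",
			loop, request, cappedUpperBound, outside)
		// The recreated Pod is admitted with the capped Target, still 25Mi,
		// so nothing changes for the next loop.
		request = cappedTarget
	}
}
```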