This repository sets up a load testing environment on GKE with Terraform.
- gcloud >= Google Cloud SDK 349.0.0
- kubernetes-cli >= 1.22.1
- terraform >= 1.0.5
- python >= 3.9 (to generate the diagram)
Copy `Makefile.example` and fill out the attributes below:
| value | description |
|---|---|
| PROJECT_ID | GCP Project ID |
| CLUSTER_NAME | Cluster base name. Because cluster deletion takes time, this tool appends random text to the end of the base cluster name |
| REGION | GCP Region name |
| ZONE | GCP Zone name |
| MACHINE_TYPE | Machine type of the load-generating machines. See the GCP machine types documentation for more details |
| CREDENTIALS | The full path to the Service Account JSON file. |
| SERVICE_ACCOUNT_EMAIL | Service Account email, e.g. `[User name]@[Project name].iam.gserviceaccount.com` |
| TARGET_HOST | Target host URL |
- Navigate to the `deploy` folder and run `make init_all` to set up Terraform.
- Run `make build` to set up a GKE cluster and initialize the `gcloud` command to point to the created GKE cluster.
- Run `make a_locust` to set up `locust` and the required config maps (which store the load test scripts) for performance testing.
- Run `make locust`. This forwards a port to your local machine, so you can access the `Locust Master` at `localhost:8089`.
- Stop `make locust` and run `make refresh`. This refreshes the Locust cluster with the updated `main.py` script file and `values.yaml` content. Once the Locust cluster is up and running, connect to the master with `make locust`.
Run `make d_all`.
Each time the load testing scripts are updated, the workers need to be redeployed so that they read the latest config maps, where the testing scripts are stored, as the Kubernetes specification requires. The steps below let you do the update with one command.
- If you are already connected to the load cluster with `make locust`, press Ctrl+C to stop it.
- All code is stored under the `locust` directory. `main.py` is the main logic, and libraries are under the `lib` directory.
- Once the code is updated, run `make refresh` to reload the `ConfigMap` and the Locust cluster so that it reads the updated config map.
- Run `make locust` again to connect to the load cluster.
To generate load at a lower cost, you may want to use as few workers as possible. The sample steps below show how to adjust the number of users and workers appropriately.
For a target of 10000 RPS, these are the steps I tried.
- Enable HPA, start from 10 workers with 2000 users, and see how much load the Locust cluster can generate. In this case, Locust generated 3000 RPS and saturated there. No CPU errors were observed in Cloud Logging, which implies the CPU was not yet pushed to its limit.
- Assume that 3 times more users would generate roughly 10000 RPS. Change the users to 6000 and run `make refresh` to reload the `ConfigMap` and the Locust pods.
- The workers scaled automatically to 15 and the load exceeded 10000 RPS.
- Set the initial number of workers to `15` in `values.yaml` and run `make refresh` to update the Locust pods.
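The user estimate in the steps above can be sketched as a simple proportion. This is only a rough first guess: it assumes RPS grows linearly with the number of users, which stops holding once workers or the target saturate. The function name `estimate_users` is hypothetical and not part of this repository.

```python
def estimate_users(observed_users: int, observed_rps: int, target_rps: int) -> int:
    """Scale the user count linearly to reach target_rps, rounded up."""
    if observed_users <= 0 or observed_rps <= 0 or target_rps <= 0:
        raise ValueError("all inputs must be positive")
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-observed_users * target_rps // observed_rps)


# Numbers from the steps above: 2000 users produced about 3000 RPS.
users_for_10k = estimate_users(2000, 3000, 10000)
print(users_for_10k)  # 6667
```

The steps above round this down to 6000 users ("3 times more") and let HPA add workers, so treat the result as a starting point and confirm it with a real run.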
If you use `spike_load.py` to generate 10000 RPS with the Locust cluster on GKE, here is a reference configuration.
`spike_load.py` hatches all users at once and holds requests until all users are spawned within each worker (not across all workers).
| parameter | value |
|---|---|
| Machine type of locust worker (`MACHINE_TYPE` in `Makefile`) | e2-standard-2 |
| Replicas for worker (line 66 of `values.yaml`) | 15 |
| User amount (line 15 of `spike_load.py`, `user_amount`) | 10000 |
With these settings:
- The RPS in the first second is around 600.
- It reaches 10000 RPS in 15 to 20 seconds, and then goes higher. You may want to pace the requests with the `constant_pacing` function if you want to target exactly 10000 RPS and dwell (stay) there for a while.
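As a rough illustration of the pacing idea, Locust's `constant_pacing(n)` makes each user start a task at most once every `n` seconds, so the pacing needed for a target rate can be estimated from the user count. The sketch below is plain Python and assumes each task sends exactly one request and that responses return faster than the pacing interval. The helper name `pacing_seconds` is hypothetical.

```python
def pacing_seconds(users: int, target_rps: float) -> float:
    """Seconds per task for each user so that all users together reach target_rps."""
    if users <= 0 or target_rps <= 0:
        raise ValueError("users and target_rps must be positive")
    # Each user contributes at most 1 / pacing requests per second.
    return users / target_rps


pacing = pacing_seconds(10000, 10000)
print(pacing)  # 1.0

# In a Locust user class this value would be used roughly as follows
# (assumption, not taken from this repository's scripts):
#     wait_time = constant_pacing(pacing)
```

With 10000 users and a pacing of 1 second, the cluster is capped at about 10000 RPS instead of overshooting.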
In `spike_load.py`, the line below configures the dwell time. It means dwelling for 120 seconds with `user_amount` users. Adjust the dwell time as needed.

    targets_with_times = Step(user_amount, 120)

While building a testing script, you may want to iterate quickly through trial and error. Loading the testing script onto GKE every time is cumbersome. During the development phase, you can use Docker to run a small cluster locally.
To spin up a small Locust cluster, run

    docker-compose up --build --scale worker=1

and then access the master at `localhost:8089`.
Locust stops with an exception when the load script contains syntax errors. For a faster turnaround, make sure the script works correctly locally before moving on to production.
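Before starting even the local cluster, a quick syntax check can catch the most basic mistakes. The following is a minimal sketch using Python's built-in `compile`. The helper name `find_syntax_error` is hypothetical, and it only detects syntax errors, not runtime failures such as a missing import.

```python
from typing import Optional


def find_syntax_error(source: str, filename: str = "<locustfile>") -> Optional[str]:
    """Return a short error description, or None if the source compiles."""
    try:
        # Compiling parses the code without executing it.
        compile(source, filename, "exec")
    except SyntaxError as err:
        return f"{filename}:{err.lineno}: {err.msg}"
    return None


good_script = "x = 1\n"
bad_script = "def task(:\n    pass\n"

print(find_syntax_error(good_script))             # None
print(find_syntax_error(bad_script) is not None)  # True
```

To check a real script, read its contents first (for example the text of `locust/main.py`, assuming that path matches your checkout) and pass it to the function.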
Run `make help`.
- Go to GCP console > `Services & Ingress`.
- Open `locust-cluster` and scroll down to `Ports`.
- Click the `PORT FORWARDING` button in the `master-p3` row with port `8089`.
- A dialog pops up and displays the port forwarding command. Copy and paste it into the terminal and run it.
- You can access the `locust-cluster` master pod at `localhost:8080` from your browser.
This can be done by just running `make build`, but the steps can also be run separately as below:
- Build the cluster with `make build_cluster`.
- Run `make gcloud_init`. This command configures your `gcloud` environment to point to the newly created GKE cluster.
- Install `Diagrams` by following its installation steps.
- Go to the `docs` directory and run `python diagram.py`.
Autoscaling depends on the Kubernetes Horizontal Pod Autoscaler (HPA). To enable HPA, the Kubernetes manifest needs to include `resources` that specify the pod's resource allocation, so that Kubernetes can manage the pods based on CPU usage.
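To see why the resource allocation matters, it may help to look at how HPA picks a replica count. The Kubernetes documentation describes the core rule as `desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)`, where CPU utilization is measured relative to the pod's CPU request. The sketch below reproduces only that rule and ignores tolerance and stabilization behavior. The numbers are made up for illustration and are not measurements from this repository.

```python
import math


def desired_replicas(current_replicas: int,
                     current_cpu_percent: float,
                     target_cpu_percent: float) -> int:
    """Core HPA scaling rule, without tolerance or stabilization windows."""
    if current_replicas <= 0 or target_cpu_percent <= 0:
        raise ValueError("replicas and target must be positive")
    return math.ceil(current_replicas * current_cpu_percent / target_cpu_percent)


# Hypothetical example: 10 workers at 90% of their CPU request, target 60%.
print(desired_replicas(10, 90, 60))  # 15
```

Without a CPU request on the pod, the utilization percentage cannot be computed, so HPA has nothing to scale on.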
