---
title: "Taming Performance Testing with qDup"
date: 2025-07-16T00:00:00Z
categories: ['performance', 'methodology', 'CI/CD', 'automation']
summary: 'This is the story of how we tamed cross-team performance testing by leveraging a tool called qDup to streamline complex test automation setups'
image: 'superheroes-automation-banner.png'
related: ['']
authors:
  - Andrea Lamparelli
---

= Taming Performance Testing with qDup
:icons: font

At scale, performance engineering hinges on reproducibility and maintainability.
In large organizations, where multiple teams test the same product across varied environments,
it’s like trying to play a symphony with different orchestras: unless everyone follows
the same score, you'll never hear the same music.
That was the challenge we faced when we kicked off the "superheroes-automation"
footnote:automation[https://github.com/RedHatPerf/superheroes-automation]
project, an ambitious cross-organization initiative involving both Red Hat and IBM.
Our goal was simple, yet daunting: to create a fair and consistent performance testing
setup for comparing different Java technologies. We needed a setup that was not only robust
but also perfectly reproducible, ensuring that every person involved, regardless of their
team or environment, would run the exact same tests in the exact same way. This is the
story of how we tamed the performance testing beast and achieved true reproducibility
by leveraging qDup footnote:qdup[https://github.com/Hyperfoil/qDup].

== The Challenge

Our initiative brought together talented engineers from across two organizations, but with
that came a diverse array of local development environments, preferred tools, and ingrained
testing habits. This diversity, which is usually a strength, quickly became our biggest hurdle.

Our primary struggle in the early stages was the stark inability to reproduce the same results
between teams. We would run what we thought was the same performance test, only to get
frustratingly different numbers. With each team having its own scripts and manual steps to
spin up the environment, subtle but significant variations were inevitable. Were the results
different because of the technology we were testing, or because one team's database was
configured slightly differently? The data was noisy, making any meaningful comparison impossible.
We were spending more time debugging our setups than analyzing performance.

The inherent complexity of our chosen application, Quarkus Superheroes
footnote:[https://github.com/quarkusio/quarkus-super-heroes], required an automated
setup. This was the only way to guarantee a uniform configuration for every user and ensure the
solution was portable enough to function reliably across diverse environments – ensuring everyone
was actually testing the same thing.

== The Orchestrator of Our Performance Symphony

Faced with these challenges, we needed a powerful orchestrator. As the maintainers and developers
behind qDup footnote:qdup[], we knew we had the perfect tool for the job. In fact, we built qDup
precisely to solve these kinds of complex automation problems that rely heavily on shell scripting.

=== What is qDup?

At its heart, qDup is a powerful orchestrator for scripting. It’s not a new, proprietary programming
language you have to learn. Instead, it takes the shell commands and scripts your teams already know
and use, and it gives them superpowers. It allows you to structure your automation, manage state
between different scripts, and coordinate their execution across multiple servers.

It is designed to follow the same workflow as a user at a terminal, so that commands can be performed
with or without qDup. Commands are grouped into re-usable scripts that are mapped to different hosts
by roles. Everything is defined in simple YAML files, very similar to Ansible playbooks.

qDup has 3 pre-defined phases for script execution to follow the usual performance test workflow:
_setup_, _run_, and _cleanup_.
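
To make that concrete, here is a minimal, purely illustrative qDup file (not taken from the
superheroes-automation repository): plain shell commands are grouped into reusable scripts,
hosts get logical names, and roles map those scripts onto hosts for the three phases. The
script names, commands, and host address are all hypothetical.

[source,yaml]
----
scripts:
  prepare-app:                     # hypothetical setup script
  - sh: git clone https://github.com/quarkusio/quarkus-super-heroes.git
  run-load:                        # hypothetical run script
  - sh: ./run-load.sh              # any shell command or script works here
  collect-results:                 # hypothetical cleanup script
  - sh: tar -czf results.tar.gz ./results

hosts:
  server: user@my-test-server      # placeholder host definition

roles:
  benchmark:
    hosts:
    - server
    setup-scripts:
    - prepare-app
    run-scripts:
    - run-load
    cleanup-scripts:
    - collect-results
----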

=== How is qDup solving the challenge?

We chose to use and continue to develop qDup because it is founded on principles that directly
address the challenges of collaborative automation, especially in a bash-centric world:

* *It embraces bash; it doesn't replace it*: This is the most crucial design philosophy behind
qDup. We know that most automation is built on the foundation of shell scripts. So, instead of
forcing teams to learn a new language, qDup allows them to leverage their existing skills and
scripts. This dramatically lowers the barrier to entry and accelerates adoption.

* *Declarative YAML as the single source of truth*: We believe automation workflows should be
easy to read and version. By using YAML, we provide a declarative way to define the entire process.
This file, when checked into Git, becomes the undeniable source of truth, ending any debate about
how a test should be run. The workflows can be split into several files to improve readability
and maintainability.

* *Guaranteed consistency is a core tenet*: The fundamental promise of qDup is to guarantee that
every user runs the exact same commands in the exact same order. This isn't just a feature; it's
the core reason the tool exists. It’s our definitive solution to the "it works on my machine" problem.

* *Orchestration is built-in, not an afterthought*: From day one, qDup was designed to handle
multi-machine workflows. For a realistic, complex application like Quarkus Superheroes, this is
non-negotiable. Defining roles (likely running on different machines) and coordinating tasks/scripts
between them is a native feature, not a bolted-on hack.

Ultimately, we built qDup to transform complex, error-prone manual setups into a single, reliable command.
Turning a page-long README file into `qdup qdup.yaml` is precisely the empowerment we aim to provide
to developers and testers.

== Our qDup Automation

Theory is great, but the real test is in the implementation. Adopting qDup wasn't just about choosing
a tool; it was about embracing a new, structured way of thinking about our automation, built on the
back of simple, powerful bash scripts.

We structured our entire performance testing workflow around qDup's three distinct phases. This brought
a clean and predictable order to our process, making it easy for anyone on the team to understand.

[start=1]
. *Setup Phase*: This was the workhorse. Before any test could run, this phase would execute a series
of our bash scripts to:

.. Build the correct Quarkus Superheroes (either native, OpenJDK, or Semeru Runtimes, depending on the test).
This step is optional, as the automation can also consume published artifacts.

.. Start all the necessary services for the Quarkus app to run, e.g., databases, registries, etc.

.. After that, start up all the Quarkus Superheroes microservices in the correct dependency order.

. *Run Phase*: With everything perfectly in place, this phase had one job: execute a performance test
and monitor the running application, i.e., what we will refer to as the System Under Test, or SUT.

.. One role was responsible for triggering our Hyperfoil benchmark against the SUT; the specific benchmark
configuration could be provided as a parameter of the execution.

.. A different role was responsible for starting up all the profiling tools meant to monitor the
SUT behavior, i.e., capturing additional data for further analysis such as CPU usage, memory footprint,
flame graphs, etc.

. *Cleanup Phase*: Just as important as the setup, this phase would gracefully stop the application,
shut down the database, and—most critically—run scripts to collect all the necessary logs and performance
metrics from the various machines. It conveniently places all results in a local directory, ready for
immediate analysis (one way this collection step could be expressed in qDup is sketched below).
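
As an illustration of that last step, qDup provides commands for pulling files back from remote
hosts. The following is only a sketch under our own assumptions (the script name, paths, and the
use of qDup's `queue-download` command are ours, not the repository's actual cleanup scripts):

[source,yaml]
----
scripts:
  export-logs:                                  # hypothetical cleanup script
  - sh: mkdir -p /tmp/run-logs && cp /opt/superheroes/logs/*.log /tmp/run-logs/
  - queue-download: /tmp/run-logs               # pulled back into the local run output directory
----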

The overall automation was implemented so that all services were sufficiently isolated from one another,
e.g., ensuring that the load drivers and data sources ran on different machines from the SUT, in order
to obtain reliable results and avoid other tools affecting the SUT's performance.

=== A glimpse into the configuration

The entry point of the superheroes automation is the root "qdup.yaml" file, which defines which scripts
to run, and where and when to run them.

[source,yaml]
----
hosts:
  sut: ${{SUT_SERVER}}
  datasource: ${{DS_SERVER}}
  driver: ${{LOAD_DRIVER_SERVER}}

roles:
  datasource:
    hosts:
    - datasource
    setup-scripts:
    - infer-datasource-hostnames
    - prepare-superheroes
    - start-jaeger
    - start-otel
    - start-heroes-db
    - start-villains-db
    - start-locations-db
    - start-fights-db
    - start-apicurio
    - start-kafka
    cleanup-scripts:
    - cleanup-datasources

  sut:
    hosts:
    - sut
    setup-scripts:
    - start-jit-server
    - prepare-images # should be exposed by script files in /modes folder
    - infer-datasource-hostnames
    - infer-services-hostnames
    - start-heroes-rest
    - start-villains-rest
    - start-locations-grpc
    - start-fights-rest
    - start-fights-ui
    cleanup-scripts:
    - cleanup-superheroes

  # all these scripts must be exposed by script files in /drivers folder
  driver:
    hosts:
    - driver
    setup-scripts:
    - setup-driver
    run-scripts:
    - run-benchmark
    cleanup-scripts:
    - cleanup-driver

  profiler:
    hosts:
    - sut
    setup-scripts:
    - app-prof-setup
    run-scripts:
    - run-pidstat
    - run-vmstat
    - run-mpstat
    - run-pmap
    - run-strace
    - run-perfstat
    cleanup-scripts:
    - export-metrics
    - cleanup-profiling
----

Let's break down our qDup script definition file:

The `hosts` section defines the logical names for the physical or virtual machines involved in our test. We use variables like `${{SUT_SERVER}}` so we can easily point to different servers without changing the script itself; a sketch of how these placeholders can be given values follows the list below.

* `sut`: This is our System Under Test, the application we are benchmarking, i.e., the superheroes services.

* `datasource`: This machine hosts all our backend dependencies, like databases and message queues.

* `driver`: This is where the load driver is executed; keeping it on a separate machine ensures the load driver itself won't affect the SUT's performance.
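
For completeness, here is one way (an illustrative sketch, not the repository's actual configuration)
those `${{...}}` placeholders could be given concrete values, assuming qDup's top-level `states`
section; the host addresses are placeholders:

[source,yaml]
----
states:
  SUT_SERVER: user@sut-machine            # placeholder addresses
  DS_SERVER: user@datasource-machine
  LOAD_DRIVER_SERVER: user@driver-machine
----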

The `roles` section describes the responsibilities of each component in the test. Each `role` is assigned to one or more hosts and has scripts defined for three distinct phases: `setup`, `run`, and `cleanup`, which are executed in exactly this order. The big difference among these three phases is how the scripts are executed: in the `setup` and `cleanup` phases all scripts run sequentially, in the same order as they are specified, whereas the scripts defined in the `run` phase are all executed asynchronously, meaning they are all started at the same time.

When you have to deal with multiple concurrent scripts, possibly running on different hosts, coordinating them can become very tricky. This is where qDup comes in with a great feature: **signals**. Signals are a way to coordinate different scripts: you can use the `wait-for: MY_SIGNAL` command to block the execution of a script until another script raises that signal, e.g., with `signal: MY_SIGNAL`.
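
For example, the following sketch shows the pattern this enables (the script bodies here are our own
hypothetical illustrations, not the repository's actual implementations): the database script raises
a signal once its database is reachable, and the dependent service script blocks on that signal
before starting:

[source,yaml]
----
scripts:
  start-heroes-db:                        # runs on the datasource host
  - sh: podman run -d --name heroes-db postgres:16
  - sh: ./wait-until-db-accepts-connections.sh   # hypothetical readiness check
  - signal: HEROES_DB_READY               # announce that the database is up

  start-heroes-rest:                      # runs on the sut host
  - wait-for: HEROES_DB_READY             # don't start the service before its database
  - sh: ./start-heroes-service.sh         # hypothetical startup script
----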

The **datasource** role

This role runs on the `datasource` host and is responsible for setting up the entire backend infrastructure needed by our application.

* setup-scripts: This is a comprehensive list of tasks that brings our backend to life. It starts observability tools like Jaeger and OpenTelemetry, multiple databases (heroes, villains, locations, fights), a Kafka message broker, and an Apicurio schema registry. All these backend datasources are required to properly start the Superheroes application.

* cleanup-scripts: After the test, it runs a single script to cleanly shut down and remove all the services it started.

The **sut** role

This role, assigned to the `sut` host, manages the application we are actually testing.

* setup-scripts: These scripts prepare and launch our microservices application. This includes starting REST APIs for heroes, villains, and fights, a gRPC service for locations, and a user interface (optional). Crucially, scripts like `infer-datasource-hostnames` ensure our application knows how to connect to the backend services set up by the `datasource` role.

* cleanup-scripts: Tears down all the application components.

The **driver** role

This role runs on the `driver` host and is the active participant that executes the benchmark.

* setup-scripts: Prepares the load generation tools and environment.

* run-scripts: This is the heart of the test. The `run-benchmark` script is executed during the "run" phase of the qDup lifecycle, running the specified benchmark (e.g., using the Hyperfoil footnote:[https://github.com/Hyperfoil/Hyperfoil] load driver). A sketch of what such a script could look like follows below.

* cleanup-scripts: Cleans up the driver machine after the test is complete.
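
As with the earlier examples, this is only an illustrative sketch under our own assumptions (the
wrapper script, the state variable, and the signal name are hypothetical, not the repository's
actual `run-benchmark` implementation):

[source,yaml]
----
scripts:
  run-benchmark:
  - sh: ./run-hyperfoil-benchmark.sh ${{BENCHMARK_CONFIG}}   # hypothetical wrapper around the Hyperfoil CLI
  - signal: BENCHMARK_DONE                                   # lets the profiling scripts stop cleanly
----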

The **profiler** role

This is a special role that runs on the same host as our `sut`. Its sole purpose is to gather detailed performance metrics directly from the application server while the test is running.

* setup-scripts: Prepares the necessary profiling tools on the `sut` machine.

* run-scripts: These scripts run in parallel with the driver's `run-benchmark` script. We use a suite of powerful Linux utilities like `pidstat`, `vmstat`, `mpstat`, and `perf stat` to capture CPU, memory, I/O, and process-level activity, as well as `async-profiler` for Java application profiling. A simplified sketch of one such script follows below.

* cleanup-scripts: Once the benchmark is over, these scripts collate all the collected data in an `export-metrics` step and then clean up the profiling tools.
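
For illustration only (the commands, file paths, and the `BENCHMARK_DONE` signal are our own
assumptions, not the repository's actual scripts), a profiling run-script can follow the same
signal-driven pattern: sample in the background while the load runs, then stop once the benchmark
signals that it has finished:

[source,yaml]
----
scripts:
  run-vmstat:
  - sh: vmstat 5 > /tmp/vmstat.log &      # sample system metrics every 5 seconds in the background
  - wait-for: BENCHMARK_DONE              # raised by the driver's benchmark script when the load ends
  - sh: pkill vmstat                      # stop sampling; the log is collected during cleanup
----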

=== How do you run this?

Running `qDup` is as easy as executing a Java JAR file:

[source,bash]
----
$ java -jar /path/to/qDup-0.9.0-uber.jar [... all files] qdup.yaml
----

You just need to include all the files required to properly execute the automation.
The tool will complain if something is missing, e.g., if you reference a script
that is not defined anywhere.

The process of running `qDup` has been simplified further by making use of jbang
footnote:[https://www.jbang.dev/], a tool that lets you run self-contained source-only
Java programs with unprecedented ease.

You don't need to download the uber JAR anymore; simply run the following command
and jbang will take care of downloading whatever it needs to properly run `qDup`.

[source,bash]
----
$ jbang qDup@hyperfoil [... all files] qdup.yaml
----

== From Performance Pains to Gains

Adopting `qDup` was about more than just implementing a new tool; it was about transforming our entire
approach to performance testing. The "pains" of our initial phase, filled with inconsistent data
and setup friction, quickly turned into significant "gains" for the project and everyone involved.

=== Finally, truly reproducible results

The single most important outcome was achieving our primary goal: *consistent and comparable performance
test results*. The "noise" created by dozens of different manual setups was gone. When we saw a difference
in the numbers, we knew it was because of the technology we were testing, not because someone had a
different configuration. We could now confidently compare the performance of the native Quarkus Superheroes
application against its JVM counterpart, knowing we were looking at a true signal, not random variations.
The data was clean, reliable, and trustworthy.

=== Collaboration reimagined

With a standardized setup defined in our Git repository, the dynamic between teams shifted dramatically.
The conversations were no longer about "why are my results different from yours?" Instead, they became
about "how can we improve our shared testing process?"

If someone wanted to tweak a benchmark or add a new setup step, they would simply submit a pull request
against the superheroes-automation repository. This made collaboration transparent and efficient.
We were no longer debugging individual environments; we were collectively improving a shared, automated
asset that benefited everyone.

=== A new level of confidence

Ultimately, qDup gave us unwavering confidence in our test results. Every number we presented in our
findings was backed by an automated, version-controlled, and highly consistent environment. We weren't just
sharing data; we were sharing data with a clear and verifiable history. This level of rigor meant we could
stand behind our conclusions with certainty when making strategic recommendations based on performance
outcomes. It’s the difference between guessing and knowing.

== Conclusion & Future Work

The journey to tackle our cross-organizational performance testing was a formidable challenge, born from the
chaos of inconsistent environments and irreproducible results. By embracing qDup, we did more than just adopt
a new tool; we adopted a new philosophy for automation. We transformed a complex, error-prone manual process
into a single, reliable command that delivered the consistency we desperately needed.

To summarize our key achievements:

* We achieved true reproducibility: By establishing a single, version-controlled source of truth, qDup eliminated the "it works on my machine" problem. We can now trust our data, knowing that any performance variation comes from the technology under test, not the setup.

* We transformed collaboration: The conversation shifted from debugging individual setups to collectively improving a shared asset. The pull request model for automation changes fostered a transparent and efficient workflow, making every team member a stakeholder in the quality of our testing.

* We gained unwavering confidence: With a robust and consistent foundation, we can now stand behind our performance numbers with certainty, enabling us to make informed, data-driven decisions.

While our current setup has proven immensely successful, our work is far from over. We are committed to refining this project and empowering others to achieve the same results. Our future efforts will likely focus on:

* Improving documentation: We plan to enhance our documentation with detailed guides and tutorials. Our goal is to make it easier for new teams and contributors to understand our automation patterns and get up and running quickly, lowering the barrier to entry for robust performance testing.

* Publishing reusable scripts: Many of the scripts we've written for tasks like setting up databases, configuring profilers, or launching services are not specific to the Superheroes application. We intend to extract these common, battle-tested scripts into a shareable library. By making these components available to everyone, we hope to reduce duplication and allow others to assemble their own complex qDup workflows with greater speed and reliability.