
Conversation

@eduardorittner

Adds a new configuration option inside "slow-timeout": "on-timeout", which, when set to "pass", makes nextest treat timed-out tests as passing.
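
For illustration, the new key sits alongside the existing slow-timeout fields in a profile. A minimal config sketch (the period, terminate-after, and grace-period values are just example settings, not part of this PR):

[profile.default]
slow-timeout = { period = "60s", terminate-after = 2, grace-period = "10s", on-timeout = "pass" }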

This PR adds a new SlowTimeoutResult struct similar to LeakTimeoutResult which characterizes a timeout as passing or failing.

There are some things that still need to be addressed, most of them marked with TODOs in the code; these should all be fixed before merging. Most of them have to do with how to display this information, since we now have a new category of tests to display: tests that passed with a timeout. Any opinions here would be great!
There are some failing tests due to the formatting changes. I decided not to accept the snapshots just yet, since I'm not sure this is the final formatting we'll end up with.

P.S.: I noticed nextest follows atomic commits. I figured a single commit would be enough for this change, but I can split it up into smaller commits if that's deemed better.

Closes #1291

@eduardorittner
Author

Just confirmed that the failing tests in CI are these three, which are snapshot tests for the formatting output. All other tests are passing:

nextest-runner reporter::displayer::imp::tests::test_info_response
nextest-runner reporter::displayer::imp::tests::test_summary_line
nextest-runner reporter::displayer::progress::tests::progress_str_snapshots

@codecov

codecov bot commented Nov 5, 2025

Codecov Report

❌ Patch coverage is 72.00000% with 14 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.84%. Comparing base (95e058f) to head (42f92f1).
⚠️ Report is 20 commits behind head on main.

Files with missing lines Patch % Lines
nextest-runner/src/reporter/aggregator/junit.rs 0.00% 6 Missing ⚠️
nextest-runner/src/reporter/displayer/imp.rs 61.53% 5 Missing ⚠️
nextest-runner/src/reporter/structured/libtest.rs 0.00% 2 Missing ⚠️
nextest-runner/src/reporter/events.rs 80.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2742      +/-   ##
==========================================
+ Coverage   77.41%   77.84%   +0.43%     
==========================================
  Files         112      112              
  Lines       26155    26205      +50     
==========================================
+ Hits        20247    20400     +153     
+ Misses       5908     5805     -103     

Member

@sunshowers sunshowers left a comment

Thanks. One commit is fine for this change (though there's one particular thing in this PR that should be in a followup).

period: far_future_duration(),
terminate_after: None,
grace_period: Duration::from_secs(10),
timeout_result: SlowTimeoutResult::Fail,
Member

An inconsistency here -- we use SlowTimeoutResult::Fail here but default below. Let's pick one or the other -- I'm partial to using Fail and removing the Default impl, unless there's a good reason to retain it.

Author

The only issue with not having a SlowTimeoutResult::default is that, AFAICT (I'm not that familiar with serde), the #[serde(default)] line in SlowTimeout's definition would have to become something like #[serde(default = "default_result")], where default_result is a custom function that returns the default value for SlowTimeoutResult. At that point I think it's better to just keep the derived default and change this usage from SlowTimeoutResult::Fail to default.

From

pub struct SlowTimeout {
    // Omitted fields
    #[serde(default)]
    pub(crate) on_timeout: SlowTimeoutResult,
}

To

pub struct SlowTimeout {
    // Omitted fields
    #[serde(default = "default_on_timeout")]
    pub(crate) on_timeout: SlowTimeoutResult,
}
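
(For reference, the custom function serde would call here is just a small free function that returns the fallback value -- a sketch of the hypothetical default_on_timeout helper, not code from the PR:)

fn default_on_timeout() -> SlowTimeoutResult {
    SlowTimeoutResult::Fail
}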

Author

@eduardorittner eduardorittner Nov 6, 2025

I just noticed something: since VERY_LARGE is const, I can't actually call SlowTimeoutResult::default() there because it's not a const fn, so now I'm unsure what to do:

  1. Keep #[derive(Default)] in SlowTimeoutResult, use SlowTimeoutResult::default in non-const contexts and SlowTimeoutResult::Fail in const contexts.
  2. Keep #[derive(Default)] just for #[serde(default)], use SlowTimeoutResult::Fail everywhere else
  3. Remove #[derive(Default)], add a pub const fn default() -> Self which can be used everywhere. This does mean that instead of #[serde(default)] we'll have #[serde(default = "SlowTimeoutResult::default")] in SlowTimeout's definition.

Option 1 doesn't seem that great to me; I think I prefer option 3, but option 2 is also fine.
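
For concreteness, option 3 might look roughly like this (a sketch of the shape only, not necessarily what ends up merged):

impl SlowTimeoutResult {
    /// The default result; a const fn so it's also usable in const contexts like VERY_LARGE.
    pub const fn default() -> Self {
        SlowTimeoutResult::Fail
    }
}

pub struct SlowTimeout {
    // Omitted fields
    #[serde(default = "SlowTimeoutResult::default")]
    pub(crate) on_timeout: SlowTimeoutResult,
}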

/// The test is marked as failed.
Fail,

/// The test is marked as passed.
Member

I'd put in an explanation here because "the test is marked as passed" is pretty strange on a timeout being hit. Just talk about the fact that there are some use cases that benefit from being run up to a time limit.

Author

What do you think of something like this?

/// The result of controlling slow timeout behavior.
///
/// For most situations, a timed out test should be marked failing. However, there are certain
/// classes of tests which are expected to run indefinitely long (or just too long), like fuzzing
/// and property tests, which both explore a huge state space. For these tests it's nice to be able
/// to treat a timeout as a success, since they generally check for invariants and other properties
/// of the code under test. A timeout in this context doesn't necessarily mean that there are no
/// failing inputs, just that they weren't found up until that moment, which is still valuable
/// information.
#[derive(Clone, Copy, Debug, Deserialize, PartialEq, Eq)]
#[serde(rename_all = "kebab-case")]
pub enum SlowTimeoutResult {
    /// The test is marked as failed.
    Fail,

    /// The test is marked as passed.
    Pass,
}

Member

Thanks. This is pretty good. Some notes:

  • It's "property-based tests" -- "property testing" has a different meaning.
  • Generally PBT tends to run for a limited number of iterations -- the confidence in them comes from the fact that they get run in CI all the time. So I'd drop the note about them, and instead just say fuzzing.
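
A revision along those lines might read (a sketch incorporating the notes above, not necessarily the final wording):

/// Whether a test that hits the slow timeout is treated as a failure or a success.
///
/// For most situations, a timed out test should be marked as failed. However, some
/// classes of tests, such as fuzz tests, explore a huge state space and are expected
/// to run for as long as they're allowed to. For these tests it can be useful to treat
/// hitting the time limit as a success: a timeout doesn't mean there are no failing
/// inputs, only that none were found within the allotted time, which is still valuable
/// information.
#[derive(Clone, Copy, Debug, Deserialize, PartialEq, Eq)]
#[serde(rename_all = "kebab-case")]
pub enum SlowTimeoutResult {
    /// The test is marked as failed.
    Fail,

    /// The test is marked as passed.
    Pass,
}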

ExecutionResult::Timeout {
result: SlowTimeoutResult::Fail,
} => {
self.failed += 1;
Member

Note this is a logic change and makes the failed_count function incorrect. Let's address this in a followup rather than changing the semantics here.

Author

I see, should I also move the changes to RunStats (adding passed_timed_out and renaming timed_out to failed_timed_out) into that followup? Or keep the struct changes in the first commit and address only the logic changes in a followup?

Member

The struct changes have to go in the first commit, I think. The semantic changes to the meaning of things like failed should go in the followup.

Member

Ah the failing tests are unfortunate -- can we split them up into another PR?

@eduardorittner eduardorittner force-pushed the on-timeout branch 5 times, most recently from c2d5c73 to c4e6773 on November 6, 2025 at 12:26
Adds a new configuration option inside "slow-timeout": "on-timeout",
which, when set to "pass", makes nextest treat timed-out tests as passing.
The logic for actually counting how many timed-out tests were treated as
successes will be added in a followup commit.
@eduardorittner
Author

So I've split the changes into two commits, where the second one only deals with the logic in reporter/events.rs for counting the number of failed tests, timed-out tests, etc. Some tests are now also failing because I made self.failed include self.failed_timed_out instead of keeping them disjoint, which broke a bunch of tests that relied on that behavior. I can revert this and make them disjoint again if that's better.

slow_timeout.grace_period,
).await;
status = Some(ExecutionResult::Timeout);
status = Some(ExecutionResult::Timeout {result: slow_timeout.on_timeout});
Member

rustfmt should have fixed this -- did that not work? (Oh maybe not because this is inside a tokio::select! arm, ugh.)

Author

Would the "correct" formatting be this?

status = Some(ExecutionResult::Timeout { result: slow_timeout.on_timeout });

Member

Yeah, either that or across multiple lines:

status = Some(ExecutionResult::Timeout {
    result: slow_timeout.on_timeout,
});

Since timed out tests can now be treated as either successes or
failures, update the logic for handling them. Note that previously
`self.failed` and `self.timed_out` were separate, but now `self.failed`
includes `self.failed_timed_out`.
@eduardorittner
Author

Tests are now passing. I changed the logic in events.rs so that self.failed and self.failed_timed_out are disjoint, and self.failed_count() is now:

pub fn failed_count(&self) -> usize {
    self.failed + self.exec_failed + self.failed_timed_out
}

I haven't added any new tests, but I'll get started on that now that all the existing tests are passing.
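
For reference, one way the new arms in the run-stats accounting could be wired so the counters stay disjoint (a sketch using the field names discussed above; the actual match in reporter/events.rs may differ):

ExecutionResult::Timeout {
    result: SlowTimeoutResult::Pass,
} => {
    // Assumption: tests that time out but are configured to pass get their own counter.
    self.passed_timed_out += 1;
}
ExecutionResult::Timeout {
    result: SlowTimeoutResult::Fail,
} => {
    // Kept disjoint from `self.failed`; failed_count() sums failed, exec_failed, and failed_timed_out.
    self.failed_timed_out += 1;
}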

@sunshowers
Member

Looks great -- yeah, just missing new tests otherwise this is good to go.

@eduardorittner
Author

I have two questions regarding junit:

  1. Should passing timed out tests be reported as failures? I ask because in reporter/aggregator/junit.rs passing tests that leaked are reported as NonSuccessKind::Error: https://github.com/eduardorittner/nextest/blob/42f92f1ed6c33eba1965f93bdb862548764892b3/nextest-runner/src/reporter/aggregator/junit.rs#L353-L358

  2. How/where is junit support tested?

@sunshowers
Member

  1. Good question. I think it should be marked as passed.
  2. Uhmmm, there aren't good tests for deserializing JUnit reports. But there probably should be. I wouldn't worry about that, at some point I'll have Claude Code or something write those tests.

@sunshowers
Member

Uhmmm, there aren't good tests for deserializing JUnit reports. But there probably should be. I wouldn't worry about that, at some point I'll have Claude Code or something write those tests.

Did this in #2761.
