
Conversation

@eduardorittner

Adds a new configuration option inside "slow-timeout": "on-timeout", which, when set to "pass", makes nextest treat timed-out tests as passing.
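
For illustration, the new key sits alongside the existing slow-timeout fields in a profile. A minimal config sketch (the period, terminate-after, and grace-period values are just example settings, not part of this PR):

[profile.default]
slow-timeout = { period = "60s", terminate-after = 2, grace-period = "10s", on-timeout = "pass" }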

This PR adds a new SlowTimeoutResult struct similar to LeakTimeoutResult which characterizes a timeout as passing or failing.

There are some things that still need to be addressed, most of them marked with TODOs in the code; these should all be fixed before merging. Most of them have to do with how to display this information, since we now have a new category of tests to display: tests that passed with a timeout. Any opinions here would be great!
There are some failing tests due to the formatting changes. I decided not to accept the snapshots just yet, since I'm not sure this is the final formatting we'll end up with.

P.S.: I noticed nextest follows atomic commits. I figured a single commit would be enough for this change, but I can split it up into smaller commits if that's deemed better.

Closes #1291

@eduardorittner
Author

Just confirmed that the failing tests in CI are these three, which are snapshot tests for the formatting output. All other tests are passing:

nextest-runner reporter::displayer::imp::tests::test_info_response
nextest-runner reporter::displayer::imp::tests::test_summary_line
nextest-runner reporter::displayer::progress::tests::progress_str_snapshots

@codecov

codecov bot commented Nov 5, 2025

Codecov Report

❌ Patch coverage is 72.00000% with 14 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.84%. Comparing base (95e058f) to head (42f92f1).
⚠️ Report is 20 commits behind head on main.

Files with missing lines Patch % Lines
nextest-runner/src/reporter/aggregator/junit.rs 0.00% 6 Missing ⚠️
nextest-runner/src/reporter/displayer/imp.rs 61.53% 5 Missing ⚠️
nextest-runner/src/reporter/structured/libtest.rs 0.00% 2 Missing ⚠️
nextest-runner/src/reporter/events.rs 80.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2742      +/-   ##
==========================================
+ Coverage   77.41%   77.84%   +0.43%     
==========================================
  Files         112      112              
  Lines       26155    26205      +50     
==========================================
+ Hits        20247    20400     +153     
+ Misses       5908     5805     -103     

Member

@sunshowers sunshowers left a comment

Thanks. One commit is fine for this change (though there's one particular thing in this PR that should be in a followup).

period: far_future_duration(),
terminate_after: None,
grace_period: Duration::from_secs(10),
timeout_result: SlowTimeoutResult::Fail,
Member

An inconsistency here -- we use SlowTimeoutResult::Fail here but default below. Let's pick one or the other -- I'm partial to using Fail and removing the Default impl, unless there's a good reason to retain it.

Author

The only issue with not having a SlowTimeoutResult::default is that, AFAICT (I'm not that familiar with serde), the #[serde(default)] line in SlowTimeout's definition would have to become something like #[serde(default = "default_result")], where default_result is a custom function that returns the default value for SlowTimeoutResult. At that point I think it's better to just keep the derived default and change this usage from SlowTimeoutResult::Fail to default.

From

pub struct SlowTimeout {
    // Omitted fields
    #[serde(default)]
    pub(crate) on_timeout: SlowTimeoutResult,
}

To

pub struct SlowTimeout {
    // Omitted fields
    #[serde(default = "default_on_timeout")]
    pub(crate) on_timeout: SlowTimeoutResult,
}
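
(For reference, the custom function serde would call here is just a small free function that returns the fallback value -- a sketch of the hypothetical default_on_timeout helper, not code from the PR:)

fn default_on_timeout() -> SlowTimeoutResult {
    SlowTimeoutResult::Fail
}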

Author

@eduardorittner eduardorittner Nov 6, 2025

I just noticed something: since VERY_LARGE is const, I can't actually call SlowTimeoutResult::default() there because it's not a const fn, so now I'm unsure what to do:

  1. Keep #[derive(Default)] in SlowTimeoutResult, use SlowTimeoutResult::default in non-const contexts and SlowTimeoutResult::Fail in const contexts.
  2. Keep #[derive(Default)] just for #[serde(default)], use SlowTimeoutResult::Fail everywhere else
  3. Remove #[derive(Default)], add a pub const fn default() -> Self which can be used everywhere. This does mean that instead of #[serde(default)] we'll have #[serde(default = "SlowTimeoutResult::default")] in SlowTimeout's definition.

Option 1 doesn't seem that great to me; I think I prefer option 3, but option 2 is also fine.
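
For concreteness, option 3 might look roughly like this (a sketch of the shape only, not necessarily what ends up merged):

impl SlowTimeoutResult {
    /// The default result; a const fn so it's also usable in const contexts like VERY_LARGE.
    pub const fn default() -> Self {
        SlowTimeoutResult::Fail
    }
}

pub struct SlowTimeout {
    // Omitted fields
    #[serde(default = "SlowTimeoutResult::default")]
    pub(crate) on_timeout: SlowTimeoutResult,
}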

/// The test is marked as failed.
Fail,

/// The test is marked as passed.
Member

I'd put in an explanation here because "the test is marked as passed" is pretty strange on a timeout being hit. Just talk about the fact that there are some use cases that benefit from being run up to a time limit.

Author

What do you think of something like this?

/// The result of controlling slow timeout behavior.
///
/// For most situations, a timed out test should be marked failing. However, there are certain
/// classes of tests which are expected to run indefinitely long (or just too long), like fuzzing
/// and property tests, which both explore a huge state space. For these tests it's nice to be able
/// to treat a timeout as a success, since they generally check for invariants and other properties
/// of the code under test. A timeout in this context doesn't necessarily mean that there are no
/// failing inputs, just that they weren't found up until that moment, which is still valuable
/// information.
#[derive(Clone, Copy, Debug, Deserialize, PartialEq, Eq)]
#[serde(rename_all = "kebab-case")]
pub enum SlowTimeoutResult {
    /// The test is marked as failed.
    Fail,

    /// The test is marked as passed.
    Pass,
}

Member

Thanks. This is pretty good. Some notes:

  • It's "property-based tests" -- "property testing" has a different meaning.
  • Generally PBT tends to run for a limited number of iterations -- the confidence in them comes from the fact that they get run in CI all the time. So I'd drop the note about them, and instead just say fuzzing.
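
A revision along those lines might read (a sketch incorporating the notes above, not necessarily the final wording):

/// Whether a test that hits the slow timeout is treated as a failure or a success.
///
/// For most situations, a timed out test should be marked as failed. However, some
/// classes of tests, such as fuzz tests, explore a huge state space and are expected
/// to run for as long as they're allowed to. For these tests it can be useful to treat
/// hitting the time limit as a success: a timeout doesn't mean there are no failing
/// inputs, only that none were found within the allotted time, which is still valuable
/// information.
#[derive(Clone, Copy, Debug, Deserialize, PartialEq, Eq)]
#[serde(rename_all = "kebab-case")]
pub enum SlowTimeoutResult {
    /// The test is marked as failed.
    Fail,

    /// The test is marked as passed.
    Pass,
}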

ExecutionResult::Timeout {
result: SlowTimeoutResult::Fail,
} => {
self.failed += 1;
Member

Note this is a logic change and makes the failed_count function incorrect. Let's address this in a followup rather than changing the semantics here.

Author

I see, should I also move the changes to RunStats (adding passed_timed_out and renaming timed_out to failed_timed_out) into that followup? Or keep the struct changes in the first commit and address only the logic changes in a followup?

Member

The struct changes have to go in the first commit, I think. The semantic changes to the meaning of things like failed should go in the followup.

Member

Ah the failing tests are unfortunate -- can we split them up into another PR?

@eduardorittner eduardorittner force-pushed the on-timeout branch 5 times, most recently from c2d5c73 to c4e6773 on November 6, 2025 at 12:26
Adds a new configuration option inside "slow-timeout": "on-timeout",
which, when set to "pass", makes nextest treat timed-out tests as passing.
The logic for actually counting how many timed-out tests were treated as
successes will be added in a followup commit.
@eduardorittner
Author

So I've split the changes into two commits, where the second one only deals with the logic in reporter/events.rs for counting the number of failed tests, timed-out tests, etc. Some tests are now also failing because I made self.failed include self.failed_timed_out instead of keeping them disjoint, which broke a bunch of tests that relied on that behavior. I can revert this and make them disjoint again if that's better.

slow_timeout.grace_period,
).await;
status = Some(ExecutionResult::Timeout);
status = Some(ExecutionResult::Timeout {result: slow_timeout.on_timeout});
Member

rustfmt should have fixed this -- did that not work? (Oh maybe not because this is inside a tokio::select! arm, ugh.)

Author

Would the "correct" formatting be this?

status = Some(ExecutionResult::Timeout { result: slow_timeout.on_timeout });

Member

Yeah, either that or across multiple lines:

status = Some(ExecutionResult::Timeout {
    result: slow_timeout.on_timeout,
});

Since timed out tests can now be treated as either successes or
failures, update the logic for handling them. Note that previously
`self.failed` and `self.timed_out` were separate, but now `self.failed`
includes `self.failed_timed_out`.
@eduardorittner
Author

Tests are now passing. I changed the logic in events.rs so that self.failed and self.failed_timed_out are disjoint, and self.failed_count() is now:

pub fn failed_count(&self) -> usize {
    self.failed + self.exec_failed + self.failed_timed_out
}

I haven't added any new tests, but I'll get started on that now that all the existing tests are passing.
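
For reference, one way the new arms in the run-stats accounting could be wired so the counters stay disjoint (a sketch using the field names discussed above; the actual match in reporter/events.rs may differ):

ExecutionResult::Timeout {
    result: SlowTimeoutResult::Pass,
} => {
    // Assumption: tests that time out but are configured to pass get their own counter.
    self.passed_timed_out += 1;
}
ExecutionResult::Timeout {
    result: SlowTimeoutResult::Fail,
} => {
    // Kept disjoint from `self.failed`; failed_count() sums failed, exec_failed, and failed_timed_out.
    self.failed_timed_out += 1;
}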

@sunshowers
Member

Looks great -- yeah, just missing new tests otherwise this is good to go.

@eduardorittner
Author

I have two questions regarding junit:

  1. Should passing timed out tests be reported as failures? I ask because in reporter/aggregator/junit.rs passing tests that leaked are reported as NonSuccessKind::Error: https://github.com/eduardorittner/nextest/blob/42f92f1ed6c33eba1965f93bdb862548764892b3/nextest-runner/src/reporter/aggregator/junit.rs#L353-L358

  2. How/where is junit support tested?

@sunshowers
Member

  1. Good question. I think it should be marked as passed.
  2. Uhmmm, there aren't good tests for deserializing JUnit reports. But there probably should be. I wouldn't worry about that, at some point I'll have Claude Code or something write those tests.

@sunshowers
Member

Uhmmm, there aren't good tests for deserializing JUnit reports. But there probably should be. I wouldn't worry about that, at some point I'll have Claude Code or something write those tests.

Did this in #2761.
