@tatsuhiro-t
Contributor

Previously, the compliance check searched stdout and stderr for the
strings "exited with code 127" or "exit status 127". It turns out that
these strings might not be present in rare cases. To work around
this, this commit directly checks the exit code of the relevant
container with the docker-compose --exit-code-from flag. The flag
implies --abort-on-container-exit.
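
As a rough illustration of the idea (not the actual patch), the check could look like the sketch below. It assumes, as stated later in this conversation, that a compliant implementation exits with code 127. The helper names `build_compliance_cmd` and `check_compliance`, the `extra_env` parameter, and the injectable `run` argument are hypothetical and exist only for this example.

```python
import os
import subprocess

# Exit code a compliant implementation is expected to return,
# per the discussion in this PR.
UNSUPPORTED_EXIT_CODE = 127


def build_compliance_cmd(service):
    """Build the docker-compose command for the given service.

    --exit-code-from makes docker-compose return that service's exit code
    (and implies --abort-on-container-exit).
    """
    return [
        "docker-compose", "up",
        "--timeout", "0",
        "--exit-code-from", service,
        "-V", "sim", service,
    ]


def check_compliance(service, extra_env, run=subprocess.run):
    """Return True if the service's container exited with code 127.

    `run` is injectable (hypothetical convenience) so the logic can be
    exercised without Docker.
    """
    env = {**os.environ, **extra_env}
    proc = run(
        build_compliance_cmd(service),
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
    )
    # Rely on the exit code itself instead of searching the output for
    # "exited with code 127" or "exit status 127".
    return proc.returncode == UNSUPPORTED_EXIT_CODE
```

A call might look like `check_compliance("client", {"CLIENT": image, "SCENARIO": scenario})`, where `image` and `scenario` stand in for the values the runner already passes through the environment.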

When running the actual test case, the interop runner checks whether
the test case is supported by an implementation. The same method
cannot be applied there because we only get an exit code from a single
service. However, the downside of not detecting an unsupported test
case is not severe: it just results in a failed test. In contrast, a
failed compliance check skips all test cases for the particular client
and server combination.

Collaborator

@marten-seemann left a comment

Why would the string not be present in rare cases?

  'SCENARIO="simple-p2p --delay=15ms --bandwidth=10Mbps --queue=25" '
  "CLIENT=" + self._implementations[name]["image"] + " "
- "docker-compose up --timeout 0 --abort-on-container-exit -V sim client"
+ "docker-compose up --timeout 0 --exit-code-from client -V sim client"
Collaborator

We’re starting multiple containers, and we can’t know which one exits first.

Contributor Author

Why would the sim container exit before the client container? I think a compliant client exits first with exit code 127, and then docker-compose kills sim.

@tatsuhiro-t
Contributor Author

> Why would the string not be present in rare cases?

I do not know. docker-compose might be doing some funny stuff, and/or it could be due to a race condition.

@tatsuhiro-t
Contributor Author

This issue showed up in the following jobs of a recent run:

https://github.com/marten-seemann/quic-interop-runner/actions/runs/3169239332/jobs/5161033391

Saving logs to logs.
ngtcp2 server not compliant.
Not compliant, skipping
Run took 0:00:06.716867
+------+--------+
|      | ngtcp2 |
+------+--------+
| neqo |        |
|      |        |
|      |        |
+------+--------+

https://github.com/marten-seemann/quic-interop-runner/actions/runs/3169239332/jobs/5161033427

Saving logs to logs.
ngtcp2 server not compliant.
Not compliant, skipping
Run took 0:00:05.832533
+--------+--------+
|        | ngtcp2 |
+--------+--------+
| msquic |        |
|        |        |
|        |        |
+--------+--------+

But the ngtcp2 server is fully compliant in the other combinations.
This also happens with other implementations and is not specific to the ngtcp2 server.

@marten-seemann force-pushed the master branch 2 times, most recently from 365141c to e73ec56 on December 19, 2022 07:59
@larseggert
Contributor

I see the same issue when I run locally: spurious "non-compliant" errors that usually go away on the next run.

@marten-seemann
Collaborator

@larseggert Does this PR fix the problem?
