-
Notifications
You must be signed in to change notification settings - Fork 314
fix(iroh)!: Correct the error structure #3663
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
For some odd reason AuthenticationError was given a branch that had quinn::ConnectionError inside it. But logically the connection error has nothing to do with authentication. That should have been a red flag. AuthenticationError itself is almost always wrapped in ConnectingError, which does correctly have a quinn::ConnectionError branch. And the few places where it was directly returned to the user it arguably **should** have been wrapped in a ConnectingError. The result of this is that before this fix you would get a very confusing authentication error if the remote client closed the connection right at the same time as the handshake completed for it (yes, this is difficult to do at the right time, and it only happens for the client since that completes the handshake one network hop before the server). But this was no authentication error, it is simply a closed connection. The new error structure captures this correctly. Similarly the InternalConsistencyError belongs on the ConnectingError. Though that one should be impossible to produce since it's supposed to be an invariant.
| // If the authentication error raced with a connection error, the connection | ||
| // error wins. | ||
| if let Some(conn_err) = conn.close_reason() { | ||
| return Err(e!(ConnectingError::ConnectionError { source: conn_err })); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, so this actually fixes the racy thing I had. It took me a while to figure out why. But basically what happens is that because the handshake does complete on the client before the connection is closed, we never get the auth_err here, because all the info is available. So it never checks on whether the connection is already closes. Which in turn apparently means you just get a closed Connection. Which is pretty neat.
I don't think this means this branch can be removed though, because if the client closes before it completes the handshake you can get here again.
ramfox
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great! Fixes a panic and makes the error reporting much more clear!
|
Ah, yes to answer your question: yes I think this should be made against main. |
I'm retracting this offer and will push it here only. Turns out main doesn't have the fallible EndpointStateActor::register_connection yet, and I don't really want to deal with losing that and then having to re-apply it when merging main into here. |
|
Documentation for this PR has been generated and is available at: https://n0-computer.github.io/iroh/pr/3663/docs/iroh/ Last updated: 2025-11-14T09:14:12Z |
Description
For some odd reason AuthenticationError was given a branch that had
quinn::ConnectionError inside it. But logically the connection error
has nothing to do with authentication. That should have been a red
flag.
AuthenticationError itself is almost always wrapped in
ConnectingError, which does correctly have a quinn::ConnectionError
branch. And the few places where it was directly returned to the user
it arguably should have been wrapped in a ConnectingError.
The result of this is that before this fix you would get a very
confusing authentication error if the remote client closed the
connection right at the same time as the handshake completed for
it (yes, this is difficult to do at the right time, and it only
happens for the client since that completes the handshake one network
hop before the server). But this was no authentication error, it is
simply a closed connection. The new error structure captures this
correctly.
Similarly the InternalConsistencyError belongs on the ConnectingError.
Though that one should be impossible to produce since it's supposed to
be an invariant.
Breaking Changes
AuthenticationErrorloses theConnectionErrorandInternalConsistencyErrorbranches. Both are on theConnectingErrorinstead.OutgoingZeroRttConnection::handshake_completedandIncomingZeroRttConnectionnow return aConnectingErrorinsteadof
AuthenticationError.Notes & open questions
While I've managed to trigger this error somewhat occasionally using a
program that races the closing with the completed handshake in a very
tight loop before applying this fix. I'm completely failing to
trigger it since applying this fix, so I can admire the beautiful new
error reporting this fix should give. It's a bit confusing.
edit: it is confusing. But it is correct. Because my flaky failure does always close the connection after it completes the handshake. So ALPN and EndpointId are always available. And then you yield a valid
Connectionwhen awaiting anIncoming, it just is already closed.This fix could also be made against main. I believe the same commit should be able to be cherry-picked and will probably apply fairly clean. Do you think I should make it against main?
Change checklist