-
Notifications
You must be signed in to change notification settings - Fork 482
ore: ensure Sentry transport never shuts down #34050
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
def-
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a good way to write a test for this?
I think we'd need some way to run Materialize together with Sentry in tests. Then we could force a panic and check if it gets reported. The same thing would also have been useful with other Sentry reporting issues we had previously. |
Previously, the `mz_ore::tracing::configure` would initialize the Sentry transport and then return a `TracingGuard` that needed to be kept alive to prevent the Sentry transport from shutting down. Callers would usually keep the guard until the end of the current function scope. This caused us to miss reporting panics when a fatal error was bubbled up to the process' `main` before becoming a panic, and the guard was dropped in the process. The fix taken here is to immediately forget the guard, to make sure the Sentry transport remains intact for the process lifetime. If there is ever a need to shut down the Sentry transport again we'll need to reconsider, but we can keep things simple for now.
cf205f0 to
908943b
Compare
|
We have test/tracing/mzcompose.py, which reports to a special sentry DSN, which we can then check via the API to see if the expected log has made it to it? |
|
Ah, is that connected to our production Sentry instance? Yes, that would work. Though DSN authentication doesn't work to list issues in a project, we'd also need to supply a Sentry auth token. |
|
Should be the database-backend-testing project: https://materializeinc.sentry.io/insights/projects/database-backend-testing/?project=4506542270906368 |
ggevay
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
|
I don't have time to look at testing the Sentry integration now, and I don't think we should block this fix on that. So I'm going ahead with merging. I've opened an issue to track adding tests: https://github.com/MaterializeInc/database-issues/issues/9897 |
|
TFTR! |
Previously, the
mz_ore::tracing::configurewould initialize the Sentry transport and then return aTracingGuardthat needed to be kept alive to prevent the Sentry transport from shutting down. Callers would usually keep the guard until the end of the current function scope. This caused us to miss reporting panics when a fatal error was bubbled up to the process'mainbefore becoming a panic, and the guard was dropped in the process.The fix taken here is to immediately forget the guard, to make sure the Sentry transport remains intact for the process lifetime. If there is ever a need to shut down the Sentry transport again we'll need to reconsider, but we can keep things simple for now.
Motivation
Panics caused by fatal errors that bubble up to
mainare not reported to Sentry.Slack thread.
Tips for reviewer
Checklist
$T ⇔ Proto$Tmapping (possibly in a backwards-incompatible way), then it is tagged with aT-protolabel.