@chancancode
Contributor

Previously, the `Delayed::Job` probe eagerly deserialized the `payload_object` in order to name the trace. This moved the work ahead of where it would normally happen, so if deserialization failed (e.g. a syntax error in the YAML payload, or a deleted database record as in the customer issue), the error was raised outside the block where it would normally be caught.

This fix defers the naming until a spot where we can guarantee DJ's successful deserialization, preventing the crashes while capturing the deserialization (database load) time within the trace.

If an error occurs prior to reaching that step, the trace will have a default "unknown" label.
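
To illustrate the deferred naming, here is a minimal, self-contained sketch. It does not use the real `Delayed::Job` classes or the probe's API: `FakeJob`, `FakeTrace`, `run_job`, `MailerJob`, and the `"Class#perform"` naming scheme are all hypothetical stand-ins. The point is only that the trace starts with the default label and is renamed after deserialization has succeeded inside the block that rescues errors.

```ruby
# Hypothetical stand-ins; none of these are the real Delayed::Job or probe API.
class DeserializationError < StandardError; end

# A trace is reduced here to a name plus the error it recorded, if any.
FakeTrace = Struct.new(:name, :error)

class FakeJob
  def initialize(handler)
    @handler = handler
  end

  # Lazily "deserializes" the payload on first access and memoizes it.
  def payload_object
    @payload_object ||= begin
      raise DeserializationError, "cannot load payload" if @handler == :corrupt
      @handler
    end
  end
end

class MailerJob
  def perform
    :sent
  end
end

# Simulates a worker run: deserialization happens inside the rescue block,
# and the trace is named only once the payload has loaded successfully.
def run_job(job)
  trace = FakeTrace.new("unknown", nil) # default label
  begin
    payload = job.payload_object        # may raise; rescued below
    trace.name = "#{payload.class.name}#perform"
    payload.perform
  rescue StandardError => e
    trace.error = e
  end
  trace
end

ok_trace  = run_job(FakeJob.new(MailerJob.new))
bad_trace = run_job(FakeJob.new(:corrupt))
```

In this sketch `ok_trace` ends up named after the payload class, while `bad_trace` keeps the "unknown" label and carries the deserialization error instead of crashing the caller.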

It is possible that this would affect some traces that we previously were able to name successfully, specifically when an error occurs in a lifecycle hook. If this turns out to be an issue, we can try to opportunistically check for `@payload_object` on the `job` instance during the `:error` and `:failure` hooks, but this is reaching too deep into the implementation internals for my taste. And honestly it seems fair enough to categorize anything that never made it to the actual `#perform` as generic "unknown+error" traces.
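
For reference, the opportunistic check described above might look roughly like the following. This is a sketch of the alternative that this PR does not take, and every name in it (`StubJob`, `ReportJob`, `name_from_loaded_payload`, the `"Class#perform"` naming scheme) is hypothetical. It assumes the job memoizes its payload in an `@payload_object` instance variable, which is exactly the implementation detail the check would depend on.

```ruby
# Hypothetical stand-in for a job record; not the real Delayed::Job class.
class StubJob
  def initialize(loader)
    @loader = loader
  end

  # Memoizes the payload in @payload_object on first successful load.
  def payload_object
    @payload_object ||= @loader.call
  end
end

class ReportJob
  def perform; end
end

# Hypothetical helper: names a trace from an already-loaded payload only,
# without ever triggering deserialization itself.
def name_from_loaded_payload(job)
  payload = job.instance_variable_get(:@payload_object) # nil if never loaded
  payload ? "#{payload.class.name}#perform" : "unknown"
end

loaded = StubJob.new(-> { ReportJob.new })
loaded.payload_object # simulate a job whose payload was loaded before the hook

unloaded = StubJob.new(-> { raise "record deleted" })

loaded_name   = name_from_loaded_payload(loaded)
unloaded_name = name_from_loaded_payload(unloaded)
```

Reading the instance variable directly avoids re-running the load, but it couples the probe to how the job caches its payload, which is the trade-off noted above.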

Fixes #491

@cla-bot cla-bot bot added the cla-signed label Aug 21, 2025
@chancancode
Contributor Author

cc @kyleschmolze

@zvkemp zvkemp merged commit f060810 into main Aug 27, 2025
23 checks passed
@zvkemp zvkemp deleted the deplayed_job branch August 27, 2025 14:27
Development

Successfully merging this pull request may close these issues.

DelayedJob probe causes worker crashes on deserialization errors

3 participants