@kriti-sc kriti-sc commented Oct 6, 2025

Return 410 - GONE instead of 404 - NOT_FOUND when a running task instance heartbeats after the task is cleared.

When a running task is cleared, the previous task instance is moved from the Task Instance table to the Task Instance History table. As a result, if that task instance heartbeats, the API server returns a 404. A more appropriate error is 410.

closes: #53140

Airflow task logs:

[image: Airflow task logs]

Scheduler and API server logs:

api-server | [2025-10-06T21:28:42.135650Z] {task_instances.py:608} ERROR - Task Instance not found in Task Instance table, might have moved to the Task Instance History table ti_id=0199bb6c-f0be-7686-a26e-30f70d893ae1
api-server | INFO:     127.0.0.1:53520 - "PUT /execution/task-instances/0199bb6c-f0be-7686-a26e-30f70d893ae1/heartbeat HTTP/1.1" 410 Gone
scheduler  | [2025-10-06T21:28:42.137967Z] {supervisor.py:1103} ERROR - Server indicated the task shouldn't be running anymore detail={'detail': {'reason': 'not_found', 'message': 'Task Instance not found, might have moved to the Task Instance History table'}} status_code=410 ti_id=UUID('0199bb6c-f0be-7686-a26e-30f70d893ae1')
scheduler  | [2025-10-06T21:28:42.144732Z] {supervisor.py:713} INFO - Process exited pid=96561 exit_code=0 signal_sent=SIGTERM
scheduler  | [2025-10-06T21:28:42.144989Z] {supervisor.py:1899} INFO - Task finished exit_code=0 duration=15.431881000055 final_state=SERVER_TERMINATED
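
For readers following the logs, here is a rough sketch of how a client could interpret the heartbeat response status. It assumes that both 404 and 410 mean the task should stop running; `TERMINAL_HEARTBEAT_STATUSES` and `should_terminate` are hypothetical names used for illustration, not the supervisor's actual API.

```python
from http import HTTPStatus

# Assumption: these statuses tell the client that the task instance
# should no longer be running (never existed, or was cleared and archived).
TERMINAL_HEARTBEAT_STATUSES = frozenset({HTTPStatus.NOT_FOUND, HTTPStatus.GONE})


def should_terminate(status_code: int) -> bool:
    """Return True when a heartbeat response means the task must stop."""
    # HTTPStatus members compare and hash like plain ints, so an int works here.
    return status_code in TERMINAL_HEARTBEAT_STATUSES
```

A successful heartbeat (204 in the route shown in this PR) returns `False`, so the task keeps running.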

@boring-cyborg boring-cyborg bot added area:API Airflow's REST/HTTP API area:task-sdk labels Oct 6, 2025
boring-cyborg bot commented Oct 6, 2025

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contributors' Guide (https://github.com/apache/airflow/blob/main/contributing-docs/README.rst).
Here are some useful points:

  • Pay attention to the quality of your code (ruff, mypy and type annotations). Our prek-hooks will help you with that.
  • In case of a new feature add useful documentation (in docstrings or in docs/ directory). Adding a new operator? Check this short guide. Consider adding an example DAG that shows how users should use it.
  • Consider using Breeze environment for testing locally, it's a heavy docker but it ships with a working Airflow and a lot of integrations.
  • Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
  • Please follow ASF Code of Conduct for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
  • Be sure to read the Airflow Coding style.
  • Always keep your Pull Requests rebased, otherwise your build might fail due to changes not related to your commits.

Apache Airflow is a community-driven project and together we are making it better 🚀.
In case of doubts contact the developers at:
Mailing List: [email protected]
Slack: https://s.apache.org/airflow-slack

@kriti-sc kriti-sc requested a review from amoghrajesh November 9, 2025 08:05
@amoghrajesh amoghrajesh changed the title #53140 Better error handling for running ti heartbeats after task is cleared Better error handling for running ti heartbeats after task is cleared Nov 11, 2025
status_code=status.HTTP_204_NO_CONTENT,
responses={
status.HTTP_404_NOT_FOUND: {"description": "Task Instance not found"},
status.HTTP_410_GONE: {"description": "Task Instance no longer exists, it may have moved to the Task Instance History table"},
Member

I think 404 should still exist; there’s a difference between a TI that never existed and a TI that existed but is gone.

Contributor

@amoghrajesh amoghrajesh left a comment

I think these changes are insufficient; see my comments.

)
except NoResultFound:
log.error("Task Instance not found")
status.HTTP_410_GONE: {"description": "Task Instance no longer exists, it may have moved to the Task Instance History table"},
Contributor

This is added here by mistake?

Contributor

I think we should actually validate that by checking whether it is present in the TIH table.

Two situations possible here:

  1. Task Instance doesn't exist and is not in the TIH table -> return 404
  2. Task Instance doesn't exist and is present in the TIH table -> return 410
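
The two cases above can be sketched as a small decision function. This is only an illustration under stated assumptions: `heartbeat_error_status`, `live_ids`, and `history_ids` are hypothetical names, and plain sets stand in for the Task Instance and TIH tables. It is not the code in this PR.

```python
from http import HTTPStatus
from typing import Optional, Set


def heartbeat_error_status(
    ti_id: str,
    live_ids: Set[str],      # hypothetical stand-in for the Task Instance table
    history_ids: Set[str],   # hypothetical stand-in for the TIH table
) -> Optional[HTTPStatus]:
    """Return the error status for a heartbeat, or None if the TI is still live."""
    if ti_id in live_ids:
        return None  # TI exists; the heartbeat can proceed normally
    if ti_id in history_ids:
        return HTTPStatus.GONE  # case 2: existed, but was moved to history
    return HTTPStatus.NOT_FOUND  # case 1: unknown in both tables
```

Checking the live table first keeps the common path cheap; the history lookup only happens once the TI is already known to be missing.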

detail={
"reason": "not_found",
"message": "Task Instance not found",
status.HTTP_410_GONE: {"description": "Task Instance no longer exists, it may have moved to the Task Instance History table"},
Contributor

Formatting looks off -- static checks will fail
