TTL for TrainJobs #2899

@stivanov-intercom

What you would like to be added?

Kubernetes Jobs and JobSets support a TTL field that enables automatic cleanup of finished resources. The TrainJob resource allows passing these TTLs down to the JobSet/Job level, which works for cleaning up pods, but the TrainJob resource itself remains indefinitely. It would be useful to have a TTL option on the TrainJob resource itself, so that we don't have to build additional systems on top to clean them up.

Why is this needed?

Without this, all TrainJob resources remain on the cluster indefinitely after they have finished running, and users must either clean them up manually or create cron jobs to do so. It would be good to have parity with Kubernetes Jobs and JobSets.
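
To illustrate the kind of external cleanup this currently requires, here is a minimal Python sketch of the expiry check such a cron job might run. It is not part of Kubeflow Trainer. The function names are hypothetical, and it assumes that a finished TrainJob carries a `Complete` or `Failed` condition with status `"True"` and that the condition's `lastTransitionTime` can stand in for the finish time. It operates on plain dictionaries shaped like the objects the Kubernetes API returns, so it makes no API calls itself.

```python
from datetime import datetime, timedelta, timezone

# Assumption: these condition types mark a TrainJob as finished.
FINISHED_CONDITION_TYPES = {"Complete", "Failed"}


def finished_at(trainjob):
    """Return when the TrainJob finished (UTC), or None if it is still active."""
    for cond in trainjob.get("status", {}).get("conditions", []):
        if cond.get("type") in FINISHED_CONDITION_TYPES and cond.get("status") == "True":
            # Kubernetes serializes timestamps like 2025-01-01T00:00:00Z.
            parsed = datetime.strptime(cond["lastTransitionTime"], "%Y-%m-%dT%H:%M:%SZ")
            return parsed.replace(tzinfo=timezone.utc)
    return None


def expired_trainjobs(trainjobs, ttl_seconds, now=None):
    """Return names of TrainJobs that finished at least ttl_seconds ago."""
    now = now or datetime.now(timezone.utc)
    ttl = timedelta(seconds=ttl_seconds)
    expired = []
    for tj in trainjobs:
        done = finished_at(tj)
        if done is not None and now - done >= ttl:
            expired.append(tj["metadata"]["name"])
    return expired
```

A cron job would list the TrainJobs in a namespace, pass the items to `expired_trainjobs`, and delete the returned names. A built-in TTL on TrainJob would make this extra piece unnecessary.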

Love this feature?

Give it a 👍. We prioritize the features with the most 👍.
