add object store cleanup script #1734

gsaudade99 · 2025-11-27T11:31:07Z

Xref: https://github.com/usegalaxy-eu/issues/issues/521#issuecomment-2400184860

PR is a copy of this. Should be closed after closing the PR.

Migrated script to maintenance VM.
Added Object Store cleanup for temporary (scratch) storage on crontab.

kysrpex

Thanks for working on this (and putting it on the right host), we have left it dead for so long 😅.

In addition to the very minor stuff I mentioned, the "good practice" would be to rename the role from usegalaxy-eu.galaxy-cleanup to usegalaxy_eu.galaxy_cleanup. It's not a hard requirement, but Ansible does it internally and it's the rule for Ansible Galaxy (even if we are not publishing this one).

Copilot

Pull request overview

This PR adds an automated object store cleanup script to handle purging old datasets from temporary (scratch) storage. The implementation migrates the cleanup functionality to the maintenance VM and sets up a scheduled cron job to run the cleanup operations daily.

Key Changes:

- Introduces a new Ansible role usegalaxy-eu.galaxy-cleanup with a shell script template that executes Galaxy's pgcleanup.py script
- Configures automated daily cleanup via cron job (default: 1:00 AM)
- Adds configuration for the s3_scratch_netapp01 object store with a 60-day retention policy

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 9 comments.

| File | Description |
| --- | --- |
| roles/usegalaxy-eu.galaxy-cleanup/templates/galaxy_cleanup_objectstores.sh | Shell script template that executes pgcleanup.py for each configured object store |
| roles/usegalaxy-eu.galaxy-cleanup/tasks/main.yml | Ansible tasks to install the cleanup script and configure the cron job |
| roles/usegalaxy-eu.galaxy-cleanup/defaults/main.yml | Default configuration variables including example structure and cron schedule |
| maintenance.yml | Integrates the galaxy-cleanup role into the maintenance playbook |
| group_vars/maintenance.yml | Configures cleanup for the s3_scratch_netapp01 object store with 60-day retention |

gsaudade99 · 2025-12-03T16:05:53Z

This should now be ready for merge @kysrpex

bgruening · 2025-12-03T16:16:58Z

roles/usegalaxy_eu.galaxy_cleanup/templates/clean_scratch_storage.sh.j2

+set -e
+
+{% for item in galaxy_cleanup_objectstores | default([]) %}
+{{ galaxy_venv_dir }}/bin/python {{ galaxy_server_dir }}/scripts/cleanup_datasets/pgcleanup.py -c {{ galaxy_config_file }} -o {{ item.days }} -l {{ galaxy_log_dir }} -w 128MB --object-store-id {{ item.objectstore_id }} purge_old_hdas 2>&1 | tee -a {{ galaxy_log_dir }}/cleanup-$(date --rfc-3339=seconds)-purge_old_hdas.log


What happens if item.objectstore_id is empty, wrong, or does not exist?

My understanding is that Jinja will render nothing (an empty string), so the arguments become --object-store-id purge_old_hdas. Looking at the pgcleanup.py script, it would fail because there would be no args.actions left to parse.
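A minimal sketch of that argument shift, assuming bash-style word splitting on the rendered line. `show_args` is a hypothetical helper that only prints what a command would receive; it does not call pgcleanup.py, and the object store ID is the one from group_vars/maintenance.yml.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints each argument a command would receive,
# one per line, so the effect of an empty ID is visible.
show_args() {
  printf '<%s>\n' "$@"
}

# Simulated template output. With an empty objectstore_id, the text
# between the flag and the action is simply missing.
rendered_empty='--object-store-id  purge_old_hdas'
rendered_ok='--object-store-id s3_scratch_netapp01 purge_old_hdas'

# Unquoted on purpose: the shell splits the rendered text into words,
# as it would when running the generated script.
empty_out=$(show_args $rendered_empty)
ok_out=$(show_args $rendered_ok)

echo "empty ID:"; echo "$empty_out"
echo "valid ID:"; echo "$ok_out"
```

With the empty ID only two words remain, so the word following `--object-store-id` is `purge_old_hdas` and no action is left over.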

We can add an extra task that checks whether galaxy_cleanup_objectstores is empty and whether galaxy_cleanup_objectstores.objectstore_id does not include "scratch". And if we still don't feel "safe", we can add this condition inside the template.

This template task should not be executed if galaxy_cleanup_scratchstorage does not match the expression in the assertion task above:
galaxy_cleanup_scratchstorage | selectattr('objectstore_id', 'search', 'scratch') | list | length > 0

I think the following failures are expected in these kinds of cases (so we should be fine).

fatal: [localhost]: FAILED! => {"changed": false, "msg": "Task failed: Finalization of task args for 'ansible.builtin.copy' failed: Error while resolving value for 'content': object of type 'dict' has no attribute 'key_missing'"}
fatal: [localhost]: FAILED! => {"changed": false, "msg": "Task failed: Finalization of task args for 'ansible.builtin.copy' failed: Error while resolving value for 'content': 'missing_var' is undefined"}

I think a complete playbook failure is the only way we can reliably see broken roles / variables in Jenkins.

If we pass a faulty --object-store-id to the pgcleanup.py, do we know how it will behave?

It failed pretty quickly and just rolled back the changes:

(venv) galaxy@sn09:~$ /opt/galaxy/venv/bin/python /opt/galaxy/server/scripts/cleanup_datasets/pgcleanup.py --dry-run -c /opt/galaxy/config/galaxy.yml -o 60 -l /var/log/galaxy -w 128MB --object-store-id im-wrong purge_old_hdas 2>&1 | tee -a /var/log/galaxy/cleanup-$(date --rfc-3339=seconds)-purge_old_hdas.log
tee: '15:00:42+01:00-purge_old_hdas.log': Permission denied
2025-12-10 15:00:46,801 WARNING _update_raw_config_from_kwargs(): Option openai_api_key has been deprecated in favor of ai_api_key
2025-12-10 15:00:46,802 WARNING resolve(): Trying to resolve path for the 'email_ban_file' option but it's empty/None
2025-12-10 15:00:48,109 INFO run(): Running action 'purge_old_hdas':
2025-12-10 15:00:48,439 INFO _init(): Initializing object store for action purge_old_hdas
2025-12-10 15:00:48,440 INFO conn(): Connecting to database with URL: postgresql://galaxy:***@sn11.galaxyproject.eu/galaxy
2025-12-10 15:00:48,535 INFO conn(): Setting work_mem to 128MB
2025-12-10 15:00:48,538 INFO _dry_run_event(): Not executing event creation (increments sequence even when rolling back), using an old event ID (46003) for dry run
2025-12-10 15:00:48,539 INFO _execute(): Executing SQL
2025-12-10 15:00:48,552 INFO _execute(): Database status: SELECT 0
2025-12-10 15:00:48,552 INFO _update(): Update resulted in no changes, rolling back transaction
2025-12-10 15:00:48,552 INFO recalculate_disk_usage(): Recalculating disk usage for users whose data were purged
2025-12-10 15:00:48,553 INFO run(): Finished purge_old_hdas
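A side note on the `tee: '15:00:42+01:00-purge_old_hdas.log': Permission denied` line in that output: it looks like a word-splitting effect, since `date --rfc-3339=seconds` prints a space between date and time and the command substitution is unquoted. Below is a small sketch of that behaviour, assuming bash and a fixed sample timestamp in the same format. `count_args` is a hypothetical helper, and the `T`-separated format is only one possible alternative, not what the role necessarily uses.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reports how many arguments it received.
count_args() { echo "$#"; }

# Fixed sample in the format printed by `date --rfc-3339=seconds`
# (GNU date), so the demo does not depend on the clock.
stamp='2025-12-10 15:00:42+01:00'

# Unquoted: the space splits the file name into two words, so tee
# would get two file arguments, the second relative to the cwd.
unquoted=$(count_args cleanup-${stamp}-purge_old_hdas.log)

# Quoted: the file name stays a single word.
quoted=$(count_args "cleanup-${stamp}-purge_old_hdas.log")

# Assumed alternative: a timestamp format without any space.
logfile="cleanup-$(date +%Y-%m-%dT%H:%M:%S%z)-purge_old_hdas.log"

echo "unquoted=$unquoted quoted=$quoted"
echo "space-free name: $logfile"
```

Quoting the whole path or switching to a space-free format would each keep the log file name in one piece.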

bgruening · 2025-12-03T16:21:59Z

Maybe we should tweak the wording a bit.

Imho an object_store_cleanup is purging already deleted datasets, or purging deleted histories, purging deleted users, etc.
This is imho done with https://github.com/usegalaxy-eu/infrastructure-playbook/blob/master/group_vars/sn09/sn09.yml#L73-L82

What is done in this PR is more dangerous. It actually deletes data that is older than 60 days, even if the data is NOT deleted and purged. So maybe this PR, and also the script, should be named something like "clean_scratch_storage.sh".

mira-miracoli · 2025-12-10T12:33:43Z

That seems to work:

cat test.yml
- name: test
  hosts: localhost
  vars:
    galaxy_cleanup_scratchstorage:
      - objectstore_id: "scratch"
      - objectstore_id: "test"
  tasks:
    - name: Check that all objectstore_id include 'scratch'
      assert:
        that:
          - galaxy_cleanup_scratchstorage
            | rejectattr('objectstore_id', 'search', 'scratch')
            | list
            | length == 0
        fail_msg: "Found objectstore_id entries that do NOT contain 'scratch'"
        success_msg: "All objectstore_id values contain 'scratch'"

ansible-playbook -c local -i localhost test.yml 

PLAY [test] **********************************************************************************************************************************************************************************

TASK [Gathering Facts] ***********************************************************************************************************************************************************************
ok: [localhost]

TASK [Check that all objectstore_id include 'scratch'] ***************************************************************************************************************************************
fatal: [localhost]: FAILED! => {
    "assertion": "galaxy_cleanup_scratchstorage | rejectattr('objectstore_id', 'search', 'scratch') | list | length == 0",
    "changed": false,
    "evaluated_to": false,
    "msg": "Found objectstore_id entries that do NOT contain 'scratch'"
}

PLAY RECAP ***********************************************************************************************************************************************************************************
localhost                  : ok=1    changed=0    unreachable=0    failed=1    skipped=0    rescued=0    ignored=0   

cat test.yml
- name: test
  hosts: localhost
  vars:
    galaxy_cleanup_scratchstorage:
      - objectstore_id: "scratch"
      - objectstore_id: "scratch-2"
  tasks:
    - name: Check that all objectstore_id include 'scratch'
      assert:
        that:
          - galaxy_cleanup_scratchstorage
            | rejectattr('objectstore_id', 'search', 'scratch')
            | list
            | length == 0
        fail_msg: "Found objectstore_id entries that do NOT contain 'scratch'"
        success_msg: "All objectstore_id values contain 'scratch'"

ansible-playbook -c local -i localhost test.yml 

PLAY [test] **********************************************************************************************************************************************************************************

TASK [Gathering Facts] ***********************************************************************************************************************************************************************
ok: [localhost]

TASK [Check that all objectstore_id include 'scratch'] ***************************************************************************************************************************************
ok: [localhost] => {
    "changed": false,
    "msg": "All objectstore_id values contain 'scratch'"
}

PLAY RECAP ***********************************************************************************************************************************************************************************
localhost                  : ok=2    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

gsaudade99 · 2025-12-10T12:45:15Z

@mira-miracoli really cool way to test lol. I did some more testing and:

➜  /tmp cat test.yml                                    
- name: test
  hosts: localhost
  vars:
    galaxy_cleanup_scratchstorage: []
#    galaxy_cleanup_scratchstorage:
#      - objectstore_id: "scratch"
#      - objectstore_id: "scratch-2"
  tasks:
    - name: Check that all objectstore_id include 'scratch'
      assert:
        that:
#          - galaxy_cleanup_scratchstorage | length > 0
          - galaxy_cleanup_scratchstorage
            | rejectattr('objectstore_id', 'search', 'scratch')
            | list
            | length == 0
        fail_msg: "Found objectstore_id entries that do NOT contain 'scratch'"
        success_msg: "All objectstore_id values contain 'scratch'"
➜  /tmp ansible-playbook -c local -i localhost test.yml 

[WARNING]: Unable to parse /tmp/localhost as an inventory source
[WARNING]: No inventory was parsed, only implicit localhost is available
[WARNING]: provided hosts list is empty, only localhost is available. Note that the implicit localhost does not match 'all'

PLAY [test] ************************************************************************************************************************************************************************************************************************

TASK [Gathering Facts] *************************************************************************************************************************************************************************************************************
ok: [localhost]

TASK [Check that all objectstore_id include 'scratch'] *****************************************************************************************************************************************************************************
ok: [localhost] => {
    "changed": false,
    "msg": "All objectstore_id values contain 'scratch'"
}

PLAY RECAP *************************************************************************************************************************************************************************************************************************
localhost                  : ok=2    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

I will add the commented condition too.
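For illustration, the combined logic (non-empty list, and no entry without 'scratch') can be sketched as a shell analogue. The function name `all_scratch` is made up here and is not part of the role; the real check stays in the Ansible assert task.

```shell
#!/usr/bin/env bash
# Hypothetical shell analogue of the two assert conditions.
# Returns 0 only if at least one ID is given and every ID contains 'scratch'.
all_scratch() {
  # Mirrors: galaxy_cleanup_scratchstorage | length > 0
  [ "$#" -gt 0 ] || return 1
  for id in "$@"; do
    case "$id" in
      *scratch*) ;;          # contains 'scratch', keep checking
      *) return 1 ;;         # mirrors: rejectattr(...) | list | length == 0
    esac
  done
  return 0
}

all_scratch scratch scratch-2 && echo "pass: scratch scratch-2"
all_scratch scratch test      || echo "fail: scratch test"
all_scratch                   || echo "fail: empty list"
```

Without the first condition, the empty list would pass, because there is nothing to reject. That matches the run above.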

gsaudade99 added 3 commits November 27, 2025 11:43

migrated - add object store clean up script - to sn09

1c37cff

migrate to maintenance instead

76be6f5

add crontab to the cleanup script

fc16410

gsaudade99 requested a review from kysrpex November 27, 2025 11:31

gsaudade99 self-assigned this Nov 27, 2025

gsaudade99 changed the title ~~Feature/object store cleanup~~ add object store cleanup script Nov 27, 2025

kysrpex marked this pull request as ready for review November 27, 2025 12:29

kysrpex reviewed Nov 27, 2025

Update roles/usegalaxy-eu.galaxy-cleanup/defaults/main.yml

85a8530

Co-authored-by: José Manuel Domínguez <[email protected]>

Copilot AI review requested due to automatic review settings December 3, 2025 15:43

gsaudade99 and others added 2 commits December 3, 2025 16:43

Update group_vars/maintenance.yml

79ea697

Co-authored-by: José Manuel Domínguez <[email protected]>

Update roles/usegalaxy-eu.galaxy-cleanup/templates/galaxy_cleanup_obj…

254be6b

…ectstores.sh Co-authored-by: José Manuel Domínguez <[email protected]>

Copilot started reviewing on behalf of gsaudade99 December 3, 2025 15:44

gsaudade99 and others added 3 commits December 3, 2025 16:44

Update roles/usegalaxy-eu.galaxy-cleanup/defaults/main.yml

745dbe2

Co-authored-by: José Manuel Domínguez <[email protected]>

Update roles/usegalaxy-eu.galaxy-cleanup/tasks/main.yml

2914b70

Co-authored-by: José Manuel Domínguez <[email protected]>

Update roles/usegalaxy-eu.galaxy-cleanup/tasks/main.yml

e715b08

Co-authored-by: José Manuel Domínguez <[email protected]>

Copilot finished reviewing on behalf of gsaudade99 December 3, 2025 15:45

Copilot AI reviewed Dec 3, 2025

gsaudade99 and others added 4 commits December 3, 2025 16:48

Update roles/usegalaxy-eu.galaxy-cleanup/templates/galaxy_cleanup_obj…

0e0b09b

…ectstores.sh Co-authored-by: José Manuel Domínguez <[email protected]>

Update roles/usegalaxy-eu.galaxy-cleanup/tasks/main.yml

16b5891

Co-authored-by: José Manuel Domínguez <[email protected]>

Update roles/usegalaxy-eu.galaxy-cleanup/tasks/main.yml

f3e3b10

Co-authored-by: José Manuel Domínguez <[email protected]>

Update roles/usegalaxy-eu.galaxy-cleanup/tasks/main.yml

cc7f9b6

Co-authored-by: Copilot <[email protected]>

bgruening reviewed Dec 3, 2025

group_vars/maintenance.yml

bgruening reviewed Dec 3, 2025

roles/usegalaxy-eu.galaxy-cleanup/defaults/main.yml Outdated

bgruening reviewed Dec 3, 2025

gsaudade99 added 2 commits December 3, 2025 17:54

change dir name

fc12672

add check to scratchstorage + warning message

5199a0d

mira-miracoli reviewed Dec 4, 2025

roles/usegalaxy_eu.galaxy_cleanup/tasks/main.yml Outdated

mira-miracoli reviewed Dec 4, 2025

roles/usegalaxy_eu.galaxy_cleanup/tasks/main.yml Outdated

mira-miracoli reviewed Dec 4, 2025

roles/usegalaxy_eu.galaxy_cleanup/tasks/main.yml Outdated

gsaudade99 and others added 4 commits December 4, 2025 10:51

Update roles/usegalaxy_eu.galaxy_cleanup/tasks/main.yml

7e0842a

Co-authored-by: Mira <[email protected]>

Update roles/usegalaxy_eu.galaxy_cleanup/tasks/main.yml

539498e

Co-authored-by: Mira <[email protected]>

Update roles/usegalaxy_eu.galaxy_cleanup/tasks/main.yml

932572e

Co-authored-by: Mira <[email protected]>

change role name on playbook

2d91941

kysrpex reviewed Dec 10, 2025

roles/usegalaxy_eu.galaxy_cleanup/tasks/main.yml Outdated

use reject condition instead

7e9c087

kysrpex approved these changes Dec 10, 2025

fail when galaxy_cleanup_scratchstorage is not defined

a40f5c5

mira-miracoli reviewed Dec 10, 2025

roles/usegalaxy_eu.galaxy_cleanup/templates/clean_scratch_storage.sh Outdated

Update roles/usegalaxy_eu.galaxy_cleanup/templates/clean_scratch_stor…

15d56b5

…age.sh Co-authored-by: Mira <[email protected]>

mira-miracoli reviewed Dec 10, 2025

roles/usegalaxy_eu.galaxy_cleanup/tasks/main.yml Outdated

rm block + change file name

6f5e991
