
Conversation

@sanjaysrikakulam
Member

@bgruening
Member

@sj213 this might put back some load onto the PG. Can you keep an eye on this please ...

But we need the cleanup script ... we would need to enhance the query, etc. ... or add indices.

@sj213
Contributor

sj213 commented Apr 23, 2025

No argument here. But since this query is likely to run for a week or so and we all agree that this query badly wants some finishing, might I suggest that for the time being the query is not run via cron but rather manually, in a screen/tmux session and prefixed with EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS, FORMAT JSON)?

Two advantages:

  • Since the query is known to run several days and also block other instances of the same query that are started on the following days, we would have to forcibly terminate those queries started later anyway; we've been there before
  • By running the query with the full EXPLAIN ANALYZE monty, we obtain valuable data for the Dalibo analyzer. Two birds with one stone.
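For illustration, a minimal sketch of the kind of wrapping meant here, as it could be typed into psql inside the tmux session (the inner SELECT is just a harmless placeholder, not the actual cleanup query):

```sql
-- Sketch only: prefix the statement with a full EXPLAIN so the resulting JSON plan
-- can be fed to the Dalibo analyzer. The SELECT below is a placeholder; the real
-- cleanup query would take its place.
EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS, FORMAT JSON)
SELECT count(*) FROM pg_stat_activity;
```

Note that with ANALYZE the statement is actually executed, so the real cleanup query would still run for its full duration; the gain is the plan that comes out at the end.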

@sanjaysrikakulam
Member Author

> No argument here. But since this query is likely to run for a week or so and we all agree that this query badly wants some finishing, might I suggest that for the time being the query is not run via cron but rather manually, in a screen/tmux session and prefixed with EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS, FORMAT JSON)?
>
> Two advantages:
>
> * Since the query is known to run several days and also block other instances of the same query that are started on the following days, we would have to forcibly terminate those queries started later anyway; we've been there before
>
> * By running the query with the full EXPLAIN ANALYZE monty, we obtain valuable data for the Dalibo analyzer. Two birds with one stone.

Unfortunately, we cannot wrap it with EXPLAIN... because gxadmin simply calls a Galaxy Python script, and that script uses the Galaxy models (SQLAlchemy) for everything it does.

@bgruening
Member

Stefan should be able to see the slow query on his side, shouldn't he?

@sj213
Contributor

sj213 commented Apr 23, 2025

Once it completes, it should show up in the logs as one of the slow queries. The problem then is figuring out which of all the slow queries in the log it was.

But there may be another option: looking at /opt/galaxy/server/scripts/cleanup_datasets/pgcleanup.py, I see this script supports a --debug option, which should enable logging of the SQL queries sent to the server.

However, in /opt/gxadmin/partx/25-galaxy.sh the function galaxy_cleanup() does not pass the debug option to pgcleanup.py even when the envvar GXADMIN_DEBUG is set, so 25-galaxy.sh would need to be hacked a bit to enable query logging in galaxy_cleanup().

It should thus be possible to hack gxadmin, run it with debugging enabled to obtain the query text, abort the submitted query, and then re-execute it from psql(1) with ANALYZE enabled.

@sj213
Contributor

sj213 commented Apr 23, 2025

Actually, I just remembered that there is an even simpler solution: the query column of pg_stat_activity shows the text of any running SQL statement. There is, however, a catch: by default this column is limited to 1 KiB, and the text of our queries is often longer. The limit can be raised by adjusting track_activity_query_size, but a change to this runtime parameter only takes effect at a server restart. (Damn, this would have been another bullet item for the last maintenance break...)

Nonetheless, I'll prepare a PR to raise the limit to 4 KiB, so the change will take effect at the next server restart.
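As a rough sketch, assuming standard PostgreSQL catalog columns and settings (the one-hour threshold is only illustrative), the current limit and the captured text of long-running statements could be inspected like this:

```sql
-- Current limit on the query text kept in pg_stat_activity (default 1024 bytes).
SHOW track_activity_query_size;

-- Long-running active statements and the (possibly truncated) text captured for them.
SELECT pid,
       now() - query_start AS runtime,
       state,
       length(query)       AS captured_chars,
       left(query, 200)    AS query_head
FROM pg_stat_activity
WHERE state = 'active'
  AND now() - query_start > interval '1 hour'
ORDER BY runtime DESC;
```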

@bgruening
Member

OK, let's run this query tomorrow and see?

@sanjaysrikakulam
Member Author

> OK, let's run this query tomorrow and see?

The query is now running in a tmux session (for additional details, see the OP's chat).

@gsaudade99
Contributor

This PR seems relevant; any news on this?

@bgruening
Member

It's modifying sn06. Can you please check if this is enabled on the new headnode?

@gsaudade99
Contributor

It's not. This should clean the entries for histories/hdas/etc. from the DB.
Given we're trying to centralize this type of script in maintenance, I could migrate it there. I would close this PR and open a new one for the sake of simplicity.

@bgruening
Member

Sounds good!
