
Conversation

@sanjaysrikakulam
Member

@bgruening
Member

@sj213 this might put back some load onto the PG. Can you keep an eye on this please ...

But we need the cleanup script ... we would need to enhance the query, etc. ... or add indices.

@sj213
Contributor

sj213 commented Apr 23, 2025

No argument here. But since this query is likely to run for a week or so and we all agree that this query badly wants some finishing, might I suggest that for the time being the query is not run via cron but rather manually, in a screen/tmux session and prefixed with EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS, FORMAT JSON)?

Two advantages:

  • Since the query is known to run several days and also block other instances of the same query that are started on the following days, we would have to forcibly terminate those queries started later anyway; we've been there before
  • By running the query with the full EXPLAIN ANALYZE monty, we obtain valuable data for the Dalibo analyzer. Two birds with one stone.
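For illustration, a minimal sketch of the kind of wrapping meant here, as it could be typed into psql inside the tmux session (the inner SELECT is just a harmless placeholder, not the actual cleanup query):

```sql
-- Sketch only: prefix the statement with a full EXPLAIN so the resulting JSON plan
-- can be fed to the Dalibo analyzer. The SELECT below is a placeholder; the real
-- cleanup query would take its place.
EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS, FORMAT JSON)
SELECT count(*) FROM pg_stat_activity;
```

Note that with ANALYZE the statement is actually executed, so the real cleanup query would still run for its full duration; the gain is the plan that comes out at the end.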

@sanjaysrikakulam
Member Author

> No argument here. But since this query is likely to run for a week or so and we all agree that this query badly wants some finishing, might I suggest that for the time being the query is not run via cron but rather manually, in a screen/tmux session and prefixed with EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS, FORMAT JSON)?
>
> Two advantages:
>
> * Since the query is known to run several days and also block other instances of the same query that are started on the following days, we would have to forcibly terminate those queries started later anyway; we've been there before
>
> * By running the query with the full EXPLAIN ANALYZE monty, we obtain valuable data for the Dalibo analyzer. Two birds with one stone.

Unfortunately, we cannot wrap it with EXPLAIN... because gxadmin simply calls a Galaxy Python script, and that script uses the Galaxy models (SQLAlchemy) for everything it does.

@bgruening
Member

Stefan should be able to see the slow query on his side, shouldn't he?

@sj213
Contributor

sj213 commented Apr 23, 2025

Once it completes, it should show up in the logs as one of the slow queries. The problem then is figuring out which of all the slow queries in the log it was.

But there may be another option: looking at /opt/galaxy/server/scripts/cleanup_datasets/pgcleanup.py, I see this script supports a --debug option, which should enable logging of the SQL queries sent to the server.

However, in /opt/gxadmin/partx/25-galaxy.sh the function galaxy_cleanup() does not pass the debug option to pgcleanup.py even when the envvar GXADMIN_DEBUG is set, so 25-galaxy.sh would need to be hacked a bit to enable query logging in galaxy_cleanup().

It should thus be possible to hack gxadmin, run it with debugging enabled to obtain the query text, abort the submitted query, and then re-execute it from psql(1) with ANALYZE enabled.

@sj213
Contributor

sj213 commented Apr 23, 2025

Actually, I just remembered that there is an even simpler solution: the query column of pg_stat_activity shows the text of any running SQL statement. There is, however, a catch: by default this column is limited to 1 KiB, and the text of our queries is often longer. The limit can be raised by adjusting track_activity_query_size, but a change to this runtime parameter only takes effect at a server restart. (Damn, this would have been another bullet item for the last maintenance break...)

Nonetheless, I'll prepare a PR to raise the limit to 4 KiB, so the change will take effect at the next server restart.
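As a rough sketch, assuming standard PostgreSQL catalog columns and settings (the one-hour threshold is only illustrative), the current limit and the captured text of long-running statements could be inspected like this:

```sql
-- Current limit on the query text kept in pg_stat_activity (default 1024 bytes).
SHOW track_activity_query_size;

-- Long-running active statements and the (possibly truncated) text captured for them.
SELECT pid,
       now() - query_start AS runtime,
       state,
       length(query)       AS captured_chars,
       left(query, 200)    AS query_head
FROM pg_stat_activity
WHERE state = 'active'
  AND now() - query_start > interval '1 hour'
ORDER BY runtime DESC;
```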

@bgruening
Member

OK, let's run this query tomorrow and see?

@sanjaysrikakulam
Member Author

> OK, let's run this query tomorrow and see?

The query is now running in a tmux session (for additional details, see the OP's chat).

@gsaudade99
Contributor

This PR seems relevant; any news on this?

@bgruening
Member

It's modifying sn06. Can you please check if this is enabled on the new headnode?

@gsaudade99
Contributor

It's not. This should clean the entries for histories/hdas/etc. from the DB.
Given we're trying to centralize this type of script in maintenance, I could migrate it there. I would close this PR and open a new one for the sake of simplicity.

@bgruening
Member

Sounds good!
