@EmmanuelBRELLE
Contributor

PML cleanup may be incomplete when some processes have names that belong to a different job.
This happens, for example, with spawned processes.

Member

@bosilca bosilca left a comment

I understand the issue you raised, but I don't think this is the correct solution. You are now removing all peers from the PML, even those that belong to other sessions.

A quick scan through the code suggests that the same logic is used when sessions are created (ompi_proc_get_allocated is used to get the list of processes to add). This seems to indicate that this simple fix is not the right solution to the problem.

@EmmanuelBRELLE EmmanuelBRELLE force-pushed the finalize_frees_all_pml_procs branch from 6ae7472 to 100621e Compare December 8, 2025 16:32
@EmmanuelBRELLE
Contributor Author

EmmanuelBRELLE commented Dec 8, 2025

> I understand the issue you raised, but I don't think this is the correct solution. You are now removing all peers from the PML, even those that belong to other sessions.
>
> A quick scan through the code suggests that the same logic is used when sessions are created (ompi_proc_get_allocated is used to get the list of processes to add). This seems to indicate that this simple fix is not the right solution to the problem.

That's a good point, thanks.
To fix this issue, the new proposal is to keep track of the spawned jobids associated with the instance (i.e., the session). Only those jobids will be freed when the instance is finalized.
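
As a rough illustration of the idea (not the actual patch), the per-instance bookkeeping could look like the sketch below. Every name here (`jobid_t`, `peer_t`, `instance_t`, `instance_track_jobid`, `instance_finalize`) is a hypothetical stand-in rather than an Open MPI type or function, and the `in_pml` flag only simulates a peer being registered with the PML. Peers from the instance's own job are assumed to be handled elsewhere.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration only; not Open MPI types. */
typedef uint32_t jobid_t;

typedef struct {
    jobid_t  jobid;   /* job the peer belongs to */
    uint32_t vpid;    /* rank of the peer within that job */
    bool     in_pml;  /* simulates "peer is registered with the PML" */
} peer_t;

typedef struct {
    jobid_t *spawned;      /* jobids spawned through this instance */
    size_t   n_spawned;    /* number of jobids recorded */
    size_t   cap_spawned;  /* allocated capacity of the array */
} instance_t;

/* Record a spawned jobid once. Returns 0 on success, -1 if allocation fails. */
int instance_track_jobid(instance_t *inst, jobid_t jobid)
{
    for (size_t i = 0; i < inst->n_spawned; i++) {
        if (inst->spawned[i] == jobid) {
            return 0;  /* already tracked */
        }
    }
    if (inst->n_spawned == inst->cap_spawned) {
        size_t cap = inst->cap_spawned ? inst->cap_spawned * 2 : 4;
        jobid_t *tmp = realloc(inst->spawned, cap * sizeof(jobid_t));
        if (NULL == tmp) {
            return -1;
        }
        inst->spawned = tmp;
        inst->cap_spawned = cap;
    }
    inst->spawned[inst->n_spawned++] = jobid;
    return 0;
}

/* True if the jobid was spawned through this instance. */
static bool instance_tracks_jobid(const instance_t *inst, jobid_t jobid)
{
    for (size_t i = 0; i < inst->n_spawned; i++) {
        if (inst->spawned[i] == jobid) {
            return true;
        }
    }
    return false;
}

/* At finalize, drop only the peers whose jobid this instance tracked,
 * leaving peers that belong to other sessions untouched.
 * Returns the number of peers removed. */
size_t instance_finalize(instance_t *inst, peer_t *peers, size_t n_peers)
{
    size_t removed = 0;
    for (size_t i = 0; i < n_peers; i++) {
        if (peers[i].in_pml && instance_tracks_jobid(inst, peers[i].jobid)) {
            peers[i].in_pml = false;
            removed++;
        }
    }
    free(inst->spawned);
    inst->spawned = NULL;
    inst->n_spawned = 0;
    inst->cap_spawned = 0;
    return removed;
}
```

With this shape, a peer from a job spawned by another session stays registered after this instance is finalized, which is the behavior the review asked for.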
