harden async communication routines #2232

Open
rrsettgast opened this issue Jan 3, 2023 · 0 comments
Labels
type: cleanup / refactor Non-functional change (NFC)

Comments

@rrsettgast
Member

Describe the issue
In #2173, a bug was discovered that caused system-dependent failures in our ghost setup. The problem appeared to be that when the ghosting routines were rewritten for async, the same MPI_iCommData object was used in multiple phases of communication, but no waits were issued on the send requests. Thus the send requests were being reused before they were completed. A "fix" was implemented in #2230, where waits were issued for the sends.

Proposed cleanup
This is somewhat dangerous, and I think the problem lies in the fact that we were able to reuse MPI_iCommData without any checks on the status of the requests it holds. I propose that prior to usage of an MPI_Request, we first check its status to ensure that it has been completed (using MPI_Test?). Open to better solutions.
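A minimal sketch of the "check before reuse" idea, to make the proposal concrete. Everything named here (`GuardedRequest`, `CompletionTest`, `isFree`, `start`) is hypothetical and not part of MPI_iCommData as it exists today. So that the sketch compiles without an MPI installation, the completion check is injected as a callable; in the real code it would presumably wrap `MPI_Test( &request, &flag, MPI_STATUS_IGNORE )`.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <utility>

// Hypothetical stand-in for one MPI_Request slot held by MPI_iCommData.
// The completion check is injected so the sketch does not need <mpi.h>;
// with MPI it would call MPI_Test on the underlying request.
class GuardedRequest
{
public:
  using CompletionTest = std::function< bool() >;

  // Returns true if no operation is pending on this slot.
  // Polls the completion check once if an operation is still marked active.
  bool isFree()
  {
    if( m_active && m_test() )
    {
      m_active = false; // the pending operation has finished
    }
    return !m_active;
  }

  // Claim the slot for a new nonblocking operation.
  // Throws instead of silently overwriting a request that is still in flight.
  void start( CompletionTest test )
  {
    assert( static_cast< bool >( test ) ); // a completion check is required
    if( !isFree() )
    {
      throw std::logic_error( "request reused before completion" );
    }
    m_test = std::move( test );
    m_active = true;
  }

private:
  CompletionTest m_test;
  bool m_active = false;
};
```

Whether a premature reuse should throw, abort, or fall back to a blocking wait is a design choice this sketch does not settle; the point is only that the reuse becomes detectable instead of silent.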

Also, the destructor of MPI_iCommData could test all requests to make sure that they have completed before the object is destroyed.
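The destructor check could look roughly like the sketch below. `CommRequests`, `LeakHandler`, `add`, and `numIncomplete` are hypothetical names used for illustration only, and the completion checks are again injected callables so the example builds without MPI; with MPI, the loop would likely be replaced by a single `MPI_Testall` over the stored requests.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical model of a request container, standing in for MPI_iCommData.
class CommRequests
{
public:
  using CompletionTest = std::function< bool() >;
  using LeakHandler = std::function< void( std::size_t ) >;

  // onLeak is called from the destructor with the number of unfinished
  // requests. It must not throw, because destructors are noexcept by default.
  explicit CommRequests( LeakHandler onLeak ) : m_onLeak( std::move( onLeak ) ) {}

  // Copying would duplicate ownership of the pending requests.
  CommRequests( CommRequests const & ) = delete;
  CommRequests & operator=( CommRequests const & ) = delete;

  // Register the completion check of a newly posted request.
  void add( CompletionTest test )
  {
    assert( static_cast< bool >( test ) ); // a completion check is required
    m_pending.push_back( std::move( test ) );
  }

  // Count requests whose operation has not finished yet.
  std::size_t numIncomplete() const
  {
    std::size_t count = 0;
    for( auto const & test : m_pending )
    {
      if( !test() )
      {
        ++count;
      }
    }
    return count;
  }

  // Report any request that is still in flight when the object dies.
  ~CommRequests()
  {
    std::size_t const leaked = numIncomplete();
    if( leaked > 0 && m_onLeak )
    {
      m_onLeak( leaked );
    }
  }

private:
  std::vector< CompletionTest > m_pending;
  LeakHandler m_onLeak;
};
```

Reporting through a handler is one option; logging an error or blocking until completion in the destructor would be alternatives with different trade-offs.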


@TotoGaz TotoGaz added the type: cleanup / refactor Non-functional change (NFC) label Jan 12, 2023
3 participants