harden async communication routines  #2232

Open
@rrsettgast

Describe the issue
In #2173, a bug was discovered that caused system-dependent failures in our ghost setup. The problem appeared to be that when the ghosting routines were rewritten for async, the same MPI_iCommData object was used in multiple phases of communication, but no waits were issued on the send requests. Thus the send requests were being reused before they were completed. A "fix" was implemented in #2230 where waits were issued for the sends.

Proposed cleanup
This is somewhat dangerous, and I think the problem lies in the fact that we were able to reuse MPI_iCommData without any checks on the status of the requests it holds. I propose that prior to usage of an MPI_Request, we first check its status to ensure that the previous operation has completed (using MPI_Test?). Open to better solutions.

Also, the destructor of MPI_iCommData could test all requests to make sure that they have completed before the object is destroyed.
