Description
Describe the issue
In #2173, a bug was discovered that caused system-dependent failures in our ghost setup. The problem appeared to be that when the ghosting routines were rewritten for async, the same MPI_iCommData object was used in multiple phases of communication, but no waits were issued on the send requests. Thus the send requests were being reused before they were completed. A "fix" was implemented in #2230, where waits were issued for the sends.
Proposed cleanup
This is somewhat dangerous, and I think the problem lies in the fact that we were able to reuse MPI_iCommData without any checks on the status of the requests it holds. I propose that prior to usage of an MPI_Request, we first check its status to ensure that it has completed (using MPI_Test?). Open to better solutions.
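A minimal sketch of the check-before-reuse idea, offered as one possible shape rather than a finished design. The class name GuardedRequests and its acquire method are hypothetical and are not part of the existing MPI_iCommData. The request type and the completion test are template parameters so the sketch compiles without an MPI installation; the comment shows the MPI_Test-based test one would plug in.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical guard: refuses to hand out a request slot whose previous
// nonblocking operation is still in flight.
//
// With MPI one would use Request = MPI_Request, nullValue = MPI_REQUEST_NULL,
// and a completion test along these lines (assumption, not existing code):
//
//   auto isComplete = []( MPI_Request & r )
//   {
//     int flag = 0;
//     // MPI_Test returns flag = true for MPI_REQUEST_NULL. On completion it
//     // frees the request and resets the handle to MPI_REQUEST_NULL.
//     MPI_Test( &r, &flag, MPI_STATUS_IGNORE );
//     return flag != 0;
//   };
template< typename Request, typename IsComplete >
class GuardedRequests
{
public:
  GuardedRequests( std::size_t n, Request nullValue, IsComplete isComplete )
    : m_requests( n, nullValue ),
      m_isComplete( isComplete )
  {}

  // Return slot i for use in a new nonblocking call, or throw if the
  // operation previously posted on that slot has not completed.
  Request & acquire( std::size_t i )
  {
    assert( i < m_requests.size() );
    Request & r = m_requests[ i ];
    if( !m_isComplete( r ) )
    {
      throw std::logic_error( "request " + std::to_string( i ) +
                              " reused before completion" );
    }
    return r;
  }

private:
  std::vector< Request > m_requests;
  IsComplete m_isComplete;
};
```

Throwing makes premature reuse fail loudly instead of depending on the system. Whether to throw, wait, or only log on a pending request is a design choice left open here.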
Also, the destructor of MPI_iCommData could test all requests to make sure that they have completed before the object is destroyed.
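The destructor check could look roughly like the sketch below. RequestOwner is a hypothetical name, and the completion test and wait are passed in as callables so the example builds without MPI; with MPI they would presumably wrap MPI_Test and MPI_Wait as noted in the comment. Because destructors should not throw, this version warns and then waits on any pending request.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical owner of a set of requests that verifies, on destruction,
// that nothing is still in flight.
//
// With MPI (assumption, not existing code):
//   isComplete: int flag = 0; MPI_Test( &r, &flag, MPI_STATUS_IGNORE ); return flag != 0;
//   wait:       MPI_Wait( &r, MPI_STATUS_IGNORE );
// The object would also have to be destroyed before MPI_Finalize is called.
template< typename Request, typename IsComplete, typename Wait >
class RequestOwner
{
public:
  RequestOwner( std::size_t n, Request nullValue, IsComplete isComplete, Wait wait )
    : m_requests( n, nullValue ),
      m_isComplete( isComplete ),
      m_wait( wait )
  {}

  // Copying would make two owners check the same requests.
  RequestOwner( RequestOwner const & ) = delete;
  RequestOwner & operator=( RequestOwner const & ) = delete;

  ~RequestOwner()
  {
    for( std::size_t i = 0; i < m_requests.size(); ++i )
    {
      if( !m_isComplete( m_requests[ i ] ) )
      {
        // Report the misuse, then block until the operation finishes.
        std::fprintf( stderr, "warning: request %zu still pending at destruction; waiting\n", i );
        m_wait( m_requests[ i ] );
      }
    }
  }

  Request & operator[]( std::size_t i )
  {
    assert( i < m_requests.size() );
    return m_requests[ i ];
  }

private:
  std::vector< Request > m_requests;
  IsComplete m_isComplete;
  Wait m_wait;
};
```

Waiting in the destructor keeps buffers valid until the communication finishes, at the cost of a possible hang if the matching receive is never posted. An abort would be the stricter alternative.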