Describe the issue
In #2173, a bug was discovered that caused system-dependent failures in our ghost setup. The problem appeared to be that when the ghosting routines were rewritten to be asynchronous, the same MPI_iCommData object was used in multiple phases of communication, but no waits were issued on the send requests. Thus the send requests were being reused before they were completed. A "fix" was implemented in #2230 where waits were issued for the sends.
Proposed cleanup
This is somewhat dangerous, and I think the problem lies in the fact that we were able to reuse MPI_iCommData without any checks on the status of the requests it holds. I propose that prior to reusing an MPI_Request, we first check that it has completed (using MPI_Test?). Open to better solutions.
Also, the destructor of MPI_iCommData could test all requests to make sure they have completed before the object is destroyed.