mpi_f08: MPI_Wait modifies MPI_ERROR field of status #13205

Closed

gcorbin opened this issue Apr 22, 2025 · 4 comments

gcorbin commented Apr 22, 2025

Background information

What version of Open MPI are you using? (e.g., v4.1.6, v5.0.1, git branch name and hash, etc.)

v5.0.5

Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)

From EasyBuild

Please describe the system on which you are running

Observed on multiple systems, e.g.

  • Operating system/version: uname -r: 5.14.0-503.26.1.el9_5.x86_64
  • Computer hardware: Single node of Jureca-DC
  • Compiler: GCC 13.3.0 (same issue with NVHPC 24.9)

Details of the problem

MPI_Wait called via use mpi_f08 modifies status%MPI_ERROR.

In this example, status_received%MPI_ERROR is initialized to MPI_SUCCESS, and it should not be modified by MPI_Wait.
But after the MPI_Wait, it is some random value:

program send_irecv_wait

use :: mpi_f08

implicit none

integer :: wrank, wsize
integer :: errs, ierr

integer, parameter :: nelem = 20
integer :: buffer(nelem)
integer, parameter :: tag = 99

type(MPI_Status) :: status_received
type(MPI_Request) :: request

call MPI_Init(ierr)
call MPI_Comm_rank(MPI_COMM_WORLD, wrank, ierr)
call MPI_Comm_size(MPI_COMM_WORLD, wsize, ierr)

if ( wsize .lt. 2 ) then
    print *, "At least 2 MPI processes are needed for this test"
    call MPI_Abort(MPI_COMM_WORLD, 1, ierr)
end if

if ( wrank .eq. 0 ) then
    buffer = 0
    status_received%MPI_ERROR = MPI_SUCCESS
    call MPI_Irecv(buffer, nelem, MPI_INTEGER, 1, tag, MPI_COMM_WORLD, request, ierr)
    call MPI_Wait(request, status_received, ierr)
    if ( status_received%MPI_ERROR .ne. MPI_SUCCESS ) then
        print '("Wrong value of MPI_ERROR: ", i0)', status_received%MPI_ERROR
    end if
else if ( wrank .eq. 1 ) then
    buffer = 1
    call MPI_Send(buffer, nelem, MPI_INTEGER, 0, tag, MPI_COMM_WORLD, ierr)
end if

call MPI_Finalize(ierr)
end program

However, the program reports

Wrong value of MPI_ERROR: 381821048

(or some other random value)

I am aware of the related issue #12049, but this is different: The error field is initialized and should not be modified.

jsquyres added this to the v5.0.8 milestone Apr 22, 2025
ggouaillardet (Contributor) commented

Under the hood, we use an uninitialized C MPI_Status, invoke the C PMPI_Wait() and then convert the C MPI_Status to Fortran.

Per @devreal's comment, quoting the standard:

In general, message-passing calls do not modify the value of the error code field of status variables. This field may be updated only by the functions in Section 3.7.5 that return multiple statuses.

My understanding is that the spirit of the standard is that the MPI_ERROR field should not be accessed in this example.
The letter of the standard could be read to mean that the MPI_ERROR field should not be modified in this example.
Assuming this is a correct interpretation, fixing this would require that, instead of using an uninitialized C MPI_Status, we first convert the input status with MPI_Status_f082c().
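
For illustration only, a rough C sketch of that idea (not the actual Open MPI source; the real mpi_f08 bindings are generated code, and the wrapper name below is made up). It assumes the MPI-4.0 C type MPI_F08_status and the conversion functions MPI_Status_f082c()/MPI_Status_c2f08():

#include <mpi.h>

/* Hypothetical wrapper; illustrates the idea only. */
void example_wait_f08(MPI_Request *request, MPI_F08_status *f08_status, int *ierr)
{
    MPI_Status c_status;

    /* Current behavior: c_status is left uninitialized here, so if
     * PMPI_Wait() does not touch MPI_ERROR, whatever garbage is in that
     * field gets copied back to the Fortran status. */

    /* Possible fix: seed c_status from the caller's Fortran status first,
     * so a pre-initialized MPI_ERROR field survives the round trip. */
    MPI_Status_f082c(f08_status, &c_status);

    *ierr = PMPI_Wait(request, &c_status);

    /* Convert the C status back to the Fortran-08 status. */
    MPI_Status_c2f08(&c_status, f08_status);
}

The extra conversion on every call is where the performance cost mentioned below would come from.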

To me, that looks like a bozo case (please correct me if I'm wrong) that is not worth fixing (since the intent of not updating/setting MPI_ERROR was for performance reasons). That's a personal opinion, and not a strong one, though.

jsquyres (Member) commented

I would tend to agree: the intent of the MPI standard to not modify MPI_ERROR was originally for performance-optimization reasons. But "fixing" that here would actually incur a performance cost (i.e., there are a few ways to solve this, but they all involve more work/CPU cycles).

@gcorbin Are you running into a problem in an application code because of this behavior?

gcorbin (Author) commented Apr 23, 2025

I noticed this in test code that does quite extensive validation, not a 'real' application. Going by the letter of the standard only, I assumed this was a bug. But since it is intentional, let's close this issue. If anything, it is something to be cleared up in the MPI standard.

gcorbin closed this as completed Apr 23, 2025
jsquyres (Member) commented

FWIW: it's not intentional, but it's not unintentional, either. 😄 I.e., we'd be open to fixing it, but it would probably take a whole bunch of effort, and the end result may not be worth it (i.e., a small corner case that has not seemed to matter for years).

Thanks for reporting, @gcorbin! We're always willing to have the conversation.
