add MPI_ERR_ERRHANDLER error class #13342
Conversation
LGTM
We may need to tweak the mpi4py test; it appears it doesn't handle being stuck between MPI 3.1 and MPI 4.1 compliance.
I think we'll switch to using an exclude file for the mpi4py pytest run. The problem that caused us to add -x test_doc has been fixed, so we can safely replace it with test_exceptions for now, but it appears that the …
related to open-mpi#12082. Also some fixes, e.g. the MPI_Errors.3 man page wasn't getting installed. mpi4py: disable testing of MPI_ERR_ERRHANDLER; we can re-enable this once we actually support MPI 4.1. Signed-off-by: Howard Pritchard <[email protected]>
force-pushed from 29b5b70 to a5aa4fd
@edgargabriel could you recheck this?
@hppritcha You should have notified me about this mpi4py failure! PS: This PR slightly broke the rules.
Something to ponder: it is unlikely that OMPI will jump from one MPI level to another in a single step. More commonly, support for the next level will be introduced in a one-step-per-PR fashion, which means OMPI will be in some in-between state with respect to the MPI standard. OMPI probably won't update MPI_VERSION until everything has been completed and it is fully compliant with the next version of the standard. So if you are basing tests on the MPI version, you may continually hit these problems. Not sure of a solution, but thought it might be worth pondering how to handle these "hybrid" situations. I'm not sure how OMPI handles it in releases either, since a release may be in an in-between state as well. @hppritcha @jsquyres Any thoughts on how you've done that in the past that might help steer @dalcinl?
@dalcinl I did not want to bother you with this because we are going to keep "breaking the rules" as we stage in 4.1 functionality without updating our VERSION file. This is probably going to go on for the next several months.
If we aim to give users detailed insights into new MPI features before reaching a major MPI milestone, we could introduce a …
Intriguing idea! What we did for PMIx was to modify the version string by adding another element to it. In our case, the standard only has two fields (e.g., 3.1), so we added a third to represent a step above that version. So when we release something beyond (for example) 3.1, we mark it as 3.1.1. If a later release adds some more features but still doesn't fully meet the next standard version, then we increment it to 3.1.2. You still have to read the release notes to see which features from the next version were included, but at least it indicates that we added something.

Any thoughts on how to deal with the automated testing problem? You can't just generate an error because a test returned an unexpected value that is in fact defined in a standard version above the one you expected. Like I said, moving from one standard version to the next is not a binary step, so elements of the next version are slowly going to make their way into the repo and into releases over some period of time.
Guys, let's not overreact and over-engineer the feature addition process. I'm totally fine with dealing with regressions that are for the advancement of the MPI standard and the Open MPI project. This particular regression is a bit special: it is an API addition (a new error class constant) that triggered a change in runtime behavior (that error class now being returned from a previously existing API). My only gentle request is that you keep me in the loop; if you have an mpi4py-related problem, just "@"-mention me in any PR and I'll help sort things out from mpi4py's side. I would rather update mpi4py and its test suite ASAP than have you maintain test exclusion lists that may get outdated and eventually hide actual bugs.
We can't control ourselves.
I think the concern wasn't so much about how to deal with this specific issue, but rather:

(a) How do we communicate to users the actual state of MPI compliance of an OMPI release? It would be an anomaly for OMPI to release in a "pure" compliant state; it will almost always be either "nearly" compliant with some MPI standard version, or "fully" compliant with a version plus a few additions from the next one. How does a user write an application that can deal with this situation? What can/should we provide to help them?

(b) How does someone like yourself handle this "hybrid" situation when writing an automated CI or tester? It's nice if one can test for non-compliant behavior or return values, but how do you differentiate an actual error from something that actually complies with an MPI standard version beyond the base one supported by the OMPI version under test? Using it for CI means we have to avoid false error reports, but trying to add all the logic needed to track change-by-change movement seems unrealistic. Likewise, simply deleting problem tests creates, as you note, potential issues down the road.

I can see ways to deal with (a), and George's suggestion was in that realm; resolving (b) seems a tad tougher, as you don't want to do something that can lead to missing errors down the road, or something people forget to update as things evolve. Minimal maintenance on all sides would be desirable.
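One low-maintenance pattern for (b), sketched below for a C-based tester, is to key the expectation on whether the implementation actually defines the new constant and to accept both the old and the new error class while the library is in a hybrid state. This is an illustration, not something either project has adopted: it assumes error classes are preprocessor macros (as in Open MPI's mpi.h), that the pre-4.1 class for an invalid errhandler argument is MPI_ERR_ARG, and the helper name check_errhandler_error is made up for the example.

```c
#include <assert.h>
#include <mpi.h>

/* Hypothetical test helper: rc comes from an erroneous call that passed an
 * invalid error handler.  While an implementation sits between MPI 3.1 and
 * MPI 4.1 compliance, either the old class (MPI_ERR_ARG) or the new one
 * (MPI_ERR_ERRHANDLER) is a plausible outcome, so the test accepts both. */
static void check_errhandler_error(int rc)
{
    int err_class;
    assert(MPI_SUCCESS != rc);
    MPI_Error_class(rc, &err_class);
#if defined(MPI_ERR_ERRHANDLER)
    /* mpi.h already exposes the MPI 4.1 constant: accept either behavior. */
    assert(MPI_ERR_ARG == err_class || MPI_ERR_ERRHANDLER == err_class);
#else
    /* Pre-4.1 mpi.h: only the old class can be expected. */
    assert(MPI_ERR_ARG == err_class);
#endif
}
```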
@rhc54 Intel MPI has a way to handle this issue. Look for …
related to #12082
Also some fixes, e.g. the MPI_Errors.3 man page wasn't getting installed.
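For reference, a minimal sketch (not part of this PR) of how an application could observe the new error class at runtime. It assumes that passing MPI_ERRHANDLER_NULL to MPI_Comm_set_errhandler is the kind of erroneous errhandler argument that is now reported as MPI_ERR_ERRHANDLER rather than MPI_ERR_ARG; the exact calls affected aren't spelled out in this thread, so treat this as illustrative only. The preprocessor guard lets it compile against an mpi.h that predates the constant.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rc, err_class;

    MPI_Init(&argc, &argv);

    /* Return error codes instead of aborting, so the erroneous call below
     * can be inspected. */
    MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);

    /* Erroneous: MPI_ERRHANDLER_NULL is not a valid handler to attach. */
    rc = MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRHANDLER_NULL);
    if (MPI_SUCCESS != rc) {
        MPI_Error_class(rc, &err_class);
#if defined(MPI_ERR_ERRHANDLER)
        if (MPI_ERR_ERRHANDLER == err_class) {
            printf("got MPI_ERR_ERRHANDLER (MPI 4.1 semantics)\n");
        } else
#endif
        {
            printf("got error class %d (pre-4.1 semantics, e.g. MPI_ERR_ARG)\n",
                   err_class);
        }
    }

    MPI_Finalize();
    return 0;
}
```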