
OpenVSP is not deterministic in spanwise distribution of parametric coordinates on blended wings. #58

Open
anilyil opened this issue Jan 18, 2021 · 7 comments

Comments

@anilyil
Contributor

anilyil commented Jan 18, 2021

Description

OpenVSP has a bug where it can give two different spanwise parametric coordinate distributions on blended wings. The code seems to "lock" onto an execution path during initialization and then stays consistent with design changes after initialization. However, we run multiple instances of OpenVSP on multiple procs, so it is highly likely that instances running on different procs do not agree on the spanwise parametric coordinates. Even though the actual differences are small, they introduce very large errors when we do parallel finite differencing. Currently, the code avoids this issue by doing the FD computations completely (base and perturbed) on the proc that perturbs the DV, instead of doing the base evaluation on the proc that owns the point and the perturbed evaluation on the proc that perturbs the DV.
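To make the failure mode concrete, here is a minimal sketch of the two FD patterns. It is purely illustrative: `fd_derivative`, `evaluate`, the design-variable dictionary, and the 1e-10 offset are made up and are not the actual DVGeo/OpenVSP code.

```python
# Illustrative only: not the actual DVGeo implementation. The two "instances"
# are modeled as a tiny constant offset between two procs' surface evaluations.
def fd_derivative(evaluate, base_proc, pert_proc, dvs, name, h=1e-6):
    """Finite difference of a scalar output w.r.t. one design variable.

    base_proc != pert_proc reproduces the broken pattern (base and perturbed
    values come from different OpenVSP instances); passing the same proc
    twice is the workaround described above.
    """
    f_base = evaluate(base_proc, dvs)
    dvs_pert = dict(dvs)
    dvs_pert[name] += h
    f_pert = evaluate(pert_proc, dvs_pert)
    return (f_pert - f_base) / h

# Toy model: two instances whose evaluations disagree by 1e-10.
offsets = {0: 0.0, 1: 1e-10}

def evaluate(proc, dvs):
    return dvs["span"] ** 2 + offsets[proc]   # stand-in for a surface query

dvs = {"span": 3.0}
print(fd_derivative(evaluate, 0, 1, dvs, "span"))  # ~6.0001   (polluted by the mismatch / h)
print(fd_derivative(evaluate, 1, 1, dvs, "span"))  # ~6.000001 (mismatch cancels)
```

The point is that an instance-to-instance mismatch of size eps shows up in the derivative as eps/h, so even a 1e-10 disagreement becomes a 1e-4 error with a 1e-6 step.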

OpenVSP version: at least 3.21.1

I have a test script for this bug. I will check with all versions of OpenVSP and try to see if they ended up fixing it in a recent release. If not, I will contact the dev team (again) and point them towards the issue.

@ramcdona

ramcdona commented Apr 2, 2022

I vaguely recall this being brought to my attention many years ago -- but at the time, it could not be duplicated and there was no MWE that could demonstrate the bug. I had essentially nothing to go on to chase this bug.

If you have a minimal test case that can replicate this issue on a recent version, I'm happy to spend some time trying to run this down.

@anilyil
Contributor Author

anilyil commented Apr 4, 2022

Hi @ramcdona! Thanks for following up on this issue. It has been introducing minor errors in our derivatives; they were large enough for us to notice, but small enough that the students (myself included) who needed to graduate never tried to address it directly.

Now that you followed up here, I dusted off the test I created during the 2019 workshop and ran it against different OpenVSP versions. It looks like 3.27.1 actually fixed it! My heart tells me to just stick with version 4.1.22, but I am not sure the project managers would like that. Looks like 4.1.22 would have definitely fixed it for good.

Jokes aside, I can recreate the issue up to 3.26.1 on Ubuntu 20.04 using gcc 7.5 and Python 3.7.7. With the same setup, I could not recreate the issue on 3.27.1. Looking at the extent of the changelog for 3.27, I am not too surprised that this is fixed. I did not even try 3.27.0 after seeing the changelog for 3.27.1.

I will follow up with you offline and get you the test I have. We can also use our public Docker images to replicate the issue on older VSP versions, and if you prefer, we can keep this test in place going forward. Once we are sure the issue is fixed, I can come back and close this issue and #70. Thanks again for following up!

@ramcdona

ramcdona commented Apr 4, 2022

You are correct, 3.27.1 only fixes a GUI issue and will be identical to 3.27.0 from the API.

Looking at the changes between 3.26.1 and 3.27, I don't really see anything that should have influenced this. Most of the things that I thought might have helped should have been in 3.26.1.

I won't look a gift horse in the mouth -- but I also don't quite trust it. Let's keep the test around for a while, and I would appreciate you sending me a test case if you can.

Just think with 4.1.22 you won't even need sensitivities or an optimizer to find the solution!

@joanibal
Collaborator

joanibal commented May 16, 2024

Has this been replicated on Pleiades, which uses TOSS4?
I've been having derivative issues with my MPhys model and I'm wondering if this is the source.
I'm using v3.33.0 of OpenVSP but have been unable to recreate the issue using test_2, as mentioned in #70.

Are there any workarounds, such as always using serial FD for the derivatives?

@ramcdona

I have no idea if folks are running on Pleiades.

This original bug was reported before VSPAERO had an adjoint capability. If you're using the adjoint capability, we can be pretty sure you're seeing something entirely different.

I suggest you start a new issue report and describe the problem you're seeing and the symptoms you have. Is it a VSPAERO or OpenVSP thing, etc.

OpenVSP does not have any native derivative support. However, all parameters are subject to a minimum perturbation: changes smaller than that don't actually force an update -- a recalculation of the surface.

@joanibal
Collaborator

Hi @ramcdona ,

Thanks for your quick response. I'm not using VSPAERO, and I suspect that my issues may be related to how DVGeo uses parallel FD to compute the derivatives of OpenVSP. On that note, what is the minimum perturbation needed to force an update? Is this something we can disable when calling OpenVSP from Python?

I should have directed my comment regarding Pleiades to @anilyil , @lamkina , and @ArshSaja since I know that they have worked on similar applications of OpenVSP to CFD-based shape optimization.

@ramcdona

I checked into the tolerance below which a change does not force an update -- it is set at DBL_EPSILON, so it really shouldn't be the problem. I thought it was bigger, but I just checked and it has been DBL_EPSILON since 2.9.0 (the alpha version of 3.0.0) back in 2015 or so.
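As a rough illustration (this mimics an absolute tolerance at DBL_EPSILON; the actual check lives in the C++ parameter code and is not reproduced verbatim here):

```python
import sys

DBL_EPSILON = sys.float_info.epsilon  # ~2.22e-16

def set_parm(current, requested):
    """Toy stand-in for a parameter setter with a DBL_EPSILON guard."""
    if abs(requested - current) <= DBL_EPSILON:
        return current, False   # change too small: no surface recalculation
    return requested, True      # change registered: surface gets updated

print(set_parm(1.0, 1.0))         # (1.0, False) -- no change, no update
print(set_parm(1.0, 1.0 + 1e-8))  # (1.00000001, True) -- a typical FD step clears the guard
```

Any finite-difference step you would realistically use (say 1e-6 to 1e-8 on an O(1) parameter) is many orders of magnitude above that guard.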

If you are using OpenVSP in parallel from the Python API, then there could be issues. These are somewhat fundamental to the way Python is designed...

When Python loads a module, that module is loaded once in memory and is shared across anything running from that Python process. So, when you import numpy in one *.py file, you are using the same numpy in some far away *.py file.

This works fine when the modules don't have any state. When you ask numpy to invert a matrix, that matrix is stored on the Python side, it is transferred to numpy, inverted in Fortran code, and the result is transferred back to the Python code.

However, this is not how OpenVSP works. Instead, OpenVSP maintains the state -- the airplane model is stored in memory on the OpenVSP / C++ side of the fence. The model can be large, so this allows only small amounts of information to be passed back and forth to Python. It also makes sense because OpenVSP was designed as an application first - not a library.

That said, imagine if NumPy kept the matrix in memory. You load a large matrix in one place -- and now every other place that has imported numpy is working on that same matrix. Your distant, unrelated code is implicitly operating on it -- unless you tell it to flush it and load another. But then your earlier code path is working with this second matrix. Chaos ensues.
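In OpenVSP terms, the pitfall looks something like this (a sketch only -- the two modules are collapsed into one file here, and the API calls AddGeom, FindGeoms, and ClearVSPModel are quoted from memory):

```python
# --- module_a.py (conceptually) ---
import openvsp as vsp

def build_wing():
    vsp.AddGeom("WING")          # adds a wing to the single in-memory model

# --- module_b.py (conceptually) ---
import openvsp as vsp            # same module object, same C++ model state

def count_geoms():
    return len(vsp.FindGeoms())

# --- driver ---
build_wing()
print(count_geoms())   # 1 -- module_b sees module_a's wing
vsp.ClearVSPModel()    # clearing the model here also clears it for module_a
print(count_geoms())   # 0
```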

We have a fix for this -- we designed the fix for a different use case (so it isn't tested for this), but it should address this problem too.

When you load the OpenVSP API, we have the ability to spawn a new Python process and load OpenVSP in that new process. We then talk to that process via sockets.

We can use this to load multiple OpenVSP models in multiple Python processes -- then each one can be queried independently. You can have one model of an airplane - and another detailed model of an engine - and a complex OpenMDAO workflow (that normally all executes on a single Python process) can now work.
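Just to illustrate the one-model-per-process idea (this is plain multiprocessing, not the actual socket facade described above; the file names are made up):

```python
import multiprocessing as mp

def _count_geoms(path, out_q):
    import openvsp as vsp        # imported inside the child process: private model state
    vsp.ReadVSPFile(path)        # this model lives only in this process
    out_q.put(len(vsp.FindGeoms()))

def count_geoms_in_own_process(vsp_file):
    q = mp.Queue()
    p = mp.Process(target=_count_geoms, args=(vsp_file, q))
    p.start()
    n = q.get()
    p.join()
    return n

if __name__ == "__main__":
    # Each call gets its own OpenVSP instance; the two models never interact.
    print(count_geoms_in_own_process("airplane.vsp3"))
    print(count_geoms_in_own_process("engine.vsp3"))
```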

If you think this would help your issues, I can help show you how to put this together (though I must warn you, I'm terrible at Python).

All that said, my first suggestion would be to not do the OpenVSP work in parallel. We're usually so much faster than whatever else is going on that you might not actually need the OpenVSP parts to be in parallel at all.
