Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cleanup, organization, and modularization #202

Open
Foadsf opened this issue Mar 16, 2020 · 3 comments
Open

cleanup, organization, and modularization #202

Foadsf opened this issue Mar 16, 2020 · 3 comments
Assignees

Comments

@Foadsf
Copy link
Contributor

Foadsf commented Mar 16, 2020

I took a look at Elmer's codebase, and I should say, to be frank, it is not well organized at all. It seems more of a code dump written and patched over the years without proper organization. Many of the folders don't have README files and the ones which do, sometimes do not offer proper information about the content of that folder. Plain text has also been used instead of modern MarkUp languages such as MarkDown, AsciiDoc, or reStructuredText ...

I think the codebase should be split into several different repositories under the ElmerCSC GitHub organization:

  • Folders ElmerGUI, ElmerGUIlogger, and ElmerGUItester should be in one repository called ElmerGUI and anything other than the GUI stuff should be scrapped out. I think there are some meshing modules sin the ElmerGUI which hasn't been integrated into the ElmerGrid for example, but I might be wrong on this.

  • The folder elmergrid should obviously be an independent repo named ElmerGrid responsible for meshing the input geometries in different formats.

  • The fem which is actually the ElmerSolver should be an independent repo called the same name. I think the fhutiter folder containing the HUTIter iterative solver should also be placed in the same folder.

  • The docker related files in ci and docker folders should be separated just like the HomeBrew which is independent already.

  • There are two versions of MATC one in root matc and the other at post/matc where the later is specifically for MVS compilers. This seems bad practice to me. any module should be developed in one single place and we take care of the cross-platform ness in a canonical way.

  • the post folder can be taken out and deprecated. It is important this package is maintained so the people who just start with the software can follow the existing tutorials. we can warn them whenever they open the app that they should use decent alternatives like ParaView instead.

  • the elmerice folder should also be modularized. A great portion of Elmer users have no idea what it is and might never use it, though it is funding wise a big player in the project.

  • I see that there are third party libraries such as contrib/Zoltan, umfpack, and even contrib/lua embedded into the project. This is in my humble opinion again a bad practice. We should use a decent package manager like Conan, to list all the dependencies and if any of them are not available write Cmake or even bash/batch scripts to download and locate and link against them properly.

  • we need to have a unified folder structure for any of the above repositories probably following some decent Cmake templates.

  • Other folders seem to be some snippets or macros for specific purposes (e.g., import from ANSYS) which could be other repose and partly added to he Elmer Documentation.

  • Another concerning fact is that we have so many parallel branches. This is not a good practice again as it is not clear who is doing what, where and why. Branches should be made for specific bug fixes, features, persons, and they must have a clear lifetime. They should not coexist in parallel and they must be merged to the master ASAP.

Qxb9ABXnml

  • Binary files should not exist in a repository as much as possible. I see that in some folders there are archived files .gz format or something.
  • I see sometimes original netlib BLAS and LACPACK libraries have been used as dependencies. I think they need to be replaced with OpenBLAS.
@raback
Copy link
Contributor

raback commented Mar 17, 2020

I agree in large extent on the observations. However, all reorganization is not as easy and often falls down on resources. Also one main concern on reorganization is indeed the fact that are many branches that should be merged in the future. In ideal world branches would indeed be short-lived.

For me personally this does not always happen since some feature turns out to be take more work than anticipated and is left there there for longer times. However, there are also huge development branches related to novel developments in the size of a PhD thesis. Ideally these would also be frequently merged. However, I do have certain understanding why not to merge. These are often primarily research projects on the bleeding edge rather than software development projects. Still it would of course be good to merge at least devel to the feature branch frequently. I think that if there are conflicts many researchers rather opt not to mess with their branch and rather finish the next paper etc.

Also the cmake build system causes some inertia. If you reorganize the code tree you also need to address the cmake files ect. As an afterthought it would probably have been better to make the cmake very modular. This would also have helped in the compilation issues.

The bundled codes is also an issue. For example, Zoltan, Arpack and Lua have been edited and the code is not the same. At the time when they were included there were no readily available packages in all the platforms needed. At least there is a cmake, but there might be something else too. Lapack is only used if blas libraries are not present.

All this said, it would be easy to reorganize the stuff without dependencies. Take ElmerPost and ElmerParam out, maybe reorganize some directories etc. and rewrite the Readme's. I'm hesitant to put a lot of effort to fix things that are not really broken but just look bad from today's perspective. Of course, would there be a rapture in package management it would make sense to consider many of these choices.

@ReedyBear
Copy link

I'll sum up my thoughts from a twitter thread.

  • I'm not a C developer or familiar with any of these package managers
  • I think git submodules could be a low barrier for starting to handle dependencies, if the only repo needing to declare dependencies is this one. This would require downloading the code & creating unofficial releases (such as http://glaros.dtc.umn.edu/gkhome/metis/metis/download).
  • Contacting the original package developers (such as METIS) and discussing it with them might be good.

@ReedyBear
Copy link

After catching up on this discussion, here are my thoughts:

  • Clearly define the reason for switching to the better design. There is going to be a cost of dev time. Will future development be more productive as a result? Does the overall benefit justify the cost?
    • I personally love a well-designed system and find a poorly structured repo keeps me from contributing. Even when my own code is poorly structured, I end up re-writing things because I can't figure out what's going on.
  • You raise a lot of good concerns with this issue. Perhaps most of your bullet points should be their own issues, so it can be a set of independent tasks rather than one gigantic issue. This might allow incremental improvements with more focused discussions about specific aspects.
  • I would start by getting this repo organized without changing anything dependency-wise. Just moving folders around, updating your cmake templates, and getting it to build properly.
  • I'd make a new branch devel_refactor and do this work on that, then post a notice on your main branch devel far ahead of time that you're working on a new major refactor with potentially breaking changes. Subsequently, I'd have devel_refactor_directories, devel_refactor_elmergui-dependency, etc. I like prefixes. Much easier to follow.
  • At some point, any major changes like this are going to sure to create big headaches for people, so the benefit should be very worth it. It also might be worth posting a link to these discussions on the README so others (who don't look at issues) are more likely to chime in. You might also PIN this issue, so it's at the top of your issues list. There might be a better way to carry on these discussions, too, but moving people out of Github means maybe fewer people would engage with the discussion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants