Rewriting the code in C instead of using fortran #43

flatstik · 2022-05-11T07:46:11Z

I suggest rewriting the code entirely in C instead of Fortran.

Newer C standards enable exactly the same optimizations as Fortran, as long as the coders opt in to them.
The point is that when you give a C function several pointer parameters, the compiler cannot know whether they point to the same object, so that changing one changes the other. So if the code modifies an object through one pointer, the compiler can no longer assume that a value it loaded into a register through another pointer is still valid, and the value has to be re-read from RAM.

That problem particularly affects matrix computation and the like. When matrices in C are passed as pointers without the restrict keyword and the function has more than one matrix parameter, the compiler has to write virtually every intermediate result back to memory and read it back again.

Fortran does not pass arguments as C-style pointers; it passes them by reference, and the compiler is allowed to assume that they do not overlap.
As said, the newer C standards (C99 and later) let you declare that pointer parameters don't overlap each other, mainly by adding the restrict qualifier to the pointer parameter. In other words, today's C can be just as fast as Fortran.
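
To make the aliasing point concrete, here is a minimal sketch of a matrix-vector product written twice. The function names are made up for illustration and are not taken from MOPAC. The two bodies are identical and only the qualifiers differ. Whether a given compiler actually generates faster code for the second version depends on the compiler and its optimization flags.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example, not MOPAC code.
   Without restrict: y might overlap a or x, so after every store to
   y[i] the compiler has to assume that a[] and x[] may have changed. */
void matvec_may_alias(size_t n, const double *a, const double *x, double *y)
{
    for (size_t i = 0; i < n; i++) {
        y[i] = 0.0;
        for (size_t j = 0; j < n; j++)
            y[i] += a[i * n + j] * x[j];
    }
}

/* With restrict (C99): the caller promises that a, x and y do not
   overlap, so the compiler is permitted to keep y[i] in a register
   during the inner loop and to vectorize it. Calling this function
   with overlapping arrays is undefined behavior. */
void matvec_restrict(size_t n, const double *restrict a,
                     const double *restrict x, double *restrict y)
{
    assert(a != NULL && x != NULL && y != NULL);
    for (size_t i = 0; i < n; i++) {
        y[i] = 0.0;
        for (size_t j = 0; j < n; j++)
            y[i] += a[i * n + j] * x[j];
    }
}
```

Both functions give the same result for non-overlapping arrays. The restrict version only hands the responsibility for avoiding overlap to the caller, which is the same assumption a Fortran compiler makes about dummy arguments.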

Addition, multiplication, and other mathematical operations are fast. Reading from and writing to RAM is slow.

godotalgorithm (Maintainer) · 2022-05-11T20:27:34Z

What would be the benefit of rewriting MOPAC in C? As you say, there are no longer any fundamental differences between the performance of C and Fortran, so performance considerations are not a good rationale for moving from one language to another, in either direction. Fortran 90 has pointers and the C99 standard has Fortran-array-like restricted pointers, so the basic operational capabilities of the two languages have been steadily converging. It would be a lot of work without an obvious benefit.

For reference, there is a much older f2c'ed version of MOPAC available if you really want to tinker with MOPAC as C code. Also, there is a bare-bones modern C++ implementation of many of the semiempirical models (AM1, RM1, PM3, PM6) from MOPAC in Sparrow, so they too are independently C-accessible to some degree.

More generally, MOPAC is a very old codebase that has largely been developed by accretion with some limited refactoring such as the addition of dynamic memory allocation and a switch from Fortran 77 to Fortran 90 coding style in the early 2000s. It was always driven by constant short-term considerations of feature deployment and usability and never really had an overarching design plan. So far, the main goal of open-source MOPAC development is to make the code easier to maintain and adapt to new hardware and to make it more accessible to people through new interfaces, over new distribution channels, and with better documentation of features and use cases. Major new features or broad redesigns/rewrites are not part of the core development plan right now. However, MOPAC is now an open-source project, so developers are welcome to contribute major new features or overhaul the code as they see fit. As long as existing functionality is maintained and contributions add some kind of value, such developments will be welcome into the main branch of the code.

godotalgorithm (Maintainer) · 2022-05-15T22:40:34Z

I've moved this post from Issues to Discussions because it is not a practical short-term feature request. However, it is certainly a worthwhile topic of ongoing discussion if anyone has further interest in a cost-benefit analysis of moving older software to newer programming languages and the specific example of MOPAC.

kamischi · 2023-06-15T06:21:32Z

My 2 cents:
I worked on a much smaller legacy code in Fortran and found the following approach practical.

Replace all legacy logic constructs (if, do, goto, ...) with modern-style if-then-else blocks and do loops.

At least in some cases the resulting code was faster. I assume that the compiler can optimize modern style code better than old school spaghetti code.
It also lowers the entry barrier for younger coders, although this has not happened so far :-(
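
As a rough illustration of this kind of rewrite, here is the same small loop written once with jumps and once with structured control flow. It is a C analogue rather than Fortran, and the function names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example. Legacy style: the loop and the skip
   condition are both expressed with labels and jumps. */
double sum_positive_goto(const double *v, size_t n)
{
    assert(v != NULL || n == 0);
    double sum = 0.0;
    size_t i = 0;
loop:
    if (i >= n) goto done;
    if (v[i] <= 0.0) goto next;
    sum += v[i];
next:
    i++;
    goto loop;
done:
    return sum;
}

/* Structured style: same behavior, but the loop bounds and the
   condition are visible at a glance. */
double sum_positive_structured(const double *v, size_t n)
{
    assert(v != NULL || n == 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (v[i] > 0.0)
            sum += v[i];
    }
    return sum;
}
```

Keeping both versions around for a while and comparing their results on the same inputs is a cheap way to catch logic errors introduced during such a rewrite.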

Strictly separate I/O from the calculations by moving the calculations into a subroutine that receives all parameters and data after the input has been read and returns its results before any output is written.

Major bonus: the new subroutine(s) can be called from other languages, which are easier to use for GUIs than Fortran or are preferred for any other reason.
Calling from Python actually happened and was used "in production".
Creating a GUI with Delphi/Lazarus was started and worked. However, using a GUI was not really asked for by the small number of "customers".
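
The separation described above could look roughly like the following sketch. It is written in C with invented names (compute_mean, run_from_stream) and a deliberately trivial calculation, so it only shows the shape of the split, not anything from the actual code discussed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical computational kernel: takes all of its data as
   arguments and does no file or console I/O, so it can be called
   from a GUI or from another language through a C interface.
   Returns 0 on success and -1 on invalid arguments. */
int compute_mean(const double *data, size_t n, double *mean_out)
{
    if (data == NULL || mean_out == NULL || n == 0)
        return -1;
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += data[i];
    *mean_out = sum / (double)n;
    return 0;
}

/* Hypothetical thin I/O layer: reads up to 256 numbers, calls the
   kernel once, and writes the result. All parsing and printing
   stays in this function. */
int run_from_stream(FILE *in, FILE *out)
{
    assert(in != NULL && out != NULL);
    double buf[256];
    size_t n = 0;
    while (n < 256 && fscanf(in, "%lf", &buf[n]) == 1)
        n++;
    double mean = 0.0;
    if (compute_mean(buf, n, &mean) != 0) {
        fprintf(out, "error: no input data\n");
        return 1;
    }
    fprintf(out, "mean = %g\n", mean);
    return 0;
}
```

A command-line program would call run_from_stream(stdin, stdout), while a GUI or a script would skip the I/O layer and call compute_mean directly with data it already holds in memory.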

Summary: Instead of rewriting MOPAC in C, I see a much larger benefit in improving the integration of MOPAC with GUI programs like Avogadro or Vesta.

godotalgorithm (Maintainer) · 2023-06-15T16:10:07Z

It's been a while since this topic has come up, and it's definitely something that needs to be revisited periodically.

MOPAC does have a lot of frustrating spaghetti logic, and it would certainly be better if it wasn't there. I did at least remove all "arithmetic if" and "computed goto" statements from the code and have zero tolerance for them in the future. Poorly organized logic can be difficult to rewrite when it doesn't naturally map to a linear if-then-else control flow, which is often the case with MOPAC. I doubt this affects performance very much, since all of MOPAC's performance is bottlenecked by linear algebra. I do not claim to understand all of MOPAC's control flow just by looking at it, and I'm not sure if even Jimmy Stewart understands it all at this point. When I'm debugging problems, I just have to step through the program as it's running with a debugger to follow the control flow. To rewrite the control flow, a developer would need to understand it first, which isn't easy for some sections of the code.

MOPAC also has serious issues with intermingling I/O and computation, but it is so deeply ingrained in the design of the code that I'm not sure it can be changed without rewriting everything. Right now, I don't think this can even be considered until testing coverage is greatly expanded; otherwise, it is very likely that many features will silently break in the process. Once I finish implementing the two open pull requests and resolve a few more of the solvable open issues, my main MOPAC development activity is going to be migrating the website content of http://openmopac.net to https://openmopac.github.io. As I migrate the description of each keyword, I will finish adding tests for each keyword and feature. Only then can more significant refactoring be considered.

While I have taken on MOPAC development as a professional responsibility, my main scientific interest is in developing fundamentally new semiempirical models and implementing them in an entirely new codebase, which might also be integrated with MOPAC in some capacity so as to be accessible to MOPAC's user base. This is a challenging resource management problem with maintaining old scientific software - how should development time be divided between maintaining and improving old things while developing new things? It's deeply irresponsible to let old software have its usefulness degraded by neglect, but it's also no longer science if you stop pursuing new capabilities.

kamischi · 2023-06-15T20:56:57Z

My experience confirms your approach to disentangling spaghetti code. Even with my much smaller code base, and even though I worked slowly and carefully the whole time, I made some logic errors, which I only found by careful testing of the routines.
