Replies: 5 comments
-
What would be the benefit of rewriting MOPAC in C? As you say, there are no longer any fundamental differences between the performance of C and Fortran, so performance considerations are not a good rationale for moving from one language to another, in either direction. Fortran 90 has pointers and the C99 standard has Fortran-array-like restricted pointers, so the basic operational capabilities of the two languages have been steadily converging. It would be a lot of work without an obvious benefit. For reference, there is a much older f2c'ed version of MOPAC available if you really want to tinker with MOPAC as C code. Also, there is a bare-bones modern C++ implementation of many of the semiempirical models (AM1, RM1, PM3, PM6) from MOPAC in Sparrow, so they too are independently C-accessible to some degree. More generally, MOPAC is a very old codebase that has largely been developed by accretion with some limited refactoring such as the addition of dynamic memory allocation and a switch from Fortran 77 to Fortran 90 coding style in the early 2000s. It was always driven by constant short-term considerations of feature deployment and usability and never really had an overarching design plan. So far, the main goal of open-source MOPAC development is to make the code easier to maintain and adapt to new hardware and to make it more accessible to people through new interfaces, over new distribution channels, and with better documentation of features and use cases. Major new features or broad redesigns/rewrites are not part of the core development plan right now. However, MOPAC is now an open-source project, so developers are welcome to contribute major new features or overhaul the code as they see fit. As long as existing functionality is maintained and contributions add some kind of value, such developments will be welcome into the main branch of the code. |
Beta Was this translation helpful? Give feedback.
-
I've moved this post from Issues to Discussions because it is not a practical short-term feature request. However, it is certainly a worthwhile topic of on-going discussion if anyone has further interest in a cost-benefit analysis of moving older software to newer programming languages and the specific example of MOPAC. |
Beta Was this translation helpful? Give feedback.
-
My 2 cents:
Summary: Instead of rewriting MOPAC in C, I see a much larger benefit to improve the integration of MOPAC in GUI programs like Avogadro or Vesta. |
Beta Was this translation helpful? Give feedback.
-
It's been a while since this topic has come up, and it's definitely something that needs to be revisited periodically. MOPAC does have a lot of frustrating spaghetti logic, and it would certainly be better if it wasn't there. I did at least remove all "arithmetic if" and "computed goto" statements from the code and have zero tolerance for them in the future. Poorly organized logic can be difficult to rewrite when it doesn't naturally map to a linear if-then-else control flow, which is often the case with MOPAC. I doubt this affects performance very much, since all of MOPAC's performance is bottlenecked by linear algebra. I do not claim to understand all of MOPAC's control flow just by looking at it, and I'm not sure if even Jimmy Stewart understands it all at this point. When I'm debugging problems, I just have to step through the program as it's running with a debugger to follow the control flow. To rewrite the control flow, a developer would need to understand it first, which isn't easy for some sections of the code. MOPAC also has serious issues with intermingling I/O and computation, but it is so deeply ingrained in the design of the code that I'm not sure it can be changed without rewriting everything. Right now, I don't think this can be even considered until testing coverage is greatly expanded, otherwise it is very likely that many features will silently break in the process. Once I finish implementing the two open pull requests and resolve a few more of the solvable open issues, my main MOPAC development activity is going to be migrating the website content of http://openmopac.net to https://openmopac.github.io. As I migrate the description of each keyword, I will finish adding tests for each keyword and feature. Only then can more significant refactoring be considered. While I have taken on MOPAC development as a professional responsibility, my main scientific interest is in developing fundamentally new semiempirical models and implementing them in an entirely new codebase, which might also be integrated with MOPAC in some capacity so as to be accessible to MOPAC's user base. This is a challenging resource management problem with maintaining old scientific software - how should development time be divided between maintaining and improving old things while developing new things? It's deeply irresponsible to let old software have its usefulness degraded by neglect, but it's also no longer science if you stop pursuing new capabilities. |
Beta Was this translation helpful? Give feedback.
-
My experience confirms your approach when disentangling spaghetties. Even with my much smaller code base and all the time going slowly and carefully, I made some logic errors, which I only found by careful testing of the routines. |
Beta Was this translation helpful? Give feedback.
-
I suggest re-writing the code solely in C instead of fortran.
Newer C standards have enabled exactly the same optimizations as Fortran has as long the coders enable them.
The point is that when you give the C function a pointer parameter,the compiler cannot know if they can point to the same object so that changing one changes another. So if the code changes an object pointed to by one pointer, the compiler can no longer assume that the value pointed to by another pointer which is loaded into the register would no longer be valid, and it needs to be re-read from RAM.
That problem particularly affects matrix computation and the like. When matrices in C are passed as pointers and without the restrict keywords, the compiler has to write virtually every intermediate result in the frame and read back when the function has more than one matrix parameter.
There are no pointers in Fortran, but reference parameters that can't be overlapped.
As said, the newer C standards allow coding so that they don't overlap each other (mainly by placing a restriction keyword in front of the pointer parameter). In other words, today's C is just as fast as Fortran.
Addition and multiplication and other mathematical functions are fast. Ram reading and writing is slow.
Beta Was this translation helpful? Give feedback.
All reactions