|
| 1 | +--- |
| 2 | +layout: post |
| 3 | +title: "The original sin of semiempirical electronic structure" |
| 4 | +categories: research |
| 5 | +--- |
| 6 | + |
| 7 | +Error analysis is an important aspect of my research. |
| 8 | +First and foremost, I strongly believe in introspection and self-criticism as important traits of a scientist. |
| 9 | +I don't think scientists should rely on others to provide the most useful criticisms of their work - |
| 10 | + they should be able to identify flaws in their own work and make appropriate adjustments. |
| 11 | +Whatever mistakes you fail to identify are likely to persist in some form, possibly limiting the efficacy of your work |
| 12 | + in some subtle way or perhaps being identified by competitors long after the fact to be used against you. |
| 13 | +I also have no shortage of research ideas, so I end up in cyclic process of coming up with ideas, |
| 14 | + critically assessing them until they (usually) fall apart, learn some valuable lesson, |
| 15 | + and then proceed to churn out more ideas informed by such lessons. |
| 16 | +This does tend to waste a lot of time, and it has probably been detrimental to my research career so far |
| 17 | + (if anything, this kind of behavior is more an investment in the future than a prudent short-term strategy). |
| 18 | +Also, when I begin to work in the vicinity of well-established ideas and results, |
| 19 | + I will often target my engine of criticism onto the work of others, |
| 20 | + so that I can put their work through sufficient criticism for me to accept or trust it in some form. |
| 21 | +Too often, scientists take criticism of their work as a personal insult, but I try my best to be polite about it. |
| 22 | + |
| 23 | +As per my last blog post, I've begun to build some very simple semiempirical electronic structure models |
| 24 | + to gain some hands-on experience (and to co-opt other work responsibilities). |
| 25 | +I've already done a lot of background research in this area, so I was prepared to put the problems that I encountered |
| 26 | + into an informed historic perspective. |
| 27 | +I put together a model inspired by the popular modified neglect of diatomic overlap (MNDO) approximations, |
| 28 | + and I encountered one of the basic, well-known problems that result from them: a lack of repulsion between closed-shell atoms/molecules. |
| 29 | +I avoided using the historic correction of a short-ranged repulsive interatomic potential because |
| 30 | + I was trying to show how a simple quantum-mechanical model of noble gases produces a Lennard-Jones-like potential |
| 31 | + for the "correct" physical reasons. |
| 32 | +I was admitting failure if ended up with a model that already contained the repulsive half of a Lennard-Jones potential. |
| 33 | +Instead, I put in an extra short-ranged repulsive electron-ion potential inspired by pseudopotentials, |
| 34 | + which achieved the same outcome but at least appeared to be "more quantum". |
| 35 | +However, I didn't have time for enough analysis to really understand the root causes of this problem, |
| 36 | + as there are multiple possible culprits. |
| 37 | +There is a [thorough recent assessment](https://doi.org/10.1021/acs.jctc.8b00601) |
| 38 | + on the neglect of differential diatomic overlap (NDDO) approximation |
| 39 | + of the 2-body Coulomb integrals in MNDO-based models, |
| 40 | + while the [main effort](https://doi.org/10.1021/acs.jctc.5b01046) to repair the underlying MNDO flaws |
| 41 | + has focused on the effects of atomic-orbital orthogonalization. |
| 42 | +If I want to build new semiempirical models and avoid past mistakes, |
| 43 | + then I need to understand and the leading sources of error and their relative importance. |
| 44 | + |
| 45 | +Since I just learned how to use Jupyter notebooks and had a positive experience with them, |
| 46 | + I decided to [carry out my MNDO analysis in a Jupyter notebook](/assets/2019-08-06-MNDO-He-dimer.ipynb). |
| 47 | +In retrospect, there was enough of an emphasis on analytical manipulations (that I did by hand) |
| 48 | + over numerical computations that it would have probably been easier to use a Mathematica notebook here. |
| 49 | +However, most of my work is heavily numerical, so I'll probably stick with Jupyter notebooks and the conveniences of NumPy. |
| 50 | +The simple example that I used for this analysis was a Hartree-Fock calculation of a helium dimer |
| 51 | + with a single-Gaussian atomic orbital basis for each atom. |
| 52 | +This is of course a very crude description of atomic effects, |
| 53 | + but it was able to preserve interatomic effects reasonably well |
| 54 | + while enabling the entire calculation to be done by hand without too much effort. |
| 55 | +I was a little bit surprised by the outcome at first, but it wasn't so surprising in hindsight. |
| 56 | +In the absence of covalent bonding, NDDO errors were small and irrelevant. |
| 57 | +The problem was specifically that the on-site 1-body matrix element (excluding the electron-ion Coulomb interaction) |
| 58 | + was a constant and did not depend on interatomic separation. |
| 59 | +This is appropriate for a non-orthogonal basis, |
| 60 | + but the orthogonalization process induces an effective short-ranged repulsive electron-ion interaction |
| 61 | + that overwhelms the long-ranged attractive Coulomb interaction when the orbitals of two atoms begin to overlap appreciably. |
| 62 | +By introducing this repulsion as an interatomic potential rather than an electron-ion potential, |
| 63 | + MNDO-based models suppressed both the occupation-dependence of this energy contribution |
| 64 | + and its effect on the electronic structure. |
| 65 | +This is a sub-optimal design choice that in effect makes an additional approximation of a fixed average electronic occupation of atoms |
| 66 | + in some parts of the model, thus causing problems for atoms that can occur in multiple charge states. |
| 67 | +However, it doesn't cause any catastrophic problems, and the semiempirical modeling process is flexible enough |
| 68 | + to smooth over the effects of various approximations. |
| 69 | +Unfortunately, this limits the transferability of models between different atomic charge states |
| 70 | + and exhausts more of the model parameters in correcting avoidable errors. |
| 71 | + |
| 72 | +This ambiguity between ion-ion and electron-ion potentials has persisted to the present day |
| 73 | + in popular semiempirical electronic structure models. |
| 74 | +The direct decendents of MNDO (AM1, PMx, etc.) continue to use repulsive interatomic potentials |
| 75 | + and continue to increase their complexity with an increasing number of diatomic model parameters. |
| 76 | +The completely independent DFTB family of semiempirical models, the most recent variant of which is [GFN2-xTB](https://doi.org/10.1021/acs.jctc.8b01176) |
| 77 | + perform their electronic structure calculations in a non-orthogonal basis, |
| 78 | + which avoids the approximations that caused the lack of repulsion in MNDO-based models. |
| 79 | +However, DFTB models still include a short-ranged interatomic repulsion between neutral reference atoms |
| 80 | + even though they have a flexible enough form to capture this correctly as a purely electronic contribution to the total energy. |
| 81 | +Because these models are primarily fit to total energy data (or similiarly heats of formation or total energy differences or structural properties) |
| 82 | + rather than electronic properties like excitation spectra, |
| 83 | + the more complicated model forms begin to develop multiple ways of approximately fitting the same data. |
| 84 | +This creates a poorly conditioned parameter fitting process, |
| 85 | + where there are nearly redundant parameters with respect to the reference data that is being fit. |
| 86 | +It isn't simply a matter of not having enough data, it is that the combination of limited scope |
| 87 | + and flexible model forms has produced fundamentally underconstrained models |
| 88 | + that do not attempt to predict diverse enough physical observables to define themselves unambiguously. |
| 89 | + |
| 90 | +To conclude this discussion, the "original sin" that I teased in the title of this blog post |
| 91 | + is the limited scope of semiempirical electronic structure models that limits their ability to define model parameters robustly. |
| 92 | +They are models of electrons and ions that have not constrained their electronic subsystems with a broad enough set of electronic properties. |
| 93 | +The ion-ion/electron-ion potential ambiguity was simply the first and most consequential symptom of this underlying problem. |
| 94 | +If these models were also constrained to predict atomic charges (which unfortunately need to be artificially defined in some way), |
| 95 | + then they would have more rigorously constrained the electron-ion potential. |
| 96 | +This is analogous to Kohn-Sham density functional theory (DFT), where the modeling of the electronic density |
| 97 | + formally constrains the electronic potential in a [1-to-1 mapping](https://doi.org/10.1103/PhysRev.140.A1133). |
| 98 | +Similarly, pure interatomic potential/energy models are simply functions of atomic coordinates |
| 99 | + and well-behaved functional forms can be systematically increased in complexity as more reference data becomes available to constrain them. |
| 100 | +As a result, both DFT and interatomic potentials have been better defined and overall more successful modeling activities |
| 101 | + than semiempirical quantum chemistry in recent decades. |
| 102 | +Even some [notable missteps](https://doi.org/10.1126/science.aah5975) reinforce my point: |
| 103 | + DFT reference data has been emphasizing total energies much more than electronic densities, |
| 104 | + and the accuracy of electronic density predictions has begun to degrade in some models as a result. |
| 105 | +Semiempirical electronic structure models need more constraints to remain well-defined as they introduce more parameters, |
| 106 | + either from direct connections to ab-initio quantum chemistry or from predicting a broader set of physical observables |
| 107 | + (even if they are less capable of predicting those observables accurately). |
| 108 | +Two examples of the former approach are the [OMx models](https://doi.org/10.1021/acs.jctc.5b01046) |
| 109 | + and Michael Dewar's last model, [SAM1](https://doi.org/10.1016/S0040-4020(01)81868-8), |
| 110 | + that both obtain more of their matrix elements directly from ab-initio quantum chemistry. |
| 111 | + |
| 112 | +My research plans in semiempirical electronic structure are evolving, as I explore and learn more and as circumstances change. |
| 113 | +I will discuss these plans and the thought process behind them in an upcoming blog post. |
| 114 | +In particular, I'm trying to stake out some practical short-term research projects to add value to semiempirical electronic structure |
| 115 | + as soon as possible to gain more clout in this research area while isolating, deferring, |
| 116 | + and organizing my more ambitious, risky, and technically sophisticated ideas into a longer-term research plan. |
| 117 | +Without some tangible short-term successes, this research effort probably won't survive long enough to have a "long term". |
0 commit comments