Commit bcbaf3e

Jonathan Moussa
authored and
Jonathan Moussa
committed
linear-scaling paper post
1 parent 48ef216 commit bcbaf3e

6 files changed (+208, -1 lines)
Lines changed: 151 additions & 0 deletions
@@ -0,0 +1,151 @@
1+
---
2+
layout: post
3+
title: "Linear-scaling electronic structure: too little, too late"
4+
categories: [history,editorial]
5+
---
6+
7+
This is the first instance of a tradition that I expect to maintain as long as I am actively writing for this blog:
8+
editorializing my research papers as they are published to provide more context for why I wrote them and what they mean to me.
9+
My latest paper, ["Assessment of localized and randomized algorithms for electronic structure"](https://doi.org/10.1088/2516-1075/ab2022),
10+
just became available online with a citable DOI.
11+
Sometimes I will write about papers when they are published in a journal, such as in this instance,
12+
and other times I will write about papers shortly after I post them to [the arXiv](https://arxiv.org).
13+
14+
This particular paper is special to me for several reasons.
15+
First, I have been extremely interested in linear-scaling electronic structure algorithms since just after starting grad school in 2002.
16+
I've initiated multiple projects on the subject over the years,
17+
but they were always too risky and too ambitious to mature successfully into published papers.
18+
These projects have at least done a lot to shape my opinions and views on the subject,
19+
and I've left a few unpublished papers languishing on the arXiv that serve as breadcrumbs along a trail.
20+
Second, this is the first proper electronic structure paper that I've published since being forced out of the field in 2014
21+
because my funding in the area was cut.
22+
I do have [a paper from 2016](https://doi.org/10.1063/1.4965886) on rational approximations of the Fermi-Dirac function
23+
that was effectively a short prelude to the present paper, which I had originally intended to write several years ago.
24+
Before I got the boot, which I knew was coming, I wrote [a paper in 2013](https://doi.org/10.1063/1.4855255) on fast algorithms
25+
for random-phase approximation (RPA) calculations, so that I could end my active research period in electronic structure
26+
with a paper that I was very proud of.
27+
My research career hasn't exactly gone smoothly, but I feel like my electronic structure research agenda is back on track
28+
with a newfound sense of purpose.
29+
30+
My interest in linear-scaling electronic structure began with my gung-ho, bigger-and-better attitude upon starting grad school.
31+
I was very fortunate to have a solid undergraduate research experience that prepared me well for computational science
32+
research, and I was eager to do research related to atomistic simulation.
33+
[Stefan Goedecker's review](https://doi.org/10.1103/RevModPhys.71.1085) of linear-scaling electronic structure research in the 1990s was still relatively new,
34+
and I tried to learn everything about the subject by carefully going through it and following references through the literature.
35+
Ultimately, I decided on an ambitious technical approach to treat this as much as possible like a numerical linear algebra problem,
36+
since the core technical problem was to perform traces of functions of sparse matrices while avoiding eigenvalue decompositions.
37+
Trying to follow a reductionist strategy, I focused my effort on developing a new elementary transformation for manipulating sparse matrices.
38+
The canonical problem with transforming sparse matrices is that you inevitably incur fill-in of the matrix that quickly spirals out
39+
of control and produces dense matrices long before the problem is solved.
40+
I interpreted this as the outcome of a "greedy" solver strategy, since any single elementary transformation (e.g. Givens rotation or Gaussian elimination step)
41+
was usually very simple, very inexpensive, and only appeared to make the sparsity pattern slightly worse.
42+
Instead, I performed some numerical experiments on very general but localized transformations (i.e. mostly identity)
43+
of the form
44+
45+
$$ A' \approx X^T A X $$
46+
47+
where $$A$$ is sparse and symmetric (but not positive definite), $$X$$ is the numerically optimized local transformation,
48+
and $$A'$$ is the new sparse matrix with a specified target sparsity pattern but variable matrix elements.
49+
While any one optimized sparsity-preserving transformation would be more expensive than other elementary matrix transformations,
50+
they would prevent costs from spiraling out of control as matrices would otherwise become more and more dense.
51+
These numerical experiments showed that you could diagonalize rows and columns in $$A'$$ without creating new fill-in,
52+
with errors that appeared to decay exponentially in the number of matrix elements that I allowed to vary in $$X$$.
53+
However, the numerical optimization of these transformations turned out to be very ill-conditioned as the transformations
54+
became more accurate, and there were basic arguments suggesting that this relationship was inevitable.
55+
Also, I couldn't come up with a theoretical framework to explain the numerical behavior I was seeing.
56+
I did write this up as my very first attempt to publish a research paper in grad school,
57+
but it was rejected by referees and I let it [languish on the arXiv](https://arxiv.org/abs/math/0505157)
58+
because I didn't know what else to do and received no advice on an alternate course of action.
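
The fill-in problem described above can be illustrated with a small sketch. It applies one symmetric Gaussian elimination step, written as a congruence $$X^T A X$$ with a mostly-identity $$X$$, to a sparse "arrowhead" matrix. The test matrix, its size, and the helper names are assumptions chosen for illustration; this is an ordinary elimination step, not the numerically optimized sparsity-preserving transformation from the unpublished paper.

```python
def matmul(P, Q):
    """Dense matrix product of two lists-of-lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def transpose(P):
    return [list(row) for row in zip(*P)]

def arrowhead(n, diag=4.0):
    """Symmetric test matrix: diagonal plus a dense first row and column."""
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = diag
    for i in range(1, n):
        A[0][i] = A[i][0] = 1.0
    return A

def elimination_transform(A, k):
    """X = identity except row k, chosen so that X^T A X decouples index k."""
    n = len(A)
    X = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for j in range(n):
        if j != k:
            X[k][j] = -A[k][j] / A[k][k]
    return X

def count_nonzeros(M, tol=1e-12):
    return sum(1 for row in M for x in row if abs(x) > tol)

A = arrowhead(6)
X = elimination_transform(A, 0)
A_new = matmul(transpose(X), matmul(A, X))  # A' = X^T A X

nnz_before = count_nonzeros(A)
nnz_after = count_nonzeros(A_new)
print("nonzeros before:", nnz_before, "after:", nnz_after)
```

In this example the first row and column are diagonalized, but the remaining 5-by-5 block becomes completely dense: the nonzero count grows from 16 to 26 after a single step.
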
59+
60+
My interest in linear-scaling electronic structure and numerical linear algebra did not end with that failure.
61+
I was quite committed in grad school to the linear-algebraic approach to the problem, so I decided to work on easier
62+
open problems in numerical linear algebra to prepare myself for future efforts in sparsity-preserving matrix transformations.
63+
The new problem that I focused on was a banded Hermitian eigensolver, since the tridiagonal case had been solved in the mid-1990s.
64+
I also saw this as an opportunity to mathematically formalize the concept of Wannier functions in electronic structure theory,
65+
again pursuing an agenda of treating physical concepts in a more mathematically careful manner.
66+
By the time I graduated, I had a basic plan for a banded Hermitian eigensolver and some numerical experiments demonstrating that the plan was reasonable.
67+
However, I was still in a bit over my head as there were key technical results that I had numerical evidence for but lacked an ability to prove.
68+
I finally managed to prove the most basic and tricky result in summer 2017,
69+
but it was part of a paper that stalled out as I became more and more distracted by planning my exit from Sandia.
70+
Recently, I have revisited this proof, and it will now be the main result of my very next paper (in preparation)
71+
that I will probably discuss on this blog in the not-too-distant future.
72+
I hope to finally finish developing my banded Hermitian eigensolver at some point in the next few years.
73+
74+
So, my interest in linear-scaling electronic structure ended up getting diverted into other technical pursuits.
75+
Meanwhile, there haven't been any major advances in linear-scaling electronic structure research for the entire time that I've been interested in it.
76+
I've tried to rationalize and quantify this opinion by measuring publication rates in Google Scholar:
77+
78+
![linear-scaling literature data](/assets/linear-scaling.pdf){:height="100%" width="100%"}
79+
80+
Here, I'm plotting papers mentioning ["electronic structure"] versus ["electronic structure" AND ("linear scaling" OR "scale linearly" OR "scaling linearly")],
81+
which shows that interest in linear scaling is growing with the activity in the field of electronic structure as a whole
82+
and has even become a larger fraction of papers with time.
83+
However, when this is compared against the rate of citations to three major linear-scaling electronic structure papers --
84+
[Weitao Yang's divide-and-conquer paper](https://doi.org/10.1103/PhysRevLett.66.1438) that popularized the topic,
85+
[Walter Kohn's paper](https://doi.org/10.1103/PhysRevLett.76.3168) on the nearsightedness of electrons,
86+
and [Stefan Goedecker's review paper](https://doi.org/10.1103/RevModPhys.71.1085) --
87+
it shows that the literature associated with linear-scaling solver algorithms isn't growing at all.
88+
This is mostly explained by a combination of papers superficially mentioning the concept of linear-scaling electronic structure,
89+
often noting its conceptual importance,
90+
and papers discussing other linear-scaling electronic structure algorithms not associated with solvers,
91+
such as Fock matrix construction and localized electron correlation methods in quantum chemistry.
92+
On the software side, there are no popular or widely used linear-scaling electronic structure simulation codes.
93+
Perhaps the two oldest and most established efforts are ONETEP, a commercial code related to CASTEP,
94+
and CONQUEST, which has been promising an as-yet-unfulfilled public release for over a decade now.
95+
They each get fewer than 100 mentions in the scientific literature per year,
96+
which is far behind the most popular electronic structure codes.
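
The two trends in the plot can be checked by hand from a few rows of `assets/linear-scaling.txt`. The sketch below copies four rows of that file into a dictionary (the choice of years is arbitrary) and computes the fraction of "electronic structure" papers that mention linear scaling alongside the combined yearly citations to the three papers above.

```python
# year: (Yang 1991 cites, Goedecker 1999 cites, Kohn 1996 cites,
#        papers mentioning linear scaling, all "electronic structure" papers)
rows = {
    1990: (0, 0, 0, 28, 8710),
    2000: (17, 50, 26, 268, 17700),
    2010: (47, 57, 41, 1110, 52500),
    2018: (45, 52, 48, 2170, 53700),
}

summary = {}
for year, (s1, s2, s3, s4, s5) in rows.items():
    mention_fraction = s4 / s5        # share of papers mentioning linear scaling
    solver_citations = s1 + s2 + s3   # combined citations to the three papers
    summary[year] = (mention_fraction, solver_citations)
    print(f"{year}: {100 * mention_fraction:.2f}% mention linear scaling, "
          f"{solver_citations} citations to the three papers")
```

For these rows, the mention fraction rises from roughly 0.3% in 1990 to roughly 4% in 2018, while the combined citation count is 145 in both 2010 and 2018.
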
97+
98+
My first linear-scaling electronic structure paper does not contain any new, notable results in numerical linear algebra.
99+
It is a much more mundane work, partly a review, partly some modest brainstorming about future solver concepts,
100+
and an attempt to compare some competing technical ideas (randomized and localized algorithms)
101+
that quite surprisingly had never been directly compared before.
102+
My younger, more idealistic self would probably have seen this as a waste of time,
103+
but I wanted to demonstrate that I am knowledgeable in this research area
104+
and try to nudge it in a slightly better direction even if I had no amazing technical breakthroughs to report.
105+
It also just grew organically from a phase that I went through while I was at Sandia of attempting to write Comments
106+
for Physical Review Letters, which was mostly just me venting frustration over my lack of research funding in electronic structure.
107+
Unsurprisingly, my targets were related to trendy research areas that had diverted money away from electronic structure at Sandia
108+
([machine learning](https://arxiv.org/abs/1208.1085) and [quantum computing](https://arxiv.org/abs/1310.6676))
109+
and [linear-scaling electronic structure itself](https://arxiv.org/abs/1311.6576).
110+
I wrote 3 Comments and got 1 published, before I decided (and was convinced by others) that this was not a productive activity.
111+
The body of scientific publications is just too large to bother pointing out errors except in the most egregious of cases.
112+
I think it was somewhat worthwhile in my case because it has helped me to shape my self-criticism
113+
(sharpening it in some areas but relaxing it overall after realizing that other, successful scientists have much laxer standards than I do)
114+
and was temporarily effective at venting frustration.
115+
It was the final Comment on stochastic linear-scaling DFT that formed the seed for my paper.
116+
While it was eventually rejected (relevance standards for Comments are very high), the editors strongly encouraged me to develop
117+
my benchmarking and analysis of different electronic structure solver algorithms into a full paper.
118+
It took 5 years to finish, since my research efforts during that time were mostly shifted to quantum computing,
119+
but I got it done in the end.
120+
I think the paper's essential message also nicely distills down to a very basic research lesson
121+
that if a field of research is unable to measure progress (in this case meaningful computational benchmarks),
122+
then it will not be able to make progress.
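
For readers unfamiliar with the randomized side of that comparison, a standard building block is a stochastic trace estimator, which approximates the trace of a matrix function using only matrix-vector products. The sketch below is a generic Hutchinson-style estimator applied to the simplest possible case, the trace of $$A^2$$ for a tridiagonal test matrix; the matrix, the function, the sample count, and the helper names are all illustrative assumptions and are not taken from the paper.

```python
import random

def tridiagonal_matvec(v, diag=2.0, off=-1.0):
    """Apply a symmetric tridiagonal matrix A to v without storing A."""
    n = len(v)
    out = []
    for i in range(n):
        s = diag * v[i]
        if i > 0:
            s += off * v[i - 1]
        if i < n - 1:
            s += off * v[i + 1]
        out.append(s)
    return out

def estimate_trace_of_square(n, num_samples, rng):
    """Average z^T A^2 z over random +/-1 vectors z (Hutchinson estimator)."""
    total = 0.0
    for _ in range(num_samples):
        z = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        Az = tridiagonal_matvec(z)
        total += sum(x * x for x in Az)  # z^T A^2 z = |A z|^2 for symmetric A
    return total / num_samples

n = 50
# Exact value for comparison: trace(A^2) is the sum of squared matrix elements.
exact = n * 2.0 ** 2 + 2 * (n - 1) * 1.0 ** 2
estimate = estimate_trace_of_square(n, 400, random.Random(0))
print("exact:", exact, "estimate:", estimate)
```

The cost of each sample is linear in the matrix dimension, but the statistical error decays only as the inverse square root of the number of samples, which is the kind of trade-off that a meaningful benchmark against localized algorithms has to account for.
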
123+
124+
Because I mostly work alone on research (although I would love to collaborate with like-minded people),
125+
I have found that I am most productive when I successfully "flatten" my research plans
126+
into an ordered sequence of projects and papers with a clear and self-convincing rationale for the ordering
127+
(i.e. inter-dependence and relative importance).
128+
I'd like to end this post with a preview of my next paper,
129+
which I've been actively writing for about a month now, and some related concluding thoughts.
130+
In [Goedecker's review](https://doi.org/10.1103/RevModPhys.71.1085) of the active period (1991-1999) of linear-scaling electronic structure research,
131+
he concluded that
132+
133+
> O(N) methods have become an essential part of most large-scale atomistic simulations based on either tight-binding or semiempirical methods.
134+
135+
However, that opinion was not reflected in any capability or feature of any simulation code that was available at that time,
136+
and linear-scaling solvers still have very limited applicability and availability 20 years later.
137+
Goedecker was certainly correct that the available algorithmic concepts were more readily applicable to tight-binding/semiempirical models,
138+
but he treated their widespread adoption as a foregone conclusion, and it simply never happened
139+
(which is perhaps a result of *everyone else* thinking that way, too).
140+
I now very much want to make this happen,
141+
and most of my planned papers are now organized around key technical components of the semiempirical simulation software that I have envisioned.
142+
While most of these papers will be narrowly focused on this goal,
143+
my very next paper is a little different.
144+
As I mentioned in this post, I have a long-overdue theorem/proof that I am finally preparing for publication.
145+
While the theorem/proof is about a rather esoteric problem, low-rank approximations of the Cauchy kernel,
146+
it is directly related to thoroughly optimizing and finalizing
147+
the function approximations that will be at the heart of future linear-scaling electronic structure solvers.
148+
It also encompasses and concludes my [earlier effort](https://doi.org/10.1063/1.4965886) to numerically optimize rational approximations of
149+
the Fermi-Dirac function into a slightly larger and mostly analytical approximation framework.
150+
As I work through this next project, I am going to experiment with a radically open research style
151+
and present unpublished work in progress on this blog.

assets/DFT-vs-SQM.pdf

6 Bytes
Binary file not shown.

assets/DFT-vs-SQM.py

Lines changed: 1 addition & 1 deletion
@@ -16,7 +16,7 @@
 plt.semilogy(year, sqm_mopac, color = 'red', ls=':',label='SQM (MOPAC)')
 plt.title('scientific publications: DFT vs. semiempirical quantum mechanics (SQM)')
 plt.xlabel('year')
-plt.ylabel('# of publications')
+plt.ylabel('# of publications per year')
 plt.legend()

 plt.savefig("DFT-vs-SQM.pdf",bbox_inches='tight',pad_inches=0.01)

assets/linear-scaling.pdf

15.2 KB
Binary file not shown.

assets/linear-scaling.py

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+import numpy as np
+import matplotlib.pyplot as plt
+
+year, s1, s2, s3, s4, s5 = np.loadtxt("linear-scaling.txt", comments="#", unpack=True)
+
+plt.figure(figsize=(8,4))
+
+plt.ylim(1,1e5)
+plt.xlim(1990,2020)
+
+plt.semilogy(year, s1, color = 'green',label='Yang\'s 1991 paper citation')
+plt.semilogy(year, s3, color = 'blue',label='Kohn\'s 1996 paper citation')
+plt.semilogy(year, s2, color = 'orange',label='Goedecker\'s 1999 paper citation')
+plt.semilogy(year, s4, color = 'red',label='electronic structure papers mentioning linear scaling')
+plt.semilogy(year, s5, color = 'black',label='electronic structure papers')
+plt.title('linear-scaling electronic structure publication data')
+plt.xlabel('year')
+plt.ylabel('# of publications per year')
+plt.legend()
+
+plt.savefig("linear-scaling.pdf",bbox_inches='tight',pad_inches=0.01)

assets/linear-scaling.txt

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
1+
# SEARCH 1: citation of "Direct calculation of electron density in density-functional theory" (first paper)
2+
# SEARCH 2: citation of "Linear scaling electronic structure methods" (last paper)
3+
# SEARCH 3: citation of "Density functional and density matrix method scaling linearly with the number of atoms" (famous paper)
4+
# SEARCH 4: "electronic structure" AND ("linear scaling" OR "scale linearly" OR "scaling linearly")
5+
# SEARCH 5: "electronic structure"
6+
7+
1990 0 0 0 28 8710
8+
1991 2 0 0 33 9430
9+
1992 9 0 0 33 9930
10+
1993 17 0 0 48 10800
11+
1994 15 0 0 63 10700
12+
1995 28 0 0 82 11000
13+
1996 29 0 4 151 13300
14+
1997 26 0 17 157 13600
15+
1998 32 0 19 185 14300
16+
1999 23 7 22 160 15700
17+
2000 17 50 26 268 17700
18+
2001 21 66 27 297 19500
19+
2002 25 63 24 368 21200
20+
2003 50 88 26 371 23000
21+
2004 32 78 29 451 27800
22+
2005 29 81 29 580 31900
23+
2006 53 96 41 715 39000
24+
2007 44 76 50 745 40200
25+
2008 56 82 47 907 49500
26+
2009 56 74 28 968 53200
27+
2010 47 57 41 1110 52500
28+
2011 47 61 39 1270 67400
29+
2012 50 62 53 1170 73000
30+
2013 53 56 32 1430 71500
31+
2014 49 50 47 1590 73800
32+
2015 39 61 37 1720 71700
33+
2016 47 69 62 1680 65700
34+
2017 37 61 41 1890 58300
35+
2018 45 52 48 2170 53700
