Blog


Potential Energy Surfaces in Python

September 5, 2022

Organic chemists often think in terms of potential energy surfaces, especially when plotting the results of a computational study. Unfortunately, it is non-trivial to generate high-quality potential energy surfaces. It's not too difficult to sketch something crude in ChemDraw or PowerPoint, but getting the actual barrier heights correct and proportional has always seemed rather tedious to me.

I've admired the smooth potential energy surfaces from the Baik group for years, so several months ago I decided to try to write my own program to generate these diagrams. I initially envisioned this as a Python package (with the dubiously clever name of pypes), but it turned out to be simpler than expected, so much so that I never actually turned it into a library. It's easier to just copy and paste the code into various Jupyter notebooks as needed.

Here's the code:

# get packages
import numpy as np
import scipy.interpolate as interp
import matplotlib.pyplot as plt

# make matplotlib look good
plt.rc('font', size=11, family="serif")
plt.rc('axes', titlesize=12, labelsize=12)
plt.rc(['xtick', 'ytick'], labelsize=11)
plt.rc('legend', fontsize=12)
plt.rc('figure', titlesize=14)

%matplotlib inline
%config InlineBackend.figure_format='retina'

# x and y positions. y in kcal/mol, if you want, and x in the range [0,1].
Y = [2.49, 3.5, 0, 20.2, 19, 21.5, 20, 20.3, -5]
X = [0, 0.15, 0.3, 0.48, 0.55, 0.63, 0.70, 0.78, 1]

# labels for points. False if you don't want a label
label = ["label1", False, "label2", "label3", "label4", "label5", "label6", "label7", "label8"]

#### shouldn't need to modify code below this point too much...

# autodetect which labels correspond to transition states
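# (a point is treated as a TS if it's a local maximum in Y; the endpoints never are)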
TS = []
for idx in range(len(Y)):
    if idx == 0 or idx == len(Y)-1:
        TS.append(False)
    else:
        TS.append((Y[idx] > Y[idx+1]) and (Y[idx] > Y[idx-1]))

# sanity checks
assert len(X) == len(Y), "need X and Y to match length"
assert len(X) == len(label), "need right number of labels"

# now we start building the figure, axes first
f = plt.figure(figsize=(8,8))
ax = f.gca()
xgrid = np.linspace(0, 1, 1000)
ax.spines[['right', 'bottom', 'top']].set_visible(False)

YMAX = 1.1*max(Y)-0.1*min(Y)
YMIN = 1.1*min(Y)-0.1*max(Y)
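# (i.e. pad the y-limits by 10% of the data range on each side)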

plt.xlim(-0.1, 1.1)
plt.tick_params(axis='x', which='both', bottom=False, top=False, labelbottom=False)
plt.ylim(bottom=YMIN, top=YMAX)
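# draw an arrowhead at the top of the y-axis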
ax.plot(-0.1, YMAX,"^k", clip_on=False)

# label axes
plt.ylabel("Gibbs Free Energy (kcal/mol)")
plt.xlabel("Reaction Coordinate")

# plot the points
plt.plot(X, Y, "o", markersize=7, c="black")

# add labels
for i in range(len(X)):
    if label[i]:
        delta_y = 0.6 if TS[i] else -1.2
        plt.annotate(
            label[i],
            (X[i], Y[i]+delta_y),
            fontsize=12,
            fontweight="normal",
            ha="center",
        )

# add connecting lines
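# each segment is a cubic interpolant with zero slope at both endpoints (BPoly.from_derivatives),
# which is what gives the smooth plateaus at every minimum and maximum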
for i in range(len(X)-1):
    idxs = np.where(np.logical_and(xgrid>=X[i], xgrid<=X[i+1]))
    smoother = interp.BPoly.from_derivatives([X[i], X[i+1]], [[y, 0] for y in [Y[i], Y[i+1]]])
    plt.plot(xgrid[idxs], smoother(xgrid[idxs]), ls="-", c="black", lw=2)

# finish up!
plt.tight_layout()
plt.show()

The output looks like this:

The potential energy surface generated by the above code.

If you like how this looks, feel free to use this code; if not, modify it and make it better! I'm sure this isn't the last word in potential-energy-surface creation, but it's good enough for me.

Screening for Generality: Reflections

September 1, 2022

Now that our work on screening for generality has finally been published in Nature, I wanted to first share a few personal reflections and then highlight the big conclusions that I gleaned from this project.

This project originated from conversations I had with Eugene Kwan back in February 2019, when I was still an undergraduate at MIT. Although at the time our skills were almost completely non-overlapping, we shared both an interest in “big data” and high-throughput experimentation and a conviction that organic chemistry could benefit from more careful thinking about optimization methods.

After a few months of work, Eugene and I had settled on the idea of a “catalytic reaction atlas” (in analogy to the Cancer Genome Atlas) where we would exhaustively investigate catalysts, conditions, substrates, etc. for a single asymmetric reaction and then (virtually) compare different optimization methods to see which algorithms led to the best hits. Even with fairly conservative assumptions, we estimated that this would take on the order of 10⁵ reactions, or about a year of continuous HPLC time, meaning that some sort of analytical advance was needed.

A slide comparing different optimization strategies, from April 2019.
Multi-substrate screening is proposed as one of many different algorithms.

When I proposed this project to Eric, he was interested but suggested we focus more narrowly on the question of generality, or how to discover reactions with broad substrate scope. In an excited phone call, Eugene and I had the insight that we could screen lots of substrates at once by using mass spectrometry, thus bypassing our analytical bottleneck and enabling us to access the “big data” regime without needing vast resources to do so.¹

Getting the analytical technology to work took about two years of troubleshooting. We were lucky to be joined by Spencer, an incredible analytical chemist and SFC guru, and eventually were able to get reproducible and accurate data by a combination of experimental insights (running samples at high dilution) and computational tweaks (better peak models and fitting algorithms). To make sure that the method was working properly, we ran validation experiments both on a bunch of scalemic samples and on a varied set of complex pharmaceutical racemates.

Choosing the proper reaction took a bit of thought, but once we settled on a set of substrates and catalysts the actual experiments were a breeze. Almost all the screening for this project was done in November–December 2021: with only a few hours of work each week, I could easily run and analyze hundreds of reactions.

I want to conclude by sharing three high-level conclusions that I’ve taken away from working on this project; for the precise scientific conclusions of this study, you can read the paper itself.

1. Chemical space is big, so how you search matters

There are a ton of potential catalysts waiting to be discovered, and it seems likely that almost any hit can be optimized to 90% ee by sufficient graduate-student hours. Indeed, one of the reasons we selected the Pictet–Spengler reaction was the diversity of different catalyst structures capable of giving high enantioselectivity. But just because you can get 90% ee from a given catalyst family doesn’t mean you should: it might be terrible for other substrates, or a different class of catalysts might be much easier to optimize or much more reactive.

Understanding how many catalysts are out there to be discovered should make us think more carefully about which hits we pursue, since our time is too valuable to waste performing needless catalyst optimizations. In this study, we showed that screening only one substrate can be misleading when the goal is substrate generality, but one might prefer to screen for other factors: low catalyst loading, tolerance of air or water, or recyclability all come to mind. In all cases, including these considerations in initial screens means that the hits generated are more likely to be relevant to the final goal. Just looking for 90% ee is almost certainly not the best way to find a good reaction.

2. Don’t ignore analytical chemistry

Although assay development is a normal part of many scientific fields, many organic chemists seem to barely consider analytical chemistry in their research. All the ingenuity goes into developing new catalysts, while the analytical method remains essentially a constant factor in the background. This is true even when the analytical workflow accounts for a large fraction of the project's effort (e.g. having to remove toluene before NMR for every screen).

This shouldn’t be the case! Spending time towards the beginning of a project to develop a nice assay is an investment that can yield big returns: this can be as simple as making a GC calibration curve to determine yield from crude reaction mixtures, or as complex as what we undertook here. Time is too valuable to waste running endless columns.
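
To make the calibration-curve suggestion concrete, here's a minimal sketch with made-up numbers (the area ratios, molar ratios, and internal-standard loading are all hypothetical, not from any real assay):

import numpy as np

# hypothetical calibration: known product:standard molar ratios vs. measured GC area ratios
molar_ratios = np.array([0.25, 0.5, 1.0, 2.0, 4.0])
area_ratios = np.array([0.21, 0.44, 0.87, 1.76, 3.49])

# a linear fit gives the response factor (slope) and any offset
slope, intercept = np.polyfit(area_ratios, molar_ratios, 1)

# for a crude reaction mixture, convert a measured area ratio back into a yield
measured_area_ratio = 1.20                          # hypothetical measurement
product_per_standard = slope * measured_area_ratio + intercept
yield_percent = 100 * product_per_standard * 0.50   # if 0.50 equiv of internal standard was added
print(f"estimated yield: {yield_percent:.0f}%")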

More broadly, it seems like analytical advances (e.g. NMR and HPLC) have had a much bigger impact on the field than any individual chemical discoveries. Following this trend forward in time would imply that we should be making bigger investments in new analytical technologies now, to increase scientist productivity in the future.

3. A little computer science can go a long way

A key part of this project (mentioned only briefly in the paper) was developing our own peak-fitting software that allowed us to reliably fit overlapped peaks. This was computationally quite simple and relied almost entirely on existing libraries (e.g. scipy and lmfit), but took a certain amount of comfort with signal processing / data science.² We later ended up moving our software pipeline out of unwieldy Jupyter notebooks and into a little Streamlit web app that Eugene wrote, which allowed us to quickly and easily get ee values from larger screens.
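
For a flavor of what this kind of fitting looks like, here is a generic two-Gaussian example with lmfit on simulated data (not the actual peak models or pipeline used in the paper):

import numpy as np
from lmfit.models import GaussianModel

# simulate two overlapping peaks plus a little noise
x = np.linspace(0, 10, 500)
y = (1.0 * np.exp(-(x - 4.6)**2 / (2 * 0.3**2))
     + 0.6 * np.exp(-(x - 5.4)**2 / (2 * 0.3**2))
     + np.random.normal(0, 0.01, x.size))

# composite model: two Gaussians distinguished by prefixes
model = GaussianModel(prefix="p1_") + GaussianModel(prefix="p2_")
params = model.make_params(p1_center=4.5, p1_sigma=0.3, p1_amplitude=1,
                           p2_center=5.5, p2_sigma=0.3, p2_amplitude=1)
result = model.fit(y, params, x=x)

# for lmfit's GaussianModel, "amplitude" is the integrated peak area;
# if these were the two enantiomer peaks, the ee would follow directly
area1 = result.params["p1_amplitude"].value
area2 = result.params["p2_amplitude"].value
print(f"'ee' from peak areas: {(area1 - area2) / (area1 + area2):.1%}")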

Neither of these two advances required significant coding skill; rather, just being able to apply some computer science techniques to our chemistry problem unlocked new scientific opportunities and massive time savings (a la Pareto principle). Moving forward, I expect that programming will become a more and more central tool in scientific research, much like Excel is today. Being fluent in both chemistry and CS is currently a rare and valuable combination, and will only grow in importance in the coming decades.

Thanks to Eugene Kwan for reading a draft of this post.

Footnotes

  1. I'd like to propose the following principle: any sufficiently clever analytical technique inevitably depends on mass spectrometry. If you don't believe me, just look at the field of proteomics...
  2. I highly recommend A Pragmatic Introduction to Signal Processing by Tom O'Haver.

The Rate-Determining Span

August 15, 2022

One common misconception in mechanistic organic chemistry is that reactions are accelerated by speeding up the rate-determining step. This mistaken belief can lead to an almost monomaniacal focus on determining the nature of the rate-determining step. In fact, it's more correct to think of reactions in terms of the rate-determining span: the difference between the resting state and the highest-energy transition state. (I thank Eugene Kwan's notes for introducing me to this idea.)

In this post, I hope to demonstrate the veracity of this concept by showing that, under certain idealized assumptions, the existence of a low-energy intermediate has no effect on rate. Consider the following system:

Concerted and stepwise mechanisms.

We can imagine plotting these two mechanisms on a potential energy surface:

Concerted and stepwise mechanisms, on a PES.

In this example, X = Y + Z; the energies of the transition state and the ground state are the same in both cases, and only the presence (or absence) of an intermediate differentiates the two potential energy surfaces. We will now compute the rate of product formation in both cases. Using the Eyring–Polanyi equation, it's straightforward to arrive at an overall rate for the concerted reaction as a function of the barrier:


k = (k_B*T/h) * exp(-X/RT)

rate_concerted = k * [SM]

rate_concerted = (k_B*T/h) * exp(-X/RT) * [SM]


The stepwise case is only slightly more complicated. Assuming that the barrier to formation of the intermediate is much lower than the barrier to formation of the product, and that the intermediate is substantially lower in energy than the rate-limiting transition state, we can apply the pre-equilibrium approximation:


rate_stepwise = k_2 * [INT]

k_2 = (k_B*T/h) * exp(-Z/RT)

rate_stepwise = (k_B*T/h) * exp(-Z/RT) * [INT]


Solving for [INT] is straightforward, and we can plug the result in to get our final answer:


Y = -RT * ln([INT]/[SM])

[INT] = exp(-Y/RT) * [SM]

rate_stepwise = (k_B*T/h) * exp(-Z/RT) * exp(-Y/RT) * [SM]

rate_stepwise = (k_B*T/h) * exp(-X/RT) * [SM] = rate_concerted


As promised, the rates are the same—where the pre-equilibrium approximation holds, the existence of an intermediate has no impact on rate. All that matters is the relative energy of the transition state and the ground state.
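
If you want to convince yourself numerically, here's a quick sanity check in Python (arbitrary energies, same idealized assumptions as above):

import numpy as np

R = 1.987e-3          # gas constant, kcal/(mol K)
kB_over_h = 2.084e10  # Boltzmann constant / Planck constant, 1/(K s)
T = 298.15            # temperature, K

Y, Z = 5.0, 16.0      # arbitrary: intermediate energy and second barrier, kcal/mol
X = Y + Z             # overall barrier, as in the diagram above

k_concerted = kB_over_h * T * np.exp(-X / (R * T))
k_stepwise = kB_over_h * T * np.exp(-Y / (R * T)) * np.exp(-Z / (R * T))

print(np.isclose(k_concerted, k_stepwise))  # True: the intermediate doesn't change the rate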

This method of thinking is particularly useful for rationalizing tricky Hammett trends. For instance, it's known that electron-rich indoles react much faster in Brønsted-acid-catalyzed Pictet–Spengler reactions, even though these reactions proceed through rate-determining elimination from a carbocation. Since electron-poor carbocations are more acidic, simple analysis of the rate-determining step predicts the opposite trend.

Brønsted-acid-catalyzed Pictet–Spengler reactions.

However, if we ignore the intermediate, it's clear that the transition state contains much more carbocationic character than the ground state, and so electron-donating groups will stabilize the transition state relative to the ground state and thereby accelerate the reaction. Thinking about intermediates is a great way to get confused; to understand trends in reactivity, all you need to consider is the transition state and the ground state.

Combating Computational Nihilism

August 10, 2022

The growing accessibility of computational chemistry has, unfortunately, led to a preponderance of papers with bad computations. Organic chemists are all too familiar with the “DFT section” of an otherwise high-quality publication, which typically contains a transition-state structure or two, some sort of enigmatic cartoon purporting to explain the observed selectivity, and perhaps an uninterpretable NCIPLOT cited as evidence for the preceding claims.¹

Faced with this sort of landscape, experimental chemists typically adopt one of two faulty heuristics: excessive credulity or universal skepticism. Being too trusting is dangerous, as evidenced by work showcasing the manifold ways that simulations can deceive the unwary scientist. Almost anyone who’s made a catalyst predicted to be better by computations knows this well (even when the computations are your own).

However, equally dangerous—and, in my view, less appreciated—is the creeping ennui that diminishes the entire field. This is exemplified by statements like “I don’t believe computations can ever be predictive,” “You can make DFT say anything you want to,” or, more delicately, “Computations are more for generating hypotheses, not being physically correct.” Although most people may be too polite to admit this to their computational collaborators, this nihilism is pervasive—just listen to the conversations as students walk back from a departmental seminar.

This viewpoint is wrong. The existence of bad computational models does not mean that all models are bad, nor does it imply that the task of creating models is inherently futile. Examples from other scientific fields, like orbital mechanics and fluid dynamics, indicate that computations can achieve impressive degrees of accuracy and become pivotal and trustworthy components of the scientific process. Closer to home, even the most skeptical chemists would admit that for e.g. calculating IR frequencies in the ground state, DFT shows impressive predictive accuracy (modulo the usual systematic error). There’s no intrinsic reason why accurately modeling chemical systems, even prospectively, ought to be impossible; chemistry is not a social science.

Why, then, is this variety of skepticism so common? Part of the problem comes from the bewildering milieu of options available to practitioners in the field. While a seasoned expert can quickly assess the relative merits of B3LYP/MIDI! and DSD-PBEP86/def2-TZVP, to the layperson it’s tough to guess which might be superior. Without transparent heuristics by which to judge the quality of computational results, it’s no surprise that zeroth-order approximations (“DFT good” or “DFT bad”) have become so common among non-experts.²

Another issue is the generally optimistic demeanor of computational chemists towards their field. While the temptation to emphasize the potential upside of one’s research area is understandable, overestimating the capabilities of state-of-the-art technology inevitably leads to a reckoning when the truth becomes obvious. Except in certain circumscribed cases, we are still far from any predictive models of reactivity or selectivity for typical solution-phase reactions, various purported “breakthroughs” notwithstanding. Based on questions I’ve heard in talks, this uncomfortable truth is not universally understood by experimental audiences.

What, then, are the practical conclusions for computational chemists? Firstly, we should not be afraid to be our field’s own harshest critics. Allowing low-quality work into the literature erodes trust in our field; although raising our standards may be difficult and unpopular in the short term (kinetics), in the long run it will benefit the field (thermodynamics). You never get a second chance at a first impression; every bad paper published causes good papers to get that much less attention.

Secondly, we should work to develop consistent standards and workflows by which one can obtain reliable computational results. Just like there are accepted means by which new compounds are characterized (¹H NMR, ¹³C NMR, HRMS, IR), there ought to be transparent methods by which transition states can reliably be found and studied. The manifold diversity of parameters employed today is a sign of the field’s immaturity—in truly mature fields, there’s an accepted right way to do things.³ The growing popularity of tools like crest is an important step in this direction, as is the ability to use high-level post-Hartree–Fock wavefunction methods like DLPNO-CCSD(T) to refine single-point energies.

Finally, we must be honest about the limitations of our techniques and our results. So much about the chemical world remains mysterious and far beyond our understanding, let alone our ability to reproduce in silico. Far from being a failure for the field, however, this is something to be acknowledged and celebrated; science is only possible when there remain secrets to be found.

Between the Scylla of gullible credulity and the Charybdis of defensive nihilism, we must chart a middle way.

Thanks to Hayden Sharma for reading a draft of this post.

Footnotes

  1. NCIPLOT is a program which allows one to visualize non-covalent interactions; although the output can be useful, for large molecules it's also very overwhelming.
  2. A related idea is "epistemic learned helplessness", where people unable to evaluate the quality of a certain kind of argument resolve not to be persuaded by this argument one way or the other.
  3. I want to thank Frank Neese and coworkers for publishing their entire transition-state-finding workflow in detail; I've found it very useful, and it's certainly a step in the right direction.

Singleton Saturday: Dynamics at the Border Between General- and Specific-Acid Catalysis

August 6, 2022

This is the first in what will hopefully become a series of blog posts focusing on the fascinating work of Dan Singleton (professor at Texas A&M). My goal is to provide concise and accessible summaries of his work and highlight conclusions relevant to the mechanistic or computational chemist.

A central theme in mechanistic chemistry is the question of concertedness: if two steps occur simultaneously (“concerted”) or one occurs before the other (“stepwise”). One common way to visualize these possibilities is to plot the reaction coordinate of each step on two axes to form a 2D More O’Ferrall–Jencks (MOJ) plot. On an MOJ plot, a perfectly concerted reaction looks like a straight line, since the two steps occur together, while a stepwise reaction follows the border of the plot, with an intermediate located at one of the corners:

Generic More O'Ferrall–Jencks plot for general- and specific-acid catalysis (Figure 1 in the paper)

In the context of acid catalysis, where a Brønsted acid activates a substrate towards further transformations, the concerted mechanism is known as “general-acid catalysis” and the stepwise mechanism is known as “specific-acid catalysis.” This case is particularly interesting because the timescales of heavy-atom motion and proton motion are somewhat different, as can be seen by comparing typical O–H and C–O IR stretching frequencies:

1/(3500 cm⁻¹ * 3e10 cm/s) ≈ 9.5 fs for O–H bond vibration

1/(1200 cm⁻¹ * 3e10 cm/s) ≈ 28 fs for C–O bond vibration
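
These periods are just the reciprocal of (wavenumber × speed of light); a quick way to reproduce the numbers in Python:

c = 3e10  # speed of light, cm/s
for wavenumber in (3500, 1200):  # cm^-1: typical O–H and C–O stretches
    period_fs = 1e15 / (wavenumber * c)
    print(f"{wavenumber} cm^-1: {period_fs:.1f} fs per vibration")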

Since these timescales are so different, it’s impossible for the two steps to proceed perfectly synchronously, since the proton transfer will be done before heavy-atom motion is even half complete; in other words, the slope of the reaction’s path on the MOJ diagram can’t be 1. Ceteris paribus, then, one might expect stepwise specific-acid mechanisms to be favored. In some cases, however, the putative intermediate would be so unstable that its lifetime ought to be roughly zero (an enforced concerted mechanism, to paraphrase Jencks).

In this week's paper, Aziz and Singleton investigate the mechanism of one such example, the decarboxylation of benzoylacetic acid, which in the stepwise limit proceeds through a bizarre-looking zwitterion:

Concerted vs. stepwise mechanisms for decarboxylation of benzoylacetic acid (taken from the paper)

Distinguishing concerted and stepwise mechanisms is, in general, a very tough question. In rare cases an intermediate can actually be observed spectroscopically, but inability to observe the intermediate proves nothing: the intermediate could be 10 kcal/mol above the ground state (leading to a vanishingly low concentration) or could persist only briefly before undergoing subsequent reactions. Accordingly, other techniques must be used to study the mechanisms of these reactions.

In this case, the authors measured the ¹²C/¹³C kinetic isotope effects using their group’s natural abundance method. Heavy-atom kinetic isotope effects are one of the best ways to study these sorts of mechanistic questions because isotopic perturbation is at once extremely informative and very gentle, causing minimal perturbation to the potential energy surface (unlike e.g. a Hammett study). The KIEs they found are shown below:

Kinetic isotope effects for decarboxylation of benzoylacetic acid (Figure 2 from the paper)

These KIEs nicely match the computed transition structure shown below, which indicates that proton transfer precedes C–C bond breaking:

Transition state for decarboxylation of benzoylacetic acid (M06-2X/6-311+G**) (taken from the paper)

To probe the stepwise/concerted nature of this reaction, the authors conducted quasiclassical ab initio molecular dynamics, propagating trajectories forwards and backwards from the transition state. Surprisingly, the dynamics show that proton transfer is complete before C–C bond scission occurs, forming an intermediate (6) which persists for, on average, 3.4 O–H bond vibrations despite not being a minimum on the PES. This reaction therefore inhabits the border between general- and specific-acid catalysis—proton transfer does occur before decarboxylation, but the intermediate species (in the nomenclature of Houk and Doubleday, a “dynamic intermediate”) is incredibly ephemeral.

Results of quasiclassical trajectories showing the existence of a dynamic intermediate (6) before C–C bond breakage occurs (Figure 4a from the paper)

This surprising scenario occurs because of the different timescales of the two elementary mechanistic steps, as discussed above. In the words of the authors:

It is well understood in chemistry that concerted multibond reactions often involve highly asynchronous bonding changes. However, the normal understanding of asynchronous concerted reactions is that the bonding changes overlap. If otherwise, why should the reaction be concerted at all? This view fails to take into account the differing physics of the heavy-atom versus proton motions. Because of the uneven contribution of the motions, their separation is arguably intrinsic and unavoidable whenever the reaction is highly asynchronous. (emphasis added)

Aziz and Singleton also observe a curious phenomenon in the quasiclassical trajectories, wherein some trajectories initiated backwards from the (late) transition state fully form the C–C bond before reverting to enol + CO2. This phenomenon, termed “deep recrossing,” occurs because the oxygen of the carboxylate is unable to receive the proton, stalling the reaction in the region of the unstable zwitterion; unable to progress forward, the species simply extrudes CO2 and reverts back to the enol. Thus, even though the O–H bond is formed after the C–C bond (in the reverse direction) and little O–H movement occurs in the TS, inability to form the O–H bond prevents productive reaction, just like one might expect for a concerted TS.

The picture that emerges, then, is a reaction which “wants” to be concerted, owing to the absence of a stable intermediate along the reaction coordinate, but ends up proceeding through a stepwise mechanism because of the speed of proton transfer. Importantly, the dynamic intermediate “undergoes a series of relevant bond vibrations, as would any intermediate, and it can proceed from this structure in either forward or backward directions”: it is, in meaningful ways, an intermediate.

Given the ubiquity of proton transfer in organic chemistry, it is likely that many more reactions proceed through this sort of rapidly stepwise mechanism than is commonly appreciated. One case which I find particularly intriguing is “E2” reactions, which typically feature proton transfer to a strong base (e.g. potassium tert-butoxide) at the same time as C–Br or C–I bond dissociation. How do these reactions actually proceed on the femtosecond timescale? Is it possible that, as Bordwell proposed, many E2 reactions are actually stepwise? So much remains to be learned.