An important problem with simulating chemical reactions is that reactions generally take place in solvent, but most simulations are run without solvent molecules. This is a big deal, since much of the inaccuracy associated with simulation actually stems from poor treatment of solvation: when gas phase experimental data is compared to computations, the results are often quite good.
Why don’t computational chemists include solvent molecules in their models? It takes a lot of solvent molecules to accurately mimic bulk solvent (enough to cover the system with a few different layers, usually ~10³).1 Since most quantum chemical methods scale in practice as O(N²)–O(N³), adding hundreds of additional atoms has a catastrophic effect on the speed of the simulation.
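To put rough numbers on this, here’s a back-of-the-envelope sketch (the atom counts and the idealized cubic scaling are illustrative assumptions, not benchmarks of any particular method):

```python
# Rough cost estimate for adding explicit solvent to a QM calculation.
# All numbers are illustrative assumptions, not benchmarks.
solute_atoms = 50         # hypothetical bare-substrate system
solvent_atoms = 1000      # roughly a few solvation shells
scaling_exponent = 3      # idealized O(N^3) method

relative_cost = ((solute_atoms + solvent_atoms) / solute_atoms) ** scaling_exponent
print(f"each single-point calculation becomes ~{relative_cost:,.0f}x slower")
# -> roughly 9,000x slower, before any sampling is even considered
```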
To make matters worse, the additional degrees of freedom introduced by the solvent molecules are very “flat”—solvent molecules don’t usually have well-defined positions about the substrate, meaning that the number of energetically accessible conformations goes to infinity (with attendant consequences for entropy). This necessitates a fundamental change in how calculations are performed: instead of finding well-defined extrema on the electronic potential energy surface (ground states or transition states), molecular dynamics (MD) or Monte Carlo simulations must be used to sample from an underlying distribution of structures and reconstruct the free energy surface. Sufficient sampling usually requires consideration of 10⁴–10⁶ individual structures,2 meaning that each individual computation must be very fast (which is challenging for quantum chemical methods).
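The “reconstruct the free energy surface” step is conceptually simple even when the sampling isn’t: the relation is F(x) = −kBT ln P(x). Here’s a minimal sketch assuming unbiased sampling along a single reaction coordinate; the toy Gaussian data and variable names are mine, and real studies would use enhanced sampling plus reweighting:

```python
import numpy as np

k_B = 0.0019872   # Boltzmann constant, kcal/(mol·K)
T = 298.15        # temperature, K

# Pretend these are reaction-coordinate values (e.g. a forming C–C distance, in Å)
# harvested from an MD trajectory; here it's just toy Gaussian data.
samples = np.random.normal(loc=2.5, scale=0.3, size=100_000)

# Estimate the probability distribution P(x) by histogramming the samples...
counts, edges = np.histogram(samples, bins=100, density=True)
centers = 0.5 * (edges[:-1] + edges[1:])

# ...then invert it: F(x) = -k_B * T * ln P(x)
mask = counts > 0
free_energy = -k_B * T * np.log(counts[mask])
free_energy -= free_energy.min()   # shift so the global minimum sits at zero
```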
Given the complexity this introduces, it’s not surprising that most computational organic chemists try to avoid explicit solvent at all costs. The typical workaround is to use “implicit solvent” models, which “reduce the complexity of individual solvent−solute interactions such as hydrogen-bond, dipole−dipole, and van der Waals interactions into a fictitious surface potential... scaled to reproduce the experimental solvation free energies” (Baik). This preserves the well-defined potential energy surfaces that organic chemists are accustomed to, so you can still find transition states by eigenvector following, etc.
Implicit solvent models like PCM, COSMO, or SMD are better than nothing, but are known to struggle for charged species. In particular, they don’t really describe explicit inner-sphere solvent–solute interactions (like hydrogen bonding), meaning that they’ll behave poorly when these interactions are important. Dan Singleton’s paper on the Baylis–Hillman reaction is a nice case study of how badly implicit solvent can fail: even high-level quantum chemical methods are useless when solvation free energies are 10 kcal/mol off from experiment!
This issue is well-known. To quote from Schreiner and Grimme:
An even more important but still open issue is solvation. In the opinion of the authors it is a ‘scandal’ that in 2018 no routine free energy solvation method is available beyond (moderately successful) continuum theories such as COSMO-RS and SMD and classical FF/MD-based explicit treatments.
When computational studies have been performed in explicit solvent, the results have often been promising: Singleton has studied diene hydrochlorination and nitration of toluene, and Peng Liu has recently conducted a nice study of chemical glycosylation. Nevertheless, these studies all require heroic levels of effort: quantum chemistry is still very slow, and so a single free energy surface might take months and months to compute.3
One promising workaround is using machine learning to accelerate quantum chemistry. Since these MD-type studies look at the same exact system over and over again, we could imagine first training some sort of ML model based on high-level quantum chemistry data, and then employing this model over and over again for the actual MD run. As long as (1) the ML model is faster than the QM method used to train it and (2) it takes less data to train the ML model than it would to run the simulation, this will save time: in most cases, a lot of time.
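Conditions (1) and (2) can be turned into a simple break-even estimate. Every timing below is an invented placeholder, chosen only to show the shape of the trade-off:

```python
# Break-even estimate for "train an ML potential, then run the MD with it."
# Every timing here is an invented placeholder.
t_qm = 600.0          # seconds per QM energy/force evaluation
t_ml = 0.05           # seconds per ML evaluation
n_train = 5_000       # QM calculations needed to build the training set
n_steps = 1_000_000   # MD steps needed for adequate sampling

cost_qm_only = n_steps * t_qm
cost_with_ml = n_train * t_qm + n_steps * t_ml

print(f"QM-only MD:   {cost_qm_only / 86400:,.0f} days")
print(f"ML-driven MD: {cost_with_ml / 86400:,.1f} days")
# The ML route wins whenever n_train << n_steps and t_ml << t_qm.
```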
(This is a somewhat different use case than e.g. ANI-type models, which aim to achieve decent accuracy for any organic molecule. Here, we already know what system we want to study, and we’re willing to do some training up front.)
A lot of people are working in this field right now, but today I want to highlight some work that I liked from Fernanda Duarte and co-workers. Last year, they published a paper comparing a few different ML methods for studying quasiclassical dynamics (in the gas phase), and found that atomic cluster expansion (ACE) performed better than Gaussian approximation potentials while training faster than NequIP. They then went on to show that ACE models could be trained automatically through active learning, and used the models to successfully predict product ratios for cycloadditions with post-TS bifurcations.
Their new paper, posted on ChemRxiv yesterday, applies the same ACE/active learning approach to studying reactions in explicit solvent, with the reaction of cyclopentadiene and methyl vinyl ketone chosen as a model system. This is more challenging than their previous work, because the ML model now not only has to recapitulate the solute reactivity but also the solute–solvent and solvent–solvent interactions. To try and capture all the different interactions efficiently, the authors ended up using four different sets of training data: substrates only, substrates with 2 solvent molecules, substrates with 33 solvent molecules, and clusters of solvent only.
Previously, the authors used an energy-based selector to determine if a structure should be added to the training set: they predicted the energy with the model, ran a QM calculation, and selected the structure if the difference between the two values was big enough. This approach makes a lot of sense, but has the unfortunate downside that a lot of QM calculations are needed, which is exactly what this ML-based approach is trying to avoid. Here, the authors found that they could use similarity-based descriptors to select data points to add to the training set: these descriptors are both more efficient (needing fewer structures to converge) and faster to compute, making them overall a much better choice. (This approach is reminiscent of the metadynamics-based approach previously reported by John Parkhill and co-workers.)
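To illustrate the flavor of a similarity-based selector (this is not the authors’ actual implementation; the descriptor computation, distance metric, and threshold are all placeholders), one could imagine something like:

```python
import numpy as np

def select_by_novelty(candidate_descriptors, training_descriptors, threshold):
    """Keep candidates whose nearest neighbor in the training set (measured in
    descriptor space) is farther away than `threshold`. No QM calls required."""
    train = np.asarray(training_descriptors, dtype=float)
    selected = []
    for i, d in enumerate(candidate_descriptors):
        distances = np.linalg.norm(train - np.asarray(d, dtype=float), axis=1)
        if distances.min() > threshold:
            selected.append(i)
            train = np.vstack([train, d])   # count it as "added" so near-duplicates aren't also picked
    return selected
```

The point is that the filter runs entirely in descriptor space, so no QM time is spent on structures the model has effectively already seen.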
With a properly trained model in hand, the authors went on to study the reaction with biased sampling MD. They find that the reaction is indeed accelerated in explicit water, and that the free energy surface begins to look stepwise, as opposed to the concerted mechanism predicted in implicit solvent. (Singleton has observed similar behavior before, and I’ve seen this too.) They do some other interesting studies: they look at the difference between methanol and water as solvents, argue that Houk is wrong about the role of water in the TS,4 and suggest that the hydrophobic effect drives solvent-induced rate acceleration.5
The results they find for this particular system are interesting, but more exciting is the promise that these techniques may soon become accessible to “regular” computational chemists. Duarte and co-workers have shown that ML can be used to solve an age-old problem in chemical simulation; if explicit solvent ML/MD simulations of organic reactions become easy enough for non-experts to run, I have no doubt that they will become a valued and essential part of the physical organic chemistry toolbox. Much work is needed to get to that point—new software packages, further validation on new systems, new ways to assess quality and check robustness of simulations, and much more—but the vision behind this paper is powerful, and I can’t wait until it comes to fruition.
Thanks to Croix Laconsay for reading a draft of this post.

TW: sarcasm.
Today, most research is done by academic labs funded mainly by the government. Many articles have been written on the shortcomings of academic research: Sam Rodriques recently had a nice post about how academia is ultimately an educational institution, and how this limits the quality of academic research. (It’s worth a read; I’ve written about these issues from a few angles, and will probably write more at a later date.)
The major alternative to academic research that people put forward is focused research organizations (FROs): large, non-profit research organizations capable of tackling big unsolved problems. These organizations, similar in scope and ambition to e.g. CERN or LIGO, are envisioned to operate with a budget of $20–100M over five years, making them substantially larger and more expensive than a single academic lab. This model is still being tested, but it seems likely that some version of FROs will prove effective for appropriately sized problems.
But FROs have some disadvantages, too: they represent a significant investment on the part of funders, and so it’s important to choose projects where there’s a high likelihood of impact in the given area. (In contrast, it’s expected that most new academic labs will focus on high-risk projects, and pivot if things don’t work out in a few years.) In this piece, I propose a new form of scientific organization that combines aspects of both FROs and academic labs: for-profit micro focused research organizations (FPµFROs).
The key insight behind FPµFROs is that existing financial markets could be used to fund scientific research when there is a realistic possibility for profit as a result of the research. This means that FPµFROs need not be funded by the government or philanthropic spending, but could instead raise capital from e.g. venture capitalists or angel investors, who have access to substantially more money and are used to making high-risk, high-reward investments.
FPµFROs would also be smaller and more nimble than full-fledged FROs, able to tackle high-risk problems just like academia. But unlike academic labs, FPµFROs would be able to spend more freely and hire more aggressively, thus circumventing the human capital issues that plague academic research. While most academic labs are staffed entirely with inexperienced trainees (as Rodriques notes above), FPµFROs could hire experienced scientists, engineers, and programmers, thus accelerating the rate of scientific progress.
One limitation of the FPµFRO model is that research would need to be profitable within a reasonable time frame. But this limitation might actually be a blessing in disguise: the need for profitability means that FPµFROs would be incentivized to provide real value to firms, thus preventing useless research through the magic of Adam Smith’s invisible hand.
Another disadvantage of FPµFROs is that they must be able to achieve success with relatively little funding (probably around $10M; big for academia, but small compared to a FRO). This means that their projects would have to be modest in scope. I think this is probably a blessing in disguise, though. Consider the following advice from Paul Graham:
Empirically, the way to do really big things seems to be to start with deceptively small things.… Maybe it's a bad idea to have really big ambitions initially, because the bigger your ambition, the longer it's going to take, and the further you project into the future, the more likely you'll get it wrong.
Thus, the need for FPµFROs to focus on getting a single “minimum viable product” right might be very helpful, and could even lead to more impactful firms later on.
In conclusion, FPµFROs could combine the best qualities of academic labs and FROs: they would be agile and risk-tolerant, like academic labs, but properly incentivized to produce useful research instead of publishing papers, like FROs. This novel model should be investigated further as a mechanism for generating new scientific discoveries at scale with immediate short-term utility.
Hopefully it’s clear by now that this is a joke: an FPµFRO is just a startup.
The point of this piece isn’t to criticize FROs or academia: both have their unique advantages relative to startups, and much has been written about the relative advantages and disadvantages of different sorts of research institutions (e.g.).
Rather, I want to remind people that startups can do really good scientific work, something that many people seem to forget. It’s true that basic research can be a public good, and something that’s difficult to monetize within a reasonable timeframe. But most research today isn’t quite this basic, which leads me to suspect that many activities today confined to academic labs could be profitably conducted in startups.
Academics are generally very skeptical of organizations motivated by profit. But all incentives are imperfect, and the drive to achieve profitability pushes companies to provide value to real customers, which is more than many academics motivated by publication or prestige ever manage to achieve. It seems likely that for organizations focused on applied research, profit is the least bad incentive.
I’ll close with a quote from Eric Gilliam’s recent essay on a new model for “deep tech” startups:
Our corporate R&D labs in most industries have taken a step back in how “basic” their research is. Meanwhile, what universities call ‘applied’ research has become much less applied than it used to be. This ‘middle’ of the deep tech pipeline has been hollowed out.
What Eric proposes in his piece, and what I’m arguing here, is that scientific startups can help fill this void: not by replacing FROs and academic research, but by complementing them.
Thanks to Ari Wagen for reading a draft of this piece.

The concept of pKa is introduced so early in the organic chemistry curriculum that it’s easy to overlook what a remarkable idea it is.
Briefly, for the non-chemists reading this: pKa is defined as the negative base-10 logarithm of the acidity constant of a given acid H–A:
pKa := -log10([A-][H+]/[HA])
Unlike pH, which describes the acidity of a bulk solution, pKa describes the intrinsic proclivity of a molecule to shed a proton—a given molecule in a given solvent will always have the same pKa, no matter the pH. This makes pKa a very useful tool for ranking molecules by their acidity (e.g. the Evans pKa table).
The claim implicit in the definition of pKa is that a single parameter suffices to describe the acidity of each molecule.1 In general, this isn’t true in chemistry—there’s no single “reactivity” parameter which describes how reactive a given molecule is. For various regions of chemical space a two-parameter model can work, but in general we don’t expect to be able to evaluate the efficacy of a given reaction by looking up the reactivity values of the reactants and seeing if they’re close enough.
Instead, structure and reactivity interact with each other in complex, high-dimensional ways. A diene will react with an electron-poor alkene and not an alcohol, while acetyl chloride doesn’t react with alkenes but will readily acetylate alcohols, and a free radical might ignore both the alkene and the alcohol and abstract a hydrogen from somewhere else. Making sense of this confusing morass of different behaviors is, on some level, what organic chemistry is all about. The fact that the reactivity of different functional groups depends on reaction conditions is key to most forms of synthesis!
But pKa isn’t so complicated. If I want to know whether acetic acid will protonate pyridine in a given solvent, all I have to do is look up the pKa values for acetic acid and pyridinium (pyridine’s conjugate acid). If pyridinium has a higher pKa, protonation will be favored; otherwise, it’ll be disfavored. More generally, one can predict the equilibrium distribution of protons amongst N different sites from a list of the corresponding pKas.
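As a concrete sketch of that bookkeeping (using approximate aqueous pKa values; in another solvent you’d substitute the appropriate numbers):

```python
# Will acetic acid protonate pyridine?
#   AcOH + pyridine <=> AcO- + pyridinium
#   K_eq = Ka(AcOH) / Ka(pyridinium) = 10 ** (pKa(pyridinium) - pKa(AcOH))
pKa_acetic_acid = 4.76   # approximate value in water
pKa_pyridinium = 5.23    # approximate value in water

K_eq = 10 ** (pKa_pyridinium - pKa_acetic_acid)
print(f"K_eq ≈ {K_eq:.1f}")   # > 1, so proton transfer to pyridine is mildly favorable
```

The same exponent arithmetic extends to distributing protons among N sites: at equal base concentrations, the protonated population of each site scales as 10 raised to the pKa of its conjugate acid.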
Why is pKa so well-behaved? The key assumption underlying the above definition is that ions are free and do not interact with one another. This allows us to neglect any specific ion–ion interactions, and makes the scale universal: if the pyridinium cation and the acetate anion never interact, then I can learn everything I need to about pyridinium acetate just by measuring the pKas of pyridine and acetic acid in isolation.
This assumption is quite good in solvents like water or DMSO, which excel at stabilizing charged species, but progressively breaks down as one travels to the realm of nonpolar solvents. As ions start to pair with one another, specific molecule–molecule interactions become important. The relative sizes of the ions can matter: in a nonpolar solvent, a small anion will be better stabilized by a small cation than by a large, diffuse one, meaning that e.g. acetic acid will appear more acidic when protonating smaller bases. Other more quotidian intermolecular interactions, like hydrogen bonding and π-stacking, can also play a role.
And the ions aren’t the only thing that can stick together: aggregation of acids is often observed in nonpolar solvents. Benzenesulfonic acid forms a trimer in benzonitrile solution, which is still pretty polar, and alcohols and carboxylic acids are known to aggregate under a variety of conditions as well.2 Even seemingly innocuous species like tetrabutylammonium chloride will aggregate at high concentrations (ref, ref).
To reliably extend pKa scales to nonpolar solvents, one must thus deliberately choose compounds which resist aggregation. As the dielectric constant drops, so does the number of such compounds. The clearest demonstration of this I’ve found is a series of papers (1, 2) by pKa guru Ivo Leito measuring the acidity of various fluorinated compounds in heptane.
This effort, while heroic, demonstrates the futility of measuring pKa in nonpolar media from the standpoint of the synthetic chemist. If only weird fluoroalkanes engineered not to aggregate can have pKa values, then the scale may be analytically robust, but it’s hardly useful for designing reactions!
The key point here is that the difficulty of measuring pKa in nonpolar media is not an analytical barrier which can be surmounted by new and improved technologies, but rather a fundamental breakdown in the idea of pKa itself. Even the best pKa measurement tool in the world can’t determine the pKa of HCl in hexanes, because no such value exists—the concept itself is borderline nonsensical. Chloride will ion-pair with everything in hexanes, hydrogen chloride will aggregate with itself, chloride will stick to hydrogen chloride, and so forth. Asking for a pKa in this context just doesn't make much sense.3
It’s important to remember, however, that just because the pKa scale no longer functions in nonpolar solvents doesn’t mean that acids don’t have different acidities. Triflic acid in toluene will still protonate just about everything, whereas acetic acid will not. Instead, chemists wishing to think about acidity in nonpolar media have to accept that no one-dimensional scale will be forthcoming. The idealized world of pKa we’re accustomed to may no longer function in nonpolar solvents, but chemistry itself still works just fine.
Thanks to Ivo Leito for discussing these topics with me over Zoom, and to Joe Gair for reading a draft of this post.

I’ve been pretty critical of peer review in the past, arguing that it doesn’t accomplish much, contributes to status quo bias, etc. But a few recent experiences remind me of the value that peer review provides: in today’s scientific culture, peer review is essentially the only time that scientists get honest and unbiased feedback on their work.
How can this be true? In experimental science, scientists typically work alongside other students and postdocs under the supervision of a professor. This body of people forms a lab, also known as a research group, and it’s to these people that you present most frequently. Your lab generally knows the techniques and methods that you employ very well: so if you’ve misinterpreted a piece of data or designed an experiment poorly, group meeting is a great place to get feedback.
But a lab is also biased in certain ways. People are attracted to a lab because they think the science is exciting and shows promise, and so they’re likely to be credulous about positive results. Certain labs also develop beliefs or dogmas about how to conduct science: the best ways to perform a mechanistic study, or the most useful reaction conditions. To some extent, every lab is a paradigm unto itself. This means that paradigm-shifting criticism is hard to find among one’s coworkers, even if it’s common in the outside world.
Here are some examples of controversial-in-the-field statements that are unlikely to be controversial within given labs:
In each of these cases, it’s unlikely that criticism along these lines is available internally: people who’ve chosen to do their PhDs studying ML in chemistry aren’t likely to criticize your paper for overemphasizing the importance of ML in chemistry!
More generally, internal criticism works best when a lab serves as a shared repository of expertise, i.e. when everyone in the lab has roughly the same skillset. Some labs focus instead on a single overarching goal and employ many different tools to get there: a given chemical biology group might have a synthetic chemist, an MS specialist, a genomics guru, a mechanistic enzymologist, and someone specializing in cell culture. If this is the case, your techniques are opaque to your coworkers: what advice can someone who does cell culture give about improving Q-TOF signal-to-noise?
Ideally, one’s professor is well-versed enough in each of the techniques employed that he or she can dispense criticism as needed. But professors are often busy, aren’t always operational experts at each of the techniques they oversee, and suffer from the same viewpoint biases that their students do (perhaps even more so).
So, it’s important to solicit feedback from external sources. Unfortunately, at least in my experience most external feedback is too positive: “great talk,” “nice job,” etc. Our scientific culture tries so hard to be supportive that I almost never get any meaningful criticism from people outside my group, either publicly or privately. (Ideally one’s committee would help, but I never really got to present research results to my committee, and this doesn’t help postdocs anyhow.)
Peer review, then, serves as the last bastion against low-quality science: reviewers are outside the lab, have no incentive to be nice, and are tasked specifically with poking holes in your argument or pointing out extra experiments that would improve it. Peer review has improved each one of my papers, and I’m grateful for it.1
What’s a little sad is that the excellent feedback that reviewers give only comes at the bitter end of a project, which for me has often meant that the results are more than a year old and my collaborators have moved on. Much more useful would be critical feedback delivered early on in a project, when my own thinking is more flexible and the barrier to running additional experiments is lower. And more useful still would be high-quality criticism available at every step of the project, given not anonymously but by people whom you can talk to and learn from.
What might this practically look like?
I don’t know what the right solution looks like here: the burden of peer review is already substantial, and I don’t mean to suggest that this work ought to be arbitrarily multiplied for free. But I do worry that eliminating peer review, absent other changes, would simply mean that one of the only meaningful chances to get unfiltered feedback on one’s science would be eliminated, and that this would be bad.
Thanks to Croix Laconsay and Lucas Karas for helpful feedback on this piece.