EDA Scares Me

September 4, 2023

ICYMI: Ari and I announced our new company, Rowan! We wrote an article about what we're hoping to build, which you can read here. Also, this blog is now listed on The Rogue Scholar, meaning that posts have DOIs and can be easily cited.

Conventional quantum chemical computations operate on a collection of atoms and create a single wavefunction for the entire system, with an associated energy and possibly other properties. This is great, but sometimes we want to understand things in more detail. For instance, if we have a host A and two guests Bgood and Bbad, a normal calculation would just tell us that E(ABgood) is lower than E(ABbad), without giving any clue as to why.

Enter EDA. EDA, or “energy decomposition analysis,” is a family of techniques used to dissect interactions in a system with multiple molecules. In this case, an EDA calculation on the AB system would break down the interaction between A and B into various components, which could help scientists understand the origin of the difference and perhaps guide further molecular design.

Unfortunately, EDA has always seemed like a pretty troubled technique to me. Wavefunctions are inherently not localized to individual fragments of a multimolecular system—you can’t just slice apart the molecular orbitals or the density matrix and end up with anything that’s physically sane. So you have to do some computational gymnastics to get energetic terms which are at all meaningful. Many such gymnastic workflows have been proposed, leading to a veritable alphabet soup of different EDA methods.

(It’s worth skimming this review on different EDA methods to get a sense for some of the questions the field faces, and also to laugh at how Alston Misquitta & Krzysztof Szalewicz use the review as a chance to relentlessly advertise SAPT and denigrate any and all competing methods.)

I’ll briefly outline how the EDA-NCOV method works for a system AB (following this review), to give a sense for the flavor of the field:

  1. Optimized ground-state fragments A0 and B0 are distorted to the geometries and electronic states (A & B) which they possess in AB, and the energy required for this distortion/excitation is termed Eprep. (The difference between E(AB) and E(A) + E(B) is called Eint, and the total binding energy is equal to Eint + Eprep.)
  2. The distorted fragments A and B are brought together (with frozen charge densities) to form the “promolecule” AB0, and the change in energy is termed Eelstat, the quasiclassical Coulomb interaction energy (typically attractive). The wavefunction for AB0 is ΨAΨB.
  3. The product wavefunction ΨAΨB is antisymmetrized and renormalized to give an “intermediate state” Ψ0 with energy E0, and the change in energy is termed EPauli, originating from Pauli repulsion. This component is always repulsive.
  4. Ψ0 is relaxed to yield the final wavefunction ΨAB. The change in energy is termed Eorb, because it arises from orbital interactions, and is always attractive.

Thus, Eint = Eelstat + EPauli + Eorb. (Dispersion can also be added if an exogenous dispersion correction is employed—that’s pretty trivial.)
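To make the bookkeeping concrete, here is a minimal Python sketch of the four steps above, treating each term as a difference between the energies of two consecutive states. The class name, field names, and all of the numbers are hypothetical and chosen only for illustration; a real EDA-NCOV calculation gets these energies from a quantum chemistry package, not from hand-typed values.

```python
from dataclasses import dataclass


@dataclass
class StateEnergies:
    """Energies of the states visited in the four steps (any consistent unit)."""
    e_a0: float            # relaxed ground-state fragment A0
    e_b0: float            # relaxed ground-state fragment B0
    e_a: float             # fragment A in the geometry/state it has in AB
    e_b: float             # fragment B in the geometry/state it has in AB
    e_promolecule: float   # frozen-density promolecule AB0
    e_intermediate: float  # antisymmetrized, renormalized state (E0)
    e_ab: float            # fully relaxed complex AB


def decompose(s: StateEnergies) -> dict:
    """Return each term as the energy change of one step."""
    e_prep = (s.e_a + s.e_b) - (s.e_a0 + s.e_b0)   # step 1
    e_elstat = s.e_promolecule - (s.e_a + s.e_b)   # step 2
    e_pauli = s.e_intermediate - s.e_promolecule   # step 3
    e_orb = s.e_ab - s.e_intermediate              # step 4
    e_int = e_elstat + e_pauli + e_orb
    # The three terms telescope, so their sum must equal E(AB) - E(A) - E(B).
    assert abs(e_int - (s.e_ab - s.e_a - s.e_b)) < 1e-9
    return {
        "prep": e_prep,
        "elstat": e_elstat,
        "pauli": e_pauli,
        "orb": e_orb,
        "int": e_int,
        "binding": e_int + e_prep,
    }


# Hypothetical energies, invented purely to exercise the arithmetic.
example = StateEnergies(
    e_a0=-100.0, e_b0=-50.0,
    e_a=-98.0, e_b=-49.5,
    e_promolecule=-167.5,
    e_intermediate=-137.5,
    e_ab=-162.5,
)
terms = decompose(example)
for name, value in terms.items():
    print(f"E_{name}: {value:+.1f}")
```

The sketch also shows why the decomposition is path-dependent: only the endpoints are physical states, and each term is defined by whichever intermediate state the method chooses to construct.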

The critical reader might observe that the steps taken to obtain these numbers are pretty odd, and that the components of the interaction energy arise from differences in energy between bizarre nonphysical states. Thus, the interpretation of terms like Eelstat in terms of actual physical interactions might not be as easy as it seems. The authors of the above review agree:

It is important to realize that the identification of the three major terms ΔEelstat, ΔEPauli, and ΔEorb with specific interactions is conceptually attractive but must not be taken as genuine expression of the physical forces.

Unfortunately, it seems that imprecise concepts familiar to experimental chemists like “steric repulsion” and “electrostatic attraction” have to be discarded in favor of precise terms like EPauli. Too bad they’re virtually uninterpretable!

And what’s worse is that different EDA-type schemes don’t even give the same results. A paper out today in JACS from Zare/Shaik discusses the use of EDA and related schemes in studying the origin of the hydrogen bond (a pretty fundamental question), motivated by the substantial disagreement between various techniques:

It is important to realize that different methods (e.g., BOVB, ALMO-EDA, NEDA, and BLW) do not fully agree with one another about whether the dominant stabilizing term is ΔEPOL or ΔECT in a particular HB.

While the authors make a good case that the sum of these two terms is relatively conserved across methods, and that it’s this term that we should care about for hydrogen bonds, the conclusions for EDA broadly are not encouraging. (Note, too, that EPOL and ECT don’t even appear in the EDA-NCOV method summarized above—another reason that EDA is a frustrating field!)

How I feel about EDA, borrowing a meme from the AI discourse.

And even if the theorists eventually put their heads together and develop a version of EDA that doesn’t have these pitfalls, it’s still not clear that any form of EDA will give the answers that experimental chemists are looking for. Chemistry is complicated, and ground- or transition-state structures arise from a delicate equilibrium between opposing factors: steric repulsion, electrostatic attraction, bond distances, torsional strain, dispersion, &c.

As a result, one can see large changes in the contribution of individual factors even while the overall structure’s stability is minimally perturbed (enthalpy–entropy compensation is a classic example, as is Fig. 2 in this review on distortion–interaction analysis). Looking only at changes in individual factors isn’t always a useful way to gain insight from computation.

For example, imagine a nucleophile adding to two faces of an oxocarbenium, a bulky face and an unhindered face. Based on this description, we might expect to see that TSbulky has higher steric repulsion than TSunhindered (if we’re lucky enough to find a way to extract Esteric out of our EDA method). But it’s also likely that the nucleophile might take a less favorable trajectory towards the oxocarbenium in TSbulky to avoid steric repulsion, thus weakening key orbital interactions. These changes might even end up being larger in magnitude than the destabilization induced by steric repulsion. Is the correct answer, then, that TSbulky is higher in energy because of decreased Eorb, not increased Esteric?

The solution is to recognize that causation is not unique (cf. Aristotle), and so there’s no one right answer here. Within the constraints of the EDA framework, the theorist wouldn’t be incorrect in saying that Eorb is the driving factor—but the experimental chemist might reasonably expect “the bulky TS is destabilized by steric repulsion” as their answer, since this is the root cause of the changes between the two structures. (I side with the experimentalists here.)
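The oxocarbenium scenario can be put into toy numbers. The Python sketch below uses invented component energies for the two transition states (including a hypothetical "steric" term that real EDA schemes may not provide) and asks which component changes the most. It is only meant to show how "largest change" and "root cause" can point to different answers.

```python
# Hypothetical component energies for the two transition states.
# None of these values come from a real calculation.
ts_unhindered = {"steric": 10.0, "elstat": -15.0, "orb": -30.0}
ts_bulky = {"steric": 12.0, "elstat": -15.0, "orb": -27.0}


def component_changes(reference: dict, perturbed: dict) -> dict:
    """Change in each component on going from reference to perturbed."""
    return {name: perturbed[name] - reference[name] for name in reference}


changes = component_changes(ts_unhindered, ts_bulky)
total_change = sum(changes.values())

# The component with the largest absolute change is what a purely
# numerical reading of the decomposition would call the "driving factor".
largest_term = max(changes, key=lambda name: abs(changes[name]))

print(changes)
print("total change:", total_change)
print("largest change:", largest_term)
```

With these made-up values the orbital term changes more than the steric term, even though in the story the steric bulk is what forced the nucleophile onto the worse trajectory in the first place.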

And the precisely defined concepts favored by theorists are often hard for experimental scientists to work with. Even if the correct answer in the above scenario were “TSbulky is destabilized by decreased orbital overlap”—what’s an experimentalist supposed to do with this information, add more orbitals? (This is how I feel about Trevor Hamlin’s work on Pauli repulsion.) The steric explanation at least suggests an intuitive solution: make the bulky group or the nucleophile smaller. If the purpose of EDA is to help people to understand intermolecular interactions better on a conceptual level, I’m not sure it’s succeeding in most cases.

(The only use of EDA that led to an actual experimental advance which I’m aware of is Buchwald/Peng Liu’s body of work on ligand–substrate dispersion in hydrocupration: study, new ligand, ligand from Hartwig. I don’t think it’s a coincidence that these papers focus on dispersion, one of the easiest pieces of EDA to decouple and understand.)

I don’t mean to be too critical here. The ability to break intermolecular interactions down into different components is certainly useful, and it seems likely that some version of EDA will eventually achieve consensus and emerge as a useful tool. But I think the utility of EDA even in the best case is pretty limited. Quantum chemistry is complicated, and if we think we can break it down into easy-to-digest components and eliminate the full nonlocal majesty of the Schrödinger equation, we’re lying to ourselves (or our experimental collaborators). Compute with caution!


