The concept of pKa is introduced so early in the organic chemistry curriculum that it’s easy to overlook what a remarkable idea it is.
Briefly, for the non-chemists reading this: pKa is defined as the negative base-10 logarithm of the acidity constant of a given acid H–A:
pKa := -log10([A-][H+]/[HA])
Unlike pH, which describes the acidity of a bulk solution, pKa describes the intrinsic proclivity of a molecule to shed a proton—a given molecule in a given solvent will always have the same pKa, no matter the pH. This makes pKa a very useful tool for ranking molecules by their acidity (e.g. the Evans pKa table).
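For the programmers in the audience, here’s that definition as a minimal Python sketch (the function names are mine, and the acetic acid Ka below is just the usual textbook value, quoted for illustration):

```python
import math

def pka_from_ka(ka: float) -> float:
    """pKa is the negative base-10 logarithm of the acidity constant Ka."""
    return -math.log10(ka)

def pka_from_concentrations(ha: float, a_minus: float, h_plus: float) -> float:
    """Compute pKa from equilibrium concentrations (mol/L) of HA, A-, and H+."""
    ka = (a_minus * h_plus) / ha
    return pka_from_ka(ka)

# e.g. acetic acid in water: Ka is roughly 1.8e-5, giving a pKa of about 4.7
print(pka_from_ka(1.8e-5))
```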
The claim implicit in the definition of pKa is that a single parameter suffices to describe the acidity of each molecule.1 In general, this isn’t true in chemistry—there’s no single “reactivity” parameter which describes how reactive a given molecule is. For various regions of chemical space a two-parameter model can work, but in general we don’t expect to be able to evaluate the efficacy of a given reaction by looking up the reactivity values of the reactants and seeing if they’re close enough.
Instead, structure and reactivity interact with each other in complex, high-dimensional ways. A diene will react with an electron-poor alkene and not an alcohol, while acetyl chloride doesn’t react with alkenes but will readily acetylate alcohols, and a free radical might ignore both the alkene and the alcohol and abstract a hydrogen from somewhere else. Making sense of this confusing morass of different behaviors is, on some level, what organic chemistry is all about. The fact that the reactivity of different functional groups depends on reaction conditions is key to most forms of synthesis!
But pKa isn’t so complicated. If I want to know whether acetic acid will protonate pyridine in a given solvent, all I have to do is look up the pKa values for acetic acid and pyridinium (pyridine’s conjugate acid). If pyridinium has a higher pKa, protonation will be favored; otherwise, it’ll be disfavored. More generally, one can predict the equilibrium distribution of protons amongst N different sites from a list of the corresponding pKas.
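To make that arithmetic concrete, here’s a short sketch in the same spirit (again, the function name is mine; the aqueous pKa values of roughly 4.76 for acetic acid and 5.23 for pyridinium are the standard textbook numbers):

```python
def proton_transfer_keq(pka_acid: float, pka_conj_acid_of_base: float) -> float:
    """Equilibrium constant for HA + B <=> A- + BH+.

    K = Ka(HA) / Ka(BH+) = 10 ** (pKa(BH+) - pKa(HA)),
    so K > 1 whenever the base's conjugate acid has the higher pKa.
    """
    return 10 ** (pka_conj_acid_of_base - pka_acid)

# acetic acid (pKa ~4.76 in water) protonating pyridine (pyridinium pKa ~5.23 in water)
print(proton_transfer_keq(4.76, 5.23))  # ~3, so protonation is mildly favored
```

The same bookkeeping extends to N sites: every pairwise equilibrium constant is fixed by a difference of two pKas.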
Why is pKa so well-behaved? The key assumption underlying the above definition is that ions are free and do not interact with one another. This allows us to neglect any specific ion–ion interactions, and makes the scale universal: if the pyridinium cation and the acetate anion never interact, then I can learn everything I need to about pyridinium acetate just by measuring the pKas of pyridine and acetic acid in isolation.
This assumption is quite good in solvents like water or DMSO, which excel at stabilizing charged species, but it progressively breaks down as one travels to the realm of nonpolar solvents. As ions start to pair with one another, specific molecule–molecule interactions become important. The relative sizes of the ions can matter: in a nonpolar solvent, a small anion will be better stabilized by a small cation than by a large, diffuse one, meaning that e.g. acetic acid will appear more acidic when protonating smaller bases. Other more quotidian intermolecular interactions, like hydrogen bonding and π-stacking, can also play a role.
And the ions aren’t the only thing that can stick together: aggregation of acids is often observed in nonpolar solvents. Benzenesulfonic acid forms a trimer in benzonitrile solution, which is still pretty polar, and alcohols and carboxylic acids are known to aggregate under a variety of conditions as well.2 Even seemingly innocuous species like tetrabutylammonium chloride will aggregate at high concentrations (ref, ref).
To reliably extend pKa scales to nonpolar solvents, one must thus deliberately choose compounds which resist aggregation. As the dielectric constant drops, so does the number of such compounds. The clearest demonstration of this I’ve found is a series of papers (1, 2) by pKa guru Ivo Leito measuring the acidity of a series of fluorinated compounds in heptane:
This effort, while heroic, demonstrates the futility of measuring pKa in nonpolar media from the standpoint of the synthetic chemist. If only weird fluoroalkanes engineered not to aggregate can have pKa values, then the scale may be analytically robust, but it’s hardly useful for designing reactions!
The key point here is that the difficulty of measuring pKa in nonpolar media is not an analytical barrier which can be surmounted by new and improved technologies, but rather a fundamental breakdown in the idea of pKa itself. Even the best pKa measurement tool in the world can’t determine the pKa of HCl in hexanes, because no such value exists—the concept itself is borderline nonsensical. Chloride will ion-pair with everything in hexanes, hydrogen chloride will aggregate with itself, chloride will stick to hydrogen chloride, and so forth. Asking for a pKa in this context just doesn't make much sense.3
It’s important to remember, however, that just because the pKa scale no longer functions in nonpolar solvents doesn’t mean that acids don’t have different acidities. Triflic acid in toluene will still protonate just about everything, whereas acetic acid will not. Instead, chemists wishing to think about acidity in nonpolar media have to accept that no one-dimensional scale will be forthcoming. The idealized world of pKa we’re accustomed to may no longer function in nonpolar solvents, but chemistry itself still works just fine.
Thanks to Ivo Leito for discussing these topics with me over Zoom, and to Joe Gair for reading a draft of this post.

I’ve been pretty critical of peer review in the past, arguing that it doesn’t accomplish much, contributes to status quo bias, etc. But a few recent experiences remind me of the value that peer review provides: in today’s scientific culture, peer review is essentially the only time that scientists get honest and unbiased feedback on their work.
How can this be true? In experimental science, scientists typically work alongside other students and postdocs under the supervision of a professor. This body of people forms a lab, also known as a research group, and it’s to these people that you present most frequently. Your lab generally knows the techniques and methods that you employ very well: so if you’ve misinterpreted a piece of data or designed an experiment poorly, group meeting is a great place to get feedback.
But a lab is also biased in certain ways. People are attracted to a lab because they think the science is exciting and shows promise, and so they’re likely to be credulous about positive results. Certain labs also develop beliefs or dogmas about how to conduct science: the best ways to perform a mechanistic study, or the most useful reaction conditions. To some extent, every lab is a paradigm unto itself. This means that paradigm-shifting criticism is hard to find among one’s coworkers, even if it’s common in the outside world.
Here are some examples of controversial-in-the-field statements that are unlikely to be controversial within given labs:
In each of these cases, it’s unlikely that criticism along these lines is available internally: people who’ve chosen to do their PhDs studying ML in chemistry aren’t likely to criticize your paper for overemphasizing the importance of ML in chemistry!
More generally, internal criticism works best when a lab serves as a shared repository of expertise, i.e. when everyone in the lab has roughly the same skillset. Some labs focus instead on a single overarching goal and employ many different tools to get to that point: a given chemical biology group might have a synthetic chemist, an MS specialist, a genomics guru, a mechanistic enzymologist, and someone specializing in cell culture. If this is the case, your techniques are opaque to your coworkers: what advice can someone who does cell culture give about improving Q-TOF signal-to-noise?
Ideally, one’s professor is well-versed enough in each of the techniques employed that he or she can dispense criticism as needed. But professors are often busy, aren’t always operational experts at each of the techniques they oversee, and suffer from the same viewpoint biases that their students do (perhaps even more so).
So, it’s important to solicit feedback from external sources. Unfortunately, at least in my experience, most external feedback is too positive: “great talk,” “nice job,” etc. Our scientific culture tries so hard to be supportive that I almost never get any meaningful criticism from people outside my group, either publicly or privately. (Ideally one’s committee would help, but I never really got to present research results to my committee, and this doesn’t help postdocs anyhow.)
Peer review, then, serves as the last bastion against low-quality science: reviewers are outside the lab, have no incentive to be nice, and are tasked specifically with poking holes in your argument or pointing out extra experiments that would improve it. Peer review has improved each one of my papers, and I’m grateful for it.1
What’s a little sad is that the excellent feedback that reviewers give only comes at the bitter end of a project, which for me has often meant that the results are more than a year old and my collaborators have moved on. Much more useful would be critical feedback delivered early on in a project, when my own thinking is more flexible and the barrier to running additional experiments is lower. And more useful still would be high-quality criticism available at every step of the project, given not anonymously but by people whom you can talk to and learn from.
What might this practically look like?
I don’t know what the right solution looks like here: the burden of peer review is already substantial, and I don’t mean to suggest that this work ought to be arbitrarily multiplied for free. But I do worry that eliminating peer review, absent other changes, would simply mean that one of the only meaningful chances to get unfiltered feedback on one’s science would be eliminated, and that this would be bad.
Thanks to Croix Laconsay and Lucas Karas for helpful feedback on this piece.
I first encountered organic chemistry on Wikipedia, my freshman year of high school. The complexity and arcanity of the field instantly attracted me: here was something interesting that I didn’t know about and which didn’t require years of mathematical training to approach (unlike most of physics).
I soon started reading about organic chemistry more and more, albeit with no rhyme or reason to my study. I didn’t know what the good textbooks were, what order to study things in or which concepts ought to be understood in depth before progressing further. Organic chemistry was just a “glass bead game” to me, an art of symbols devoid of any real-world representations. But enthusiasm can sometimes suffice where wisdom is lacking.
With the support of a teacher, I started a little independent study with a half-dozen friends who also wanted to learn more organic chemistry. We met in a closet, read a textbook, and worked through the problems, and our teacher wrote us tests. We eventually managed to get through all of Paula Bruice, although various misconceptions (and mispronunciations) stayed with me until I took Movassaghi’s course at MIT.
But in high school we had no lab, and thus no practical knowledge. We split into two groups and tried to come up with experiments for ourselves, but the results were dismal. Here’s the procedure (copied verbatim) for the one and only reaction we ran, synthesis of nitrobenzene via nitration of benzene, which was to be the first step in a multistep synthesis of Kevlar:
In a 250 mL beaker, dissolve benzene in a solution of concentrated H2SO4 of twice its volume. Use an ice-salt bath to bring this solution to 0 °C or below, and use a Pasteur pipette to slowly add a 1:1 solution of HNO3 and H2SO4 (be sure to keep the solution below 15 degrees Celsius at all times, as the reaction is strongly exothermic). Once all the solution has been added, warm it to room temperature and allow it to sit for 15 minutes.
Pour the solution over 50 g of crushed ice in a 250 mL beaker. Once the ice has melted, isolate the product via vacuum filtration via a Buchner funnel and rinse it twice with water and twice with methanol. Recrystallize in a solution of methanol.
The astute observer will notice that there aren’t very many details here. How much benzene? How much nitric acid? We didn’t have the glassware mentioned above—neither a beaker nor a Buchner funnel. And, perhaps most damningly, the procedure calls for isolation by filtration, which is challenging since nitrobenzene is a liquid at room temperature.
Despite these problems, we successfully ran this reaction (open to air, not in a fume hood), and obtained the product. I vividly recall the yellow bubbles of nitrobenzene floating to the top of the vial, and the smell of cherries that filled the room, a smell that returns to me in Proustian fashion from time to time when using certain reagents. We didn’t have a separatory funnel (or we didn’t know how to use it if we did), so we fished some of the nitrobenzene out with a Pasteur pipette and threw the rest away.
A little bit of knowledge would have served us well, as would have gloves and a fume hood. Here’s Wikipedia:
Prolonged exposure [to nitrobenzene] may cause serious damage to the central nervous system, impair vision, cause liver or kidney damage, anemia and lung irritation. Inhalation of vapors may induce headache, nausea, fatigue, dizziness, cyanosis, weakness in the arms and legs, and in rare cases may be fatal…
Nitrobenzene is considered a likely human carcinogen by the United States Environmental Protection Agency, and is classified by the IARC as a Group 2B carcinogen which is "possibly carcinogenic to humans".
Indeed, a coworker of mine would later be sent to the hospital after spilling nitrobenzene on herself. Surprisingly, my group’s experiment still ended up being the safer one—the lab portion of our course was disbanded the following day after my classmates caused an explosion with thionyl chloride.
I was fortunate enough to land a summer research internship in the Sessler lab at UT the following summer, and I started studying chemistry in earnest: column chromatography, Anslyn/Dougherty, NMR spectroscopy, and all the rest. I remember the first reaction I ran at UT: retro-Friedel–Crafts dealkylation of a calix[4]arene, using about 10 g each of phenol and aluminum(III) chloride. The brutal physicality of lab work was a nice contrast to the gnosticism of software (where I’d worked previously), and I was hooked.
I’ll fast-forward through the more recent parts of my chemical career: I went to MIT and joined the Imperiali lab to work on what was essentially a medicinal chemistry project: hit-to-lead optimization, featuring lots of de novo heterocycle synthesis. I got to cook reactions in molten urea, quench 500 mL of phosphorus oxychloride at a time, and even design new routes (with a little oversight from my postdoc). It was awesome.
After three semesters, I got tired of my cross couplings mysteriously failing and joined the Buchwald lab, where I learned how to do chemistry more carefully: handling air-sensitive materials with a Schlenk line or in the glovebox, not “as fast as possible.” My tenure in the Buchwald lab also introduced me to the importance of computations, which became a key part of my doctoral work, particularly when we were sent home in March of my first year. COVID gave me the opportunity to pursue some software engineering projects that I wouldn’t have had time to work on otherwise (like cctk and presto), and simulation started to take up more and more of my time and intellectual bandwidth.
I defended my dissertation on June 5th, and cleaned out my hood last week. For the foreseeable future, I’ll be a purely computational chemist—computation is advancing quickly, and I think that’s where I have the most to offer the field right now. But I’ll miss the sights and sounds of the lab. There’s a satisfaction to making a new molecule and holding the final product in your hands: the knowledge that you’ve reshaped this little corner of reality through your own actions, and that this particular arrangement of matter has never existed before.
And simulation is, at the end of the day, only useful insofar as it helps us make real molecules. There may be people who wish to model reactions purely for the sake of modeling them, but I am not one of them. What drew me to simulation initially, and what still attracts me to the field today, is its potential to help experimental scientists do their work faster and better. It was easy for me to ensure that this was true when I was both doing the experiments and running the simulations; my incentives were aligned properly. It will be harder in the future.
There’s a seductive appeal in leaving lab work behind altogether, too, and one that’s dangerous. Any experimentalist who’s worked with a computational collaborator knows that nothing ever works precisely as modeled. There are untold depths of chemical behavior still inaccessible to the idealized world of simulation, and it’s all too easy for computational chemists just to look the other way. Life is easier when you don’t have to deal with sludgy workups or poorly soluble intermediates, but they don’t go away just because you can’t model them by DFT—reality has a way of keeping us honest that simulations frequently lack.
So, although I bid lab farewell at this point in my career, it’s a bittersweet parting. I hope to return someday; only time will tell.
Thanks to Jacob Thackston for reading a draft of this piece.

I started this blog one year ago today, with a post on site-selective glycosylation. According to Google Analytics, there have been 24,035 views since then.
What have the top posts been?
The only one of these that really surprises me is #5: the 13C NMR post made a lot of organic chemists really angry, the Talent review was reposted on Marginal Revolution, and Delian Asparouhov (CEO of Varda) retweeted my post about Varda’s crystallization ideas. And everyone loves to share a ranking of the year’s papers, especially when their own work is highlighted.
The least-viewed posts?
Twitter downranked the Substack post pretty heavily, so it’s not surprising that nobody saw it. The Lennard–Jones posts are more unexpected: anything computational or coding-related seems to attract much less engagement, perhaps a reflection of the fact that most of my followers are experimental chemists who don’t really care about obfuscated C code.
A year in, writing blog posts has gotten much easier. The following advice from Alexey Guzey didn’t seem true when I started, but it does seem true now:
Writing not only helps you to understand what’s going on and to crystallize your thoughts, it actually makes you think of new ideas and come up with solutions to your problems.
I’ve fallen into a 1x/week update schedule, which seems to work pretty well: enough to keep the routine up, but not so much that it’s a serious distraction from my actual job. I hope to maintain this schedule for the foreseeable future, and recommend it to other bivocational bloggers.
Anyhow, thanks for reading!
(Also, today in off-blog content: I appeared on my first podcast, Forbidden Conversations, hosted by Harry Wetherall. We talk about why people don’t have kids earlier, how I reconcile being a Christian with being a scientist, the concept of “cope,” and more: you can check it out on Apple Podcasts or Spotify.)