Knowable Magazine · Structural biology: How proteins got their close-up


Every living thing, from bacteria to our own bodies, is made up of cells. And those cells are built from four kinds of large biological molecules: carbohydrates, fats, nucleic acids (that’s DNA and RNA) and proteins. These vital components of life are too small to be seen by the naked eye, or even by a light microscope. So even though 19th-century scientists knew these “invisible” molecules were there, and could do experiments to work out their chemical constituents, they couldn’t see them; they couldn’t make out their shapes in any detail. This is the story of how the invisible became visible in the 20th century.

It’s the story of a long, laborious slog to develop the tools and the techniques that would reveal the structure of biological molecules — and how seeing the structure of these molecules enabled us to understand how they work and to design drugs that block or enhance their actions.

This is Knowable and I’m Charlotte Stoddart.

To tell this story, we’re focusing on proteins. These large molecules facilitate just about every chemical process in our bodies: They “read” the genetic code, they catalyze reactions, they act as the gatekeepers to our cells. Proteins are made up of chains of small molecules called amino acids. Knowing how these chains fold up to create a three-dimensional structure is crucial, because it’s the 3D shape of proteins that determines how they work.

To create an accurate 3D model of a protein, we need to know the arrangement in space of all of the atoms in all of the amino acids that make up that protein. We can’t see atoms because they’re smaller than the wavelength of visible light. To detect them, we need a different kind of wave — a wave with a shorter wavelength and one that can penetrate surfaces to show us not just the atoms on the outside but also the atoms within the protein.

And so our story begins with the discovery of X-rays in a university town called Würzburg, in Germany.

It’s 1895 and Wilhelm Röntgen is in the lab. Like many physicists of his generation, he’s experimenting with cathode rays — streams of electrons produced in a device called a Crookes tube. But unlike his contemporaries, Röntgen notices something unexpected: a screen quite some distance from the Crookes tube is glowing — too far away to have been caused by cathode rays, he thinks. Over the next weeks he studies this glowing fluorescence and realizes that he’s found a new kind of ray that can penetrate solid objects. Just before Christmas, he brings his wife into the lab to take a photograph of her hand. In the photograph, her bones and ring — but not her flesh — are clearly visible.

Röntgen writes an account of his findings and, in early 1896, an English translation is published in the journal Nature:

It is seen, therefore, that some agent is capable of penetrating black cardboard which is quite opaque to ultra-violet light, sunlight, or arc-light. It is therefore of interest to investigate how far other bodies can be penetrated by the same agent.

The report continues:

Thick blocks of wood are still transparent. Boards of pine two or three centimetres thick absorb only very little. A piece of sheet aluminium, 15 mm. thick, still allowed the X-rays (as I will call the rays, for the sake of brevity) to pass, but greatly reduced the fluorescence.

Röntgen’s discovery had an immediate impact. Within months, doctors were using X-rays to photograph broken bones. Poems were written about them and the “wondrous” X-rays became a popular attraction at exhibitions. And, in 1901, Röntgen was awarded the very first Nobel Prize in Physics for his discovery — the first of many Nobel Prizes awarded to scientists in this story.

Meanwhile, back in labs, physicists puzzled over the nature of X-rays: Were they waves or particles? If X-rays were waves, reasoned Max von Laue, another German physicist, then their wavelength might be similar to the regular spaces between atoms in a crystal, providing a way to decipher the structure of crystals. This was a very important insight. It led to the development of X-ray crystallography, the technique that would eventually enable scientists to figure out the structure of crystallized proteins. But it took several decades to get to that point. At first, X-ray crystallography was applied to much smaller molecules. And before that, the technique itself had to be figured out.

In the summer of 1912, mathematician and physicist William Bragg and his son Lawrence — also a physicist — were on holiday by the coast in Britain when they heard about a lecture given by von Laue. After the holiday, father and son returned to their universities and thought about the diffraction of X-rays by crystals. Later that year, William Bragg wrote to the journal Nature. He began by describing the remarkable effects obtained by passing…

… a fine stream of X-rays through a crystal before incidence upon a photographic plate. A curious arrangement of spots is found upon the plate, some of them so far removed from the central spot that they must be ascribed to rays which make large angles…

These are the X-rays that are scattered by the atoms in the crystal, causing a distinctive pattern of spots on the photographic plate.

The positions of these spots seem to depend on simple numerical relations, and on the mode in which the crystal presents itself to the incident stream. I find that when the crystal (zincblende) is placed so that the incident rays are parallel to an edge of the cube in the crystal the positions of the spots are to be found by the following simple rule. The atoms being assumed to be arranged in rectangular fashion, any direction which joins an atom to a neighbour at a distance na from it, where a is the distance from the atom to the nearest neighbours and n is a whole number…

The mathematical rule hit upon by the Braggs provided a way to interpret the diffraction patterns produced by the X-rays, thus revealing the arrangement of atoms in the crystal.
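A short aside for readers of this transcript, not part of the broadcast: the relationship that grew out of the Braggs' work is usually written today as Bragg's law, n·λ = 2·d·sin(θ), where λ is the X-ray wavelength, d is the spacing between planes of atoms, θ is the glancing angle of the beam and n is a whole number. The sketch below only illustrates that arithmetic. The function names are made up for this example, and the numbers (a wavelength of about 1.54 ångströms and a spacing of about 2.82 ångströms, roughly that of table salt) are illustrative assumptions rather than values from the episode.

```python
import math


def bragg_angle_deg(wavelength, d_spacing, order=1):
    """Glancing angle theta (degrees) that satisfies n*lambda = 2*d*sin(theta).

    Returns None when no reflection of that order can occur
    (n*lambda would have to exceed 2*d).
    """
    s = order * wavelength / (2.0 * d_spacing)
    if s > 1.0:
        return None
    return math.degrees(math.asin(s))


def plane_spacing(wavelength, theta_deg, order=1):
    """Invert Bragg's law: spacing d from a measured glancing angle."""
    return order * wavelength / (2.0 * math.sin(math.radians(theta_deg)))


if __name__ == "__main__":
    wavelength = 1.54  # angstroms; illustrative assumption
    spacing = 2.82     # angstroms; illustrative assumption (roughly table salt)
    for n in (1, 2, 3, 4):
        theta = bragg_angle_deg(wavelength, spacing, order=n)
        if theta is None:
            print(f"order {n}: no reflection possible")
        else:
            print(f"order {n}: theta = {theta:.2f} degrees")
```

Read in the other direction, with `plane_spacing`, the same rule is what lets a measured spot position be turned back into a distance between atoms, which is the sense in which the pattern reveals the arrangement of atoms in the crystal.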

William Bragg devised a new, more powerful method for carrying out X-ray diffraction, inventing an instrument called the X-ray spectrometer.

In 1914 von Laue was awarded a Nobel Prize for his work. The following year, William and Lawrence also got the gong. Lawrence, only 25 at the time, is still the youngest scientist to receive a Nobel Prize.

At first, the Bragg method was applied to simple substances such as table salt, benzene and sugar molecules, revealing the secrets of their structures. Many scientists were skeptical that something as complicated as a protein structure could ever be determined in this way. In 1936, the progress of X-ray studies was discussed in the Annual Review of Biochemistry.

For such crystalline substances as the sugars and amino acids complete knowledge of the crystal structure would show the arrangement of the atoms within the molecule as well as the arrangement of the molecules within the crystal; but for substances such as the polysaccharides and the proteins, in which a less regular arrangement of the atoms is accompanied by the lack of a common crystalline appearance, such complete knowledge is not to be hoped for.

But a few years later, in 1939, a more optimistic view was put forward. Techniques like X-ray crystallography, the author noted, were changing biology profoundly. The author seems quite giddy as he considers the possibilities.

Biology is fast becoming a molecular science, a desire to tread as far as possible the friendly ground of physics and chemistry and see where it leads. It may be that the angels are right, but it is good to feel and take part in a foolishness that is the scientific hall-mark of our times. The search is now for the structure and arrangement of the molecules of living things. Chief among these molecules are the proteins, and the greatest excitement these days is about the proteins.

To tackle proteins, several advances were needed: better ways of coaxing proteins into crystals; new mathematical methods for interpreting diffraction patterns; and computers for crunching the data. Scientists in Cambridge in the UK were working on all of these challenges.

In 1953 the field got a boost when X-ray crystallography was used to solve an extremely significant structure. It wasn’t a protein — it was DNA, for which James Watson, Francis Crick and Maurice Wilkins later received a Nobel Prize.

Working alongside Watson and Crick in Cambridge was John Kendrew, a highly motivated researcher who was determined to solve the structure of the protein myoglobin. Myoglobin is the protein that holds oxygen in muscles. Kendrew chose it because it’s not too big. His first challenge was to grow crystals suitable for X-ray analysis. After trying to crystallize myoglobin from horse, porpoise, seal, dolphin, penguin, tortoise and carp, he finally managed to grow beautiful crystals of myoglobin extracted from sperm whale meat. 

Meanwhile, Kendrew’s colleague, Max Perutz, developed a technique for adding “heavy” atoms to protein molecules. The heavy atoms don’t change the structure of the protein, but they provide a frame of reference for comparing X-ray photographs taken from different angles. After years of work, Kendrew still didn’t know the precise position of every single atom in myoglobin, but he finally knew enough to make a 3D model of the protein. It wasn’t as pretty as DNA’s double helix; it looked more like a coiled sausage.

It was around this time that Richard Henderson joined the group. Henderson is still working on protein structure determination in Cambridge today and is known for pioneering new techniques, which we’ll hear about later. But back then he’d just graduated and was looking for a PhD position. He remembers traveling from Edinburgh to Cambridge to visit the lab:

Richard Henderson: “They had an open day, which was Saturday morning, and they’re all working! Whereas everywhere else I’d been, you know, they went home or they weren’t highly motivated. So I said, ‘Oh, this is a very good lab.’”

Henderson joined the hardworking team in Cambridge. The work was exciting but extremely slow.

Richard Henderson: “They got the myoglobin structure at very high resolution, 1959, really, 1960 published, and then there wasn’t another structure for five years, which was lysozyme at the Royal Institution in London. Then after that, it was another three years until the third structure.”

The researchers put in long hours, so why was progress so slow? The small molecules that X-ray crystallographers had worked on first — things like benzene and sugar rings — contained fewer than 50 atoms. By contrast, myoglobin, a relatively small protein, contains over a thousand atoms. To figure out the position of that many atoms, they had to take hundreds of X-ray photographs, measure the intensity of each spot in each photograph and perform tedious calculations. It was a massive data-handling challenge.

Richard Henderson: “In my PhD, I took about 300 of these procession photographs, and initially you had to measure them by hand: So you put the film in a film scanner, and a beam of light moved along the row of spots, and then you got every, say, three minutes, you got a piece of paper with the trace on, with maybe 40 spots on it, and you measured the strength of the diffraction spot with a ruler on a piece of paper and then you typed that number onto a computer paper — and that was just one row of spots.”

It was hugely time-consuming. Researchers gradually figured out how to automate parts of the process, inventing automatic X-ray detectors and instruments to speed up the measurement of spots. Kendrew realized that the calculations needed to solve a structure might be done by a computer. Fortuitously, one of the first electronic stored-program computers had just been built in the Cambridge Mathematics Lab. It was known as EDSAC, and Kendrew learnt how to program it. As more powerful computers became available, the X-ray crystallographers made use of them. Henderson recalls that in the 1960s, they traveled to London to use the IBM 7090 at Imperial College. The Cambridge team had access to this computer for one hour a day.

Richard Henderson: “And so every afternoon at 4 o’clock a taxi came and took somebody to the train station in Cambridge with boxes of punched computer cards. They got on the train to London, got on the Underground, walked in the tunnel between South Kensington Station and Imperial College — there was about half a mile or so — carrying all these heavy boxes. And then from 7 till 8 o’clock in the evening the MRC programs from Cambridge were run on the computer and then the person taking it — and most of them were young women who’d been recruited; they were called ‘computer girls’ at the time, they’re all now computer managers, they’ve done really well — they would bring the paper output back. And the next morning at 9 a.m., everybody would examine their work from the previous day, and get ready for the 4 p.m. run.”

No wonder this was slow work! Women weren’t only carrying boxes of computer code across London, they were also doing X-ray crystallography. At King’s College London, Rosalind Franklin produced X-ray diffraction patterns of DNA. Her pictures enabled Watson and Crick to make their famous model. In Oxford, Dorothy Hodgkin solved the structure of penicillin and later worked on other medically important molecules, including vitamin B12 and insulin. She was awarded a Nobel Prize in 1964. Yet another Nobel Prize for the field!

As more computers became available and computing power increased, more structures were solved. Continuing advances in computing are another theme to which we will return.

Excitement about the new field of structural biology was growing. Some scientists believed that eventually they wouldn’t even need X-ray crystallography to figure out the structure of proteins.

Hopes have even been raised that it will someday be possible to deduce conformations solely from amino acid sequence.

That was written in 1965, in the Annual Review of Biochemistry. The idea was that if you knew the sequence of amino acids in the unfolded protein chain, then by following simple rules governing how atoms and molecules interact, you could work out how the chain would fold up.

Chemist Christian Anfinsen repeated this claim in his Nobel Prize lecture in 1972: 

Empirical considerations of the large amount of data now available on correlations between sequence and three dimensional structure, together with an increasing sophistication in the theoretical treatment of the energetics of polypeptide chain folding are beginning to make more realistic the idea of the a priori prediction of protein conformation.

It was an attractive idea. If computers could be programmed with the rules of protein folding and amino acid sequences inputted, then structures might be solved in days rather than years, providing an alternative to expensive and time-consuming experimental methods.

But not yet. For something like that to happen, biologists first had to solve the structures of a lot more proteins by using and improving X-ray crystallography. And by inventing new ways of seeing proteins. And this work would lead to more Nobel Prizes.

In the final weeks of 1999, biochemist Roger Kornberg was reaching the culmination of over a decade of work. He was at the Stanford Synchrotron Radiation Laboratory, getting results that would at last show him the structure of the protein he’d been working on.

Roger Kornberg: “When we began, it was far from clear that it could be done. It was, of course, cause for relief from the fear we would perhaps never succeed, and exhilaration at the final result.”

Kornberg and his team had solved the structure of RNA polymerase. It was a huge achievement and one that was recognized with, yup, another Nobel Prize.

Roger Kornberg: “So at the time when we solved that structure, which was 20 years ago, it was by far the largest and most challenging investigated by X-ray diffraction.”

RNA polymerase is arguably the most important protein in biology. It was a challenge because it’s not a single protein. The team studied RNA polymerase from yeast, which is actually made up of 12 proteins. What’s more, it’s a molecular machine with moving parts.

Roger Kornberg: “The RNA polymerase literally reads the genetic information. So it is responsible for the capacity of what information is stored in the genome in DNA to direct the activities of every living thing. There is no organism as simple as a virus or complicated as a human that doesn’t rely on an RNA polymerase for life.”

To solve the structure of RNA polymerase, Kornberg and his team spent years working on the right kind of crystals and “heavy” atoms for their protein. But that wasn’t enough. They also needed more intense beams of X-rays.

Roger Kornberg: “The method of X-ray diffraction relies upon scattering of the X-ray photons from the individual atoms in the structure — and the greater the number of atoms, the larger the number of scattered photons that must be recorded for the purpose. If the beam is of low intensity, there are not many photons and so insufficient information is obtained. With a beam of higher intensity, more atoms can be detected and recorded.”

The solution came from synchrotrons. Synchrotrons are particle accelerators that propel beams of electrons at high speed — and the high-speed electrons emit X-rays that are millions of times brighter than conventional X-rays. It’s essentially a more powerful and much larger version of the Crookes tube that Röntgen was using when he discovered X-rays.

The combination of high intensity X-rays from synchrotrons and increasing computer power enabled scientists like Kornberg to solve more complex protein structures.

When I was working at the journal Nature from 2007 to 2019, we used to joke about the number of structural-biology papers: there seemed to be a new, important protein structure published every week.

But there were limitations. X-ray crystallography was still time-consuming, although not as much as in the early days. And some types of protein proved hard or impossible to crystallize.

At the turn of the century, a new technique came into view. Or, rather, a new technique gave scientists a new view of proteins. Instead of using X-rays, the technique uses beams of electrons. It’s called cryo-EM. Cryo, because the protein sample is frozen. EM for electron microscopy. Richard Henderson was one of the first to use it.

Richard Henderson: “When you irradiate anything, whether it’s with X-rays or electrons, in addition to giving you a beautiful image, you are actually damaging the molecules, and after a certain exposure the molecule has lost its structure, so you’re limited in the amount of information you can get before you have to stop, because you’ve killed your sample. And it turns out that for the same amount of information that’s useful, the electrons do about a thousand times less damage than X-rays.”

For cryo-EM, the protein doesn’t need to be a crystal. Instead, it is isolated from the cell and then frozen to liquid nitrogen temperature or below. The freezing helps to protect the protein from radiation damage.

Henderson applied the technique to proteins embedded in cell membranes. These large protein complexes had proved extremely hard to study by X-ray crystallography. Cryo-EM became hugely popular. In the 2010s, scientists talked about a “cryo-EM revolution” and many switched from X-ray crystallography to the new, faster technique. In 2017, Richard Henderson was awarded a Nobel Prize.

Like X-ray crystallography, cryo-EM became a more powerful tool as computing power increased, enabling more data to be analyzed more quickly. Roger Kornberg again:

Roger Kornberg: “One cannot underestimate the contribution made by the extraordinary advance in computing power. To put it in perspective, in respect to RNA polymerase, when we recorded the X-ray diffraction from RNA polymerase at the end of 1999 to solve the structure, it required more than a month of computation on advanced computers made available to us not commercially available, contributed by the manufacturers. Today, that same computation could be performed in a few minutes on a laptop computer.”

Computers have been key to the successes of both X-ray crystallography and cryo-EM. Can we now do away with these experimental techniques altogether and just use computing power to predict the structure of proteins? Remember the challenge set by Christian Anfinsen in his Nobel lecture?

… to make more realistic the idea of the a priori prediction of protein conformation.

To predict how a string of amino acids will fold up, scientists use a concept called “free energy.” Free energy makes a protein unstable. The idea is that the amino acids will fold up in such a way as to minimize the amount of free energy. Richard Henderson:

Richard Henderson: “You can do structures by energy minimization up to about 60 or 70 amino acids. So David Baker’s group in Seattle in the USA has been particularly strong in doing that. But once you are up to proteins of 1,000 or so, it gets rapidly out of reach.”
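A short aside for readers of this transcript, not part of the broadcast: the idea of folding by energy minimization, and the reason it runs out of reach for long chains, can be sketched with a deliberately crude toy. The example below uses the textbook "HP" lattice model, in which each amino acid is simply H (water-avoiding) or P (water-loving), the chain is laid out on a square grid, and every pair of H residues that sit side by side without being chain neighbours lowers the energy by one unit. This is an illustrative assumption only. It is not the method used by any of the groups mentioned in the episode, and real energy functions are far more detailed.

```python
MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def energy(seq, coords):
    """Toy energy: -1 for each H-H pair adjacent on the grid but not along the chain."""
    index_at = {c: i for i, c in enumerate(coords)}
    total = 0
    for i, (x, y) in enumerate(coords):
        if seq[i] != "H":
            continue
        for dx, dy in MOVES:
            j = index_at.get((x + dx, y + dy))
            # j > i + 1 skips chain neighbours and counts each pair once.
            if j is not None and j > i + 1 and seq[j] == "H":
                total -= 1
    return total


def fold(seq):
    """Try every self-avoiding layout of the chain on the grid.

    Returns (lowest_energy, one_lowest_energy_layout, layouts_tried).
    """
    best_energy = None
    best_coords = None
    tried = 0

    def extend(coords, occupied):
        nonlocal best_energy, best_coords, tried
        if len(coords) == len(seq):
            tried += 1
            e = energy(seq, coords)
            if best_energy is None or e < best_energy:
                best_energy, best_coords = e, list(coords)
            return
        x, y = coords[-1]
        for dx, dy in MOVES:
            nxt = (x + dx, y + dy)
            if nxt not in occupied:
                coords.append(nxt)
                occupied.add(nxt)
                extend(coords, occupied)
                coords.pop()
                occupied.remove(nxt)

    extend([(0, 0)], {(0, 0)})
    return best_energy, best_coords, tried


if __name__ == "__main__":
    sequence = "HPPHPPHH"  # hypothetical toy sequence
    for length in range(2, len(sequence) + 1):
        e, _, n = fold(sequence[:length])
        print(f"{length} residues: {n} layouts tried, lowest energy {e}")
```

Even on this toy grid, the number of layouts to check roughly triples with each added residue, which gives a feel for why brute-force searching stops being practical long before a chain reaches the size of a real protein.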

So the technique works for figuring out a small section of a protein — perhaps a significant side chain. But for whole proteins with hundreds or thousands of amino acids, scientists use a different approach. Instead of asking the computer to figure out the structure from first principles, they train an algorithm using a database of known protein structures. This is what Google’s AI lab did recently, when their protein prediction algorithm, AlphaFold, outperformed all others at a competition in 2020.

Roger Kornberg: “The basis for it really comes from the long history of protein crystallography and its great success and the extraordinary number of structures that have been solved and deposited in the protein database. What is probably different about AlphaFold is the amount of AI expertise they could bring to bear in the corporate context, which goes so far beyond what any individual academic investigator can do, the power of the computation which they possess which is extraordinarily distributed over countless extraordinarily expensive computational centers around the globe. In a way, they contributed little beyond bringing the resources that they possess to bear on what was a well-studied and in retrospect solved problem.”

Kornberg certainly recognizes the potential of protein prediction programs like AlphaFold to predict the structures of a very large number of proteins, including ones that have not been solved before.

Roger Kornberg: “And if the number is great enough, then the impact upon life science and biology in particular is profound.”

Understanding the structure of proteins is enlightening and satisfying in itself, but it also enables us to design better drugs, as has been shown in the recent efforts to deal with Covid. Enzymes called proteases help viruses, including coronaviruses, to replicate. So they’ve been an obvious target for drugs.

Roger Kornberg: “The drugs directed against the protease have already been refined using X-ray diffraction, much improved by observing the drug associated with its target and then seeing how one might improve the structure of the drug to gain better effect upon the target.”

X-ray crystallography and cryo-EM have been so successful that Richard Henderson thinks we’re close to solving the structure of every protein.

Richard Henderson: “We basically have, experimentally, determined the structure of almost all the proteins — it may be half of them, it may be three-quarters of them. And if not the protein you’re interested in — for example, a drug targeting a virus — there’ll be some homologous structure.”

Will the combination of experimental techniques and AI be so successful that it will put structural biologists out of a job? Henderson remembers that years ago scientists had long lists of proteins whose structures they wanted to solve.

Richard Henderson: “I remember when we were younger, in meetings, everybody would be working on one protein, then they would say ‘What shall we work on next?’ And everybody would have their favorite list. I remember mine, we had ribosomes, actin, myosin, ATPases, redoxin, bacteriocin, all of these structures solved decades ago now. And so now if you ask people what structure, they’ll tell you the one they’re working on, but they don’t have a big list left anymore.”

Now that they’ve ticked most proteins off their lists, what will be left for structural biologists to do?

Richard Henderson: “Once you know the structure of everything and you’ve got a drug that’s an activator or inhibitor, after that you can always — this is obviously contentious discussion, but — after that you can always invent things. There is this trajectory from what you could call discovery science to invention science, where you patent something and develop a new compound, which could be a new protein.”

Henderson is talking about synthetic biology, a relatively new field in which scientists try to make new kinds of amino acids and proteins, to engineer the genetic code or to build simple cells from scratch.

There seems to be plenty of optimism among biologists.

Gone are the days where biomolecular scientists worked in isolation. Labs, teams, and nations are collaborating as never before to address pressing problems, from pollution to energy to pandemics.

These final words come from the Annual Review of Biophysics published in 2021.

With gene editing approaches, dazzling improvement in structural determination, and increasing reliability of computational predictions, scientists are well positioned to address many important problems in science, health, and industry.

If you enjoyed listening to this episode of the Knowable Podcast, do tell your friends, family and colleagues. We’d love to hear your feedback too. You can tweet us @KnowableMag, write to us by email, or leave us a review wherever you listen to podcasts. In this episode, I mentioned the Stanford synchrotron, which was the source of more intense beams of X-rays. If that piqued your interest, listen out for a future episode on the history of particle accelerators, including the Stanford synchrotron and the Large Hadron Collider at CERN.

In this episode you heard from Richard Henderson and Roger Kornberg. The episode featured quotes from the following articles published by Annual Reviews. They are: Sponsler and Dore, 1936; Astbury, 1939; Kraut, 1965; and Schlick et al., 2021. You can find links to those papers and others mentioned in this podcast in the show notes on our website.

This podcast was produced by Knowable Magazine, a nonprofit publication that seeks to make scientific knowledge accessible to all. Knowable Magazine is an editorially independent initiative from Annual Reviews. Go to our website to explore more smart science stories.

I’m Charlotte Stoddart and this has been Knowable.