Erwin Schrödinger and life at low copy number
Schrödinger's biological paradox, and how new technologies to measure protein binding to individual DNA molecules might address it.

One of the most profound questions that biology tries to answer is, how does the organized behavior of a living cell arise from the physical interactions of non-living components? The physicist Erwin Schrödinger posed this question in a particularly compelling way in a set of public lectures given in 1943, which were published the next year as the book What is Life? Writing about a decade before molecular biology emerged as a distinct field, Schrödinger basically said that physicists can’t explain the picture of life developed by geneticists:
Today, thanks to the ingenious work of biologists, mainly of geneticists, during the last thirty or forty years, enough is known about the actual material structure of organisms and about their functioning to state that, and to tell precisely why, present-day physics and chemistry could not possibly account for what happens in space and time within a living organism. What is Life, p. 4, emphasis added.
Life shouldn’t work but it does
Why couldn’t “present-day” (1943) physics and chemistry account for what happens within a living organism? Because the motion of individual atoms is dominated by random thermal motion, and this “does not allow the events that happen between a small number of atoms to enrol [sic] themselves according to any recognizable laws (p. 10).” The orderly behavior of matter, argues Schrödinger, is described by the physics of statistical mechanics and only holds for large collections of atoms. Thermodynamics, for example, gives a very accurate description of the pressure, volume, diffusion, and heat capacity of a gas at large n, but not for ~1000 atoms. And yet, “incredibly small groups of atoms, much too small to display exact statistical laws, do play a dominating role in the very orderly and lawful events within a living organism (p. 20).”
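To put a rough number on the difference between large n and ~1000 atoms, here is a minimal sketch that assumes the standard result that relative statistical fluctuations for n independent particles scale as 1/√n. The function name is hypothetical and the particle counts are illustrative choices, not figures from the book.

```python
from math import sqrt

def relative_fluctuation(n):
    """Typical relative size of statistical fluctuations for n independent particles.

    Assumes the standard 1/sqrt(n) scaling; real systems with correlations
    between particles can deviate from this.
    """
    return 1 / sqrt(n)

# Illustrative particle counts: a tiny cluster, a larger one, and roughly a mole.
for n in (1_000, 1_000_000, 6.022e23):
    print(f"n = {n:.3g}: relative fluctuation ~ {relative_fluctuation(n):.1e}")
```

At a thousand particles the fluctuations are on the order of a few percent, while at the scale of a mole they are negligible, which is the gap Schrödinger was pointing at.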
Statistical physics has come a long way since 1943, but the primary challenge to reconciling physics with the workings of a cell is still this paradox that Schrödinger identified: cells exhibit remarkably organized behavior that arises from the interactions of an astonishingly small number of molecules. The promoters that regulate the expression of critical genes typically exist in just two copies per cell, and those promoters are bound by a handful of transcription factor proteins that activate gene transcription. The results for the organism are quite reliable. For example, 99.9% of all babies are born with five fingers on each hand, and this depends in part on the proper functioning of the ZRS enhancer, a regulatory DNA element that is present in only two copies in each cell and bound by a relatively small number of transcription factors. And not only do we need to consider that each cell operates with only two copies of each gene and its associated regulatory elements, but also that all of us start out life as only a single cell. Life works amazingly well at low copy number.
I won’t belabor the point, but I don’t think we have answered Schrödinger’s paradox (the biological one, not the cat one) in a satisfactory way. To be absolutely clear, I am not claiming that there is something happening in the cell that violates the laws of physics. The fact that we can assemble, in a test tube, operational cellular subsystems from synthetic components shows that organized behavior is an intrinsic feature of the interactions between these components. But we don’t have the quantitative models that would, I think, have satisfied Schrödinger. Sure, molecular biologists have identified millions of different types of functional molecules at work in the cell, and we are steadily mapping how they interact with each other. But from a physical perspective, I don’t think we have a great account of how a heterogeneous collection of proteins, nucleic acids, and other molecules organizes itself into a coherent operation. Part of the answer, not recognized by Schrödinger but clearly appreciated by the 1950s, is specificity: specific, non-covalent interactions among biological molecules. But we don’t have accurate, generally applicable physical models that apply to the kinetics and thermodynamics of a cell. The growing interest in the physics of biomolecular condensates is one area of the current frontier. (Here is another classic on this general topic, “Life at low Reynolds number” from 1976 (PDF).)
Life at low copy number
This is a long preamble to some recent work describing a set of exciting new technologies that enable us to more directly tackle the big question that Schrödinger posed: single-fiber assays. If a critical feature of life is regulating genes, which are present at two copies per cell, then we need to understand the binding events that are happening at those individual DNA molecules. So much cell biological data comes from bulk assays, which measure averages over enormous numbers of cells. Single-fiber assays let biologists zoom in and see how many factors are bound on individual copies of regulatory DNA, but they do this at scale using sequencing technology, rather than by imaging individual cells.
One of the papers I’m most excited about came out in November, from the labs of Lacra Bintu and Will Greenleaf at Stanford. (Free bioRxiv version here.) They used one version of this technology to measure the binding states of individual regulatory DNA elements. In a nutshell, the method works by treating DNA with a methylating enzyme that marks every site it can access. If a transcription factor is bound to a site, the methylating enzyme can’t get there, and thus the site remains unmarked. By looking at the distribution of marked and unmarked sites on individual DNA molecules, you get binding statistics. For example, you can ask, if I have four DNA binding sites for a regulatory factor, how often are all four sites bound, versus fewer or no sites? How about if I increase the number of binding sites to six or eight?
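The counting step can be sketched in a few lines. This is only an illustration under simplifying assumptions, not the pipeline from the paper: it assumes each sequenced molecule has already been reduced to one methylated/unmethylated call per binding site, and the function name and the reads are made up.

```python
from collections import Counter

def occupancy_distribution(molecules):
    """Return the fraction of molecules with k bound sites, for each observed k.

    Each molecule is a sequence of booleans, one per binding site:
      True  = site was methylated (accessible, so presumably unbound)
      False = site was protected from the enzyme (presumably bound)
    """
    total = len(molecules)
    if total == 0:
        return {}
    # Count protected (unmarked) sites on each molecule, then tally the counts.
    counts = Counter(sum(1 for marked in mol if not marked) for mol in molecules)
    return {k: counts[k] / total for k in sorted(counts)}

# Made-up reads for a four-site element (illustrative only, not real data).
reads = [
    (True, True, True, True),      # no sites bound
    (False, True, True, True),     # one site bound
    (False, False, True, True),    # two sites bound
    (False, False, False, False),  # all four sites bound
    (True, True, True, True),      # no sites bound
]
dist = occupancy_distribution(reads)
```

Here `dist` maps the number of bound sites to the fraction of molecules in that state, which is the kind of binding statistic described above. Real data would also need to handle missing calls and the enzyme's own marking efficiency.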

What the experiment gives you is the frequencies of the different occupancy states of a DNA molecule. There is a direct analogy here to statistical thermodynamics: just like molecules in a gas occupy different energy states as described by the Boltzmann distribution, DNA molecules exist in different protein-bound or -unbound states with probabilities that can be described by a biophysical model. But unlike a gas, in this case we’re observing the microstates of the system directly.
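As a toy version of such a biophysical model, the sketch below assumes the simplest possible case: n identical sites that bind independently, each with the same effective binding free energy. The function name and the energy parameter are hypothetical, and this is not the model fitted in the paper, just the textbook Boltzmann calculation for that simplified case.

```python
from math import comb, exp

def occupancy_probabilities(n_sites, energy_kt):
    """P(k sites bound), k = 0..n_sites, for identical, independent sites.

    energy_kt is a hypothetical effective binding free energy per site in
    units of kT (negative favors binding); it is assumed to fold in the
    concentration of the factor.
    """
    w = exp(-energy_kt)       # Boltzmann weight of a single bound site
    z = (1 + w) ** n_sites    # partition function summed over all 2**n microstates
    # comb(n, k) microstates share the same weight w**k.
    return [comb(n_sites, k) * w**k / z for k in range(n_sites + 1)]

# Four sites with zero net binding energy: bound and unbound are equally likely.
probs = occupancy_probabilities(4, 0.0)
```

Comparing predicted probabilities like `probs` against the measured frequencies of each occupancy state is one way such a model could be fit; cooperativity or competition would show up as systematic deviations from the independent-site prediction.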
With data like this, the Bintu and Greenleaf labs were able to parameterize biophysical models that describe how protein factors compete and cooperate for binding to regulatory DNA. Using this approach, they drew some conclusions about the relationship between transcription factor occupancy and gene expression and how binding of individual factors to individual sites works (binding seems to be independent). I won’t walk through the full paper here, but I do want to highlight some other interesting papers using related technologies. Andrew Stergachis, at the University of Washington, pioneered the technique called Fiber-seq with Stirling Churchman (at Harvard) and John Stamatoyannopoulos (University of Washington). Stergachis, who came and spoke to our department late last year, has a new preprint out describing DAF-seq, a more versatile form of Fiber-seq that lets you focus in on specific DNA elements of interest.
The DAF-seq preprint by the Stergachis lab has a number of very cool technical improvements over prior versions of this assay, part of which involved developing a very efficient new enzyme to mark the DNA. They also perform a binding analysis to show that the E-box of the NAPA promoter (element 1 in the figure below) nucleates the binding for elements 2 and 3. In other words, protein binding to these sites is not only cooperative, but directional, with one site driving the occupancy of the other sites.
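One simple way to look for this kind of directionality in single-molecule data is to compare conditional occupancies in both directions. The sketch below is an assumption-laden illustration, not the analysis used in the preprint: the function name is hypothetical, the reads are invented, and here True means a site is bound.

```python
def conditional_occupancy(molecules, target, given):
    """Fraction of molecules with site `target` bound among those with `given` bound.

    Each molecule is a sequence of booleans, True = site bound (protected).
    Returns None if no molecule has `given` bound.
    """
    subset = [mol for mol in molecules if mol[given]]
    if not subset:
        return None
    return sum(1 for mol in subset if mol[target]) / len(subset)

# Invented reads over three elements (indices 0, 1, 2); not data from the preprint.
reads = [
    (True, True, True),
    (True, True, False),
    (True, False, False),
    (True, False, False),
    (False, False, False),
    (False, False, False),
]
# How often is the first element bound when the second is, and vice versa?
p_first_given_second = conditional_occupancy(reads, target=0, given=1)
p_second_given_first = conditional_occupancy(reads, target=1, given=0)
```

In this invented example the second element is never bound without the first, while the first is often bound alone. An asymmetry like that is what one would expect if one site nucleates binding at the others.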

The idea of using DNA-modifying enzymes to mark unbound sites of DNA has been around for a while (see here, here, here, and here for a few of the important papers). As the technology becomes democratized, we’re going to see more work directly addressing the mechanisms of gene regulation, which get to the heart of Schrödinger’s biological paradox: how a handful of binding events at a single DNA molecule give rise to reproducible, orderly behavior in the face of disorderly, random thermal motion.