Precise definitions in science are overrated

Enhancers: We study them, catalog them, design them, but we don't know what they are.

Sep 27, 2024

Scientists spend a lot of time studying things that they can’t define precisely, and in whose reality they may not believe. This approach to learning about the world has been incredibly successful.

Take genes, for example. The word “gene” was coined in 1909 by the Danish botanist Wilhelm Johannsen, who was looking for a convenient word to capture the newly revived Mendelian idea that traits are inherited independently of one another. Johannsen wanted a new term that was “completely free from any hypotheses.” For decades afterwards, many biologists doubted that the term referred to a real, physical entity. In 1923, biologist Edmund Wilson wrote that genes were still widely viewed as “a convenient fiction or algebraic symbolism,” despite the fact that they were a subject of intense study by the rapidly developing field of genetics. Pursuing this convenient fiction resulted in some of the biggest advances in the history of biology.

Another subject of study that biologists have a hard time defining is the “enhancer.” These crucial regulatory DNA elements take up more genomic real estate than protein-coding genes, and most disease-associated genetic variants seem to occur in them, but enhancers are poorly defined and their exact physical characteristics are ambiguous. Enhancers control the expression of genes in time and space, and are thus the critical DNA elements that make it possible for organisms to be made up of thousands of different cell types all under the control of a single genome. Enhancers were first described in 1981 with an operational definition: a segment of DNA with the ability to enhance gene transcription regardless of its position or orientation relative to the gene. The definition was (and still is) based on the very artificial context of reporter assay measurements, in which a segment of DNA is taken out of its native position in the chromosome. It’s like defining a rock as a thing that is hard, heavy, and that falls when you drop it. What we really want to know about enhancers is: what are they, physically?

A super-enhancer in contact with a bursting gene. Figure 1D from Du et al., Cell 2024, 187:331-334.e17, under CC BY-NC 4.0.

The regulatory function of enhancers has been a major focus of genomic research for decades, but biologists can’t tell you exactly what an enhancer is, or even whether it is one particular type of thing. In a 2020 article, three scientists wrote about making a complete catalog of human enhancers. But they noted that there are conflicting definitions of what an enhancer is:

In the current parlance of the field, the term ‘enhancer’ is often used interchangeably to refer to: first, DNA sequence elements that meet the original Banerji et al.⁷ (1981) definition —that is, enhancing transcription in a reporter assay; second, DNA sequence elements that bear biochemical marks associated with enhancer activity; or third, endogenous, distally located DNA sequence elements that serve to enhance the transcription of a cis-located gene, in vivo and in their native genomic context. But these definitions are not equivalent. There may be sequences that activate transcription in the context of a reporter assay but do not meaningfully do so in vivo. There may also be sequences that bear enhancer-associated biochemical marks but do not actually function as enhancers in vivo. Finally, there may be in vivo enhancers that are non-canonically marked or that have contextual dependencies that are not maintained in a reporter assay.

How can you completely catalog something that you can’t define? It may seem nonsensical to try, but this is how scientific progress is made.

In a review published earlier this year, MIT biologists Jin Yang and Anders Hansen wrote that:

Despite recent efforts to define enhancers by their physical properties, such as occupancy of transcription factors or co-activators, presence of certain histone post-translational modifications (PTMs), depletion of nucleosomes or transcription of enhancer RNAs¹¹, unambiguous criteria are yet to be established.

Physically, we don’t know what enhancers are or how they work. They are segments of DNA that are physically bound by specific DNA-binding transcription factor proteins, but plenty of transcription factor-bound DNA segments don’t function as enhancers. We don’t understand what differentiates the transcription factor-bound DNA segments that are functional from those that are not. Enhancers often act on their target genes over long chromosomal distances, sometimes skipping over closer genes to reach more distant ones, but we don’t know what enables them to do that. We don’t know what physically happens when enhancers activate their target genes. Part of the mystery is that enhancers don’t seem to get close enough to their targets to physically contact them via well-defined protein complexes. Instead, contact seems to be mediated by large, amorphous “condensates”, another type of biological entity that isn’t well defined. (See the figure above.) We don’t know if enhancers have strict boundaries or if their biochemical activity just gradually attenuates as you move away from the most densely bound region. We don’t know if the term ‘enhancer’ even refers to a single type of thing, or whether biophysically distinct types of entities have arisen over the course of evolution. Different enhancers are bound by different sets of co-factors, but we have no idea if this means that their mechanisms of action are in fact different. And speaking of evolution, since we don’t understand how they work physically, we don’t understand how enhancers evolve. Their function is critical, but their sequence is much, much less well conserved than that of protein-coding genes.

Despite all of the ambiguity, scientists are counting, cataloging, testing, and designing enhancers. We’re still making a lot of progress working on genomic entities that we can’t precisely define. But, just as scientists eventually figured out the physical makeup of a gene, at some point the field is going to need to work out the physical constituents of enhancers. It’s not an easy task. As Marc Halfon wrote in 2020, “This fundamental question of how to define an enhancer is not one that is readily resolved. In the meantime, however, it creates significant opportunities for ambiguities, contradictions, and interpretive confusion in the literature.” The confusion is so bad that Halfon wondered whether the whole field is just a flimsy pile of inconsistent results:

Given the biases in the enhancer literature I have illustrated here, and the circularity in enhancer feature characterization, the reader might wonder if our current understanding of enhancers is just a house of cards, propped up by the thinnest evidence and ready to topple with the next clear experiment.

But “such a view would be a mistake,” Halfon goes on to say. Despite the confusion, the field has produced some important studies that have uncovered critical, but not unambiguous, features of enhancers.

In other words, scientists still make progress when they study things that they can’t yet define very well. But until we can answer some of the key unanswered questions about enhancers, it will remain difficult to predict their effects in health and disease — such as when a mutation in an enhancer increases the risk of cancer and when it doesn’t.
