Seeking the origin story of de novo genes
In June 2000, around the same time that Li Zhao was ending her junior year of high school in Liaocheng, a city in northeast China, the Human Genome Project unveiled a working draft of the human genome. China, which had joined the project a year earlier, had contributed by sequencing one percent of it. “Not a lot,” says Zhao, head of Rockefeller’s Laboratory of Evolutionary Genetics and Genomics, but “enough to make the news.”
It was enough to set the course of her career, as well.
“I was always very excited about trying to understand why individuals are different, why species are different,” Zhao recollects from her office overlooking the East River. Before learning about the Human Genome Project, however, she had never given much thought to how genes influence the vast range of observable traits, or phenotypes, seen in nature. Today, she thinks about such questions almost constantly—and in looking for answers, she has changed our understanding of how genes arise, spread, and evolve, shaping individuals and entire species.
Zhao’s particular objects of fascination are so-called de novo genes, which emerge from previously silent or non-coding stretches of DNA, like Athena springing fully formed from the forehead of Zeus. They were once considered as rare as unicorns, but over the past decade, Zhao and her colleagues have established that de novo genes are actually quite common, identifying more than 500 genetic newbies in Drosophila alone.
Her studies have shown that these unique genes represent an important source of evolutionary innovation, and that they may hold the key to solving a number of enduring mysteries, including why some similar genes appear in wildly different species, and how genetic novelty helps organisms adapt to their local environments.
Out of nowhere
As a child, Zhao’s principal contact with nature came through visits to her grandparents’ small family farm on the outskirts of Liaocheng. She vividly recalls the livestock, the fireflies that flitted through the evening sky, and the cicadas that she caught as they emerged buzzing from their underground burrows. “The sound from those insects is basically the background noise of my memories,” says Zhao, who wondered even then how nature could produce such wildly varied creatures.
That curiosity eventually led Zhao to Inner Mongolia University in northern China, where she began her lifelong study of the molecular processes that drive biological diversity. She also continued to examine that diversity firsthand, at one point taking a 1,000-mile field trip to collect plant and insect samples from the deserts and forests of northern Inner Mongolia. She didn’t begin dealing in new genes, however, until she moved to Kunming to pursue a doctorate at the Chinese Academy of Sciences.
Evolution depends in no small part on the emergence of novel genes, and on the spread or removal of those genes through natural selection. Yet when Zhao entered graduate school in 2006, biologists believed that new genes invariably emerged from pre-existing ones. This might happen in three ways: duplication, whereby an extra copy of a gene evolves into something novel; chimerism, whereby fragments of different genes are stitched together into new Franken-genes; or horizontal gene transfer, whereby genetic material leaps directly from one organism to another without the messy work of reproduction. The idea that wholly new genes could come from DNA sequences that had never before produced proteins was barely countenanced.
But when Zhao began her doctoral studies, a researcher named David Begun at the University of California, Davis, identified several de novo genes in the testes of Drosophila melanogaster, the common fruit fly.
“That wowed me,” says Zhao. “De novo gene origination had been thought impossible.”
She joined Begun’s lab as a postdoctoral fellow in 2011 and immediately began pushing the boundaries of what scientists knew about these suddenly not-so-impossible genes, probing their earliest stages of development and mapping how they evolved and spread across populations. Then as now, Zhao detected the faint signals of de novo genes and patterns of gene expression across different groups of fruit flies using RNA and DNA analysis, population genetics, data science, and animal experiments.
“We need large-scale data to generate hypotheses and identify candidate genes to study, but if you want to understand how genes contribute to biology, you need to go to the lab bench,” Zhao explains.
It was a joyful time: Not only was Zhao developing the methods she would later use to identify large numbers of de novo genes, but she was able to experience the natural diversity of California with her partner, Nicolas Svetec, a fellow postdoc who is now a senior research associate at Rockefeller. Together, the two hiked the coastal zones and redwood forests of Northern California, marveling at the results of the evolutionary processes they sought to understand in the lab. (Today she shares her delight in the natural world with their three-year-old son, whom they take to visit farms on Long Island.)
Gene hunting
Since Zhao came to Rockefeller in 2017, her hunt for de novo genes has resulted in hundreds of hits, a remarkable result given how much subtle sleuthing such detective work requires. Young genes are easier to spot than old ones, which may have spread so widely that their novelty is no longer evident. But even the youngest hardly advertise themselves.
Most de novo genes in Drosophila come from the fertile breeding grounds of the testis and are related to reproduction—a fact Zhao attributes to the importance of sexual selection among insects. But she is currently exploring whether similar numbers of de novo genes might be lurking in other tissues and responding to similar selective pressures—work that has led to the discovery of at least a few de novo genes in the brain and heart tissues of humans. While the importance of those findings remains to be seen, the presence of de novo genes throughout our own bodies could eventually shed light on questions of human health and disease.
In time, Zhao also aims to compile the first comprehensive description of de novo gene evolution within a species, namely Drosophila. Tracing the emergence and development of de novo genes among different populations of fruit flies, which first emerged in Africa and spread across the globe with human beings, could help illuminate how adaptations are customized to specific environments, a process that is still not well understood, says Zhao. Might the same shared set of de novo genes help different populations of Drosophila adapt to new environments? Or does every population of fruit flies evolve its own unique collection of novel genes to suit its ecological niche?
“By comparing populations, we should be able to understand if certain genes are specific to certain populations, or are important for all populations,” Zhao says.
Zhao has also broadened her investigations to include the regulatory mechanisms that govern de novo genes and the novel proteins they produce. Biologists have long puzzled, for example, over how certain immune-related genes came to exist in organisms as distantly related as plants and humans. Zhao suspects they may have emerged independently as de novo genes, and she hopes to learn more about their origins and evolution by studying the structure and function of the proteins they manufacture.
Yet for all of the progress Zhao has made in revealing the prevalence of de novo genes and understanding the role they play in nature, the very earliest stage of their birth remains shrouded in mystery. Still, there are endless avenues for further investigation. “Gene birth and death are a continuous process in evolution,” Zhao says. “In any given population, and at any given time, there will be genes that are being born, and there will be genes that are being lost.”
And just as cosmologists look to the night sky to infer what conditions were like just after the Big Bang, Zhao uses her data to infer a possible origin story for de novo genes. It begins with the random joining of a non-coding sequence to a specialized bit of DNA that controls gene expression, and ends with a brand-new gene capable of producing a novel protein.
Confirming that narrative would represent a major step towards understanding how evolutionary innovation occurs at the molecular level. And like all of Zhao’s work, it would help explain how life on Earth came to be such a many-splendored thing.