The research marks a major milestone in genomics, paving the way for further exploration and discovery across a wide range of species.
A group of researchers from over 20 nations have generated the most comprehensive primate sequencing dataset to date.
More than 800 genomes from 233 species around the world were sequenced, representing nearly half of all primate species and all 16 families of primates on the planet.
The unprecedented study offers insights into long-standing questions about how humans evolved from their primate ancestors. It also promises to help treat human disease, advance the development of personalised medicine and boost conservation efforts for the world's most endangered species.
Sequencing the genome refers to the process of determining the precise order of the genetic “code” that makes up an organism's DNA. The code is made up of different letters (called nucleotides) that form a long sequence.
Sequencing the genome allows scientists to identify genes, mutations, and potential genetic factors related to traits, diseases, and other biological characteristics.
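To make the idea concrete, here is a minimal Python sketch of how a mutation shows up once two sequences have been read: the letters are compared position by position and any mismatch is reported. The function name and the two short sequences are invented for illustration, and the sketch assumes the sequences are already aligned and of equal length, which real genomic pipelines cannot take for granted.

```python
def find_differences(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (position, reference_base, sample_base) for every mismatch.

    Assumes both sequences are already aligned and have the same length.
    Positions are 1-based.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (position, ref_base, sample_base)
        for position, (ref_base, sample_base) in enumerate(zip(reference, sample), start=1)
        if ref_base != sample_base
    ]


# Toy sequences invented for illustration, not real genomic data.
reference = "ATGGCCTAAGT"
sample = "ATGGCTTAAGA"
print(find_differences(reference, sample))  # [(6, 'C', 'T'), (11, 'T', 'A')]
```

Each reported tuple is a candidate variant: a position where the sample carries a different letter from the reference.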
The team - led by Illumina, one of the global leaders in DNA sequencing technology, along with the Institute of Evolutionary Biology of Pompeu Fabra University in Barcelona, Spain, and Baylor College of Medicine - came together with the goal to “catalogue primate genetic diversity as extensively as possible,” said Lukas Kuderna, who led the data generation of all the primate genome sequences at Illumina.
“And the reason for that is twofold,” he explains.
Primate genome to safeguard endangered species
Planet Earth is experiencing a profound biodiversity crisis, and the high-coverage whole-genome data from 233 primate species offers powerful insights into how the loss of biodiversity is reflected at the genomic scale.
For conservation purposes, getting a genomic picture of surviving populations can be key “to turn around the trajectory of the species,” says Kuderna.
A species' decline towards extinction is reflected at the genomic scale because, when the number of individuals in a population decreases, some genetic variants are lost by chance alone. In other words, the remaining individuals become increasingly genetically similar.
“And that's a very good indication that the genomic health of this species is not in very good shape, even though we don't have direct observations of them in nature,” Kuderna adds.
However, “if you have sufficient genetic variation within this endangered population, the initial impact of a diminished population size might be less severe [than] for a species in which you have less variability”.
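The loss of variation in shrinking populations can be illustrated with a standard textbook model rather than the study's own analysis. In an idealised population of constant size N that mates at random, the expected share of genetic variation (heterozygosity) remaining after t generations is the starting value multiplied by (1 - 1/(2N)) raised to the power t. The sketch below applies that formula; the function name and the population sizes are hypothetical and chosen only to show the trend.

```python
def expected_heterozygosity(h0: float, population_size: int, generations: int) -> float:
    """Expected heterozygosity after random drift in an idealised population.

    Applies H_t = H_0 * (1 - 1 / (2N)) ** t for a diploid population of
    constant size N. Real populations violate these assumptions in many ways.
    """
    if population_size < 1:
        raise ValueError("population_size must be at least 1")
    if generations < 0:
        raise ValueError("generations must not be negative")
    return h0 * (1.0 - 1.0 / (2 * population_size)) ** generations


# Hypothetical population sizes, all followed for 100 generations.
for size in (50, 500, 5000):
    remaining = expected_heterozygosity(1.0, size, 100)
    print(f"N={size:>5}: {remaining:.1%} of the initial variation expected to remain")
```

Under these assumptions the smallest population loses variation far faster than the largest one, which is the pattern conservation geneticists look for in the genomes of endangered species.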
Primate genome to flag human disease
Another breakthrough the analysis brought to light is related to human genetics.
“We ourselves are primates, and most of these primate species are relatively closely related to us,” Kuderna told Euronews Next:
“The biological processes that are happening between us and other primate species are very, very similar”.
Scientists have already sequenced “millions or tens of millions of human genomes” to date. However, the understanding of most of the genetic variants found in them, as well as their potential association with disease, is “very, very limited,” says Kuderna.
“Humans are actually one of the least diverse primate species on earth. So the probability that we observe which of these mutations caused diseases by chance in humans is extremely small,” he notes.
The lack of diversity in humans means that when clinicians sequence the human genome and come across new variants, it is difficult to determine whether the mutation is rare because it's potentially harmful or rare because they haven't had the opportunity to observe it before, he said.
Why are humans not as genetically diverse?
The most significant contributor to genetic diversity is the long-term demographic history of a species, explains Kuderna. And for humans, this history is defined by a series of “bottlenecks”.
In this context, a bottleneck refers to a significant reduction in the size of a population, often caused by a drastic event such as a natural disaster, habitat loss, a disease outbreak, or human activity. During a bottleneck event, a large portion of the population is lost, leaving a smaller group of individuals to survive and reproduce, and with it a less diverse genetic makeup.
One of the most critical recent bottlenecks in human history was “when the species migrated out of Africa (which is where humans first emerged) to start inhabiting other parts of the world some 50,000 years ago. As only a small number of individuals migrated, they carried only a subset of the diversity that was then [passed on] to subsequent generations,” says Kuderna.
Sequencing primates' genomes is a powerful way to overcome the human diversity barrier, and one that scientists have finally been able to pursue. “Humans may not be very diverse on a genomic level, but non-human primates are,” says Kuderna.
By analysing the much more diverse non-human primates' genomes at a massive scale, scientists can rule out genetic mutations that are harmless, which helps them identify the dangerous ones in humans.
“We can classify millions of human mutations as benign by observing them as being common among other primate species, and then use this information to train PrimateAI-3D to predict the pathogenicity of all remaining mutations,” explains Kuderna, adding that they are “very certain” that the benign mutations in non-human primates are also benign in humans.
After sequencing the genome from nearly half of all primate species, the team then fed their findings to an AI algorithm.
“You can think of it like ChatGPT, but on a genomic level in which you introduce as a query the variant in the human genome that you want to better understand. And it essentially gives you an interpretation of how harmful this variant will be,” says Kuderna.
“We can use the results of this experiment to predict which genetic mutations are potentially harmful,” he said.
Accurately discerning disease-causing from benign mutations and interpreting genetic variants on a genome-wide scale would constitute “a meaningful initial step towards realising the potential of personalised genomic medicine,” wrote the authors in a series of papers in the scientific journals Science and Science Advances.
“Our study addresses one of the key challenges in the variant interpretation field, namely, the lack of sufficient labelled data to effectively train large machine learning models”.
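The labelling step Kuderna describes, marking human variants as benign when they are common in other primates, can be sketched in a few lines of Python. This is an illustration of the idea only, not the PrimateAI-3D pipeline: the function name, the variant labels, the frequencies and the 1 per cent cut-off for "common" are all assumptions made for the example.

```python
def split_by_primate_evidence(
    human_variants: list[str],
    primate_allele_frequency: dict[str, float],
    common_threshold: float = 0.01,  # hypothetical cut-off for "common"
) -> tuple[list[str], list[str]]:
    """Split variants into (likely_benign, unresolved).

    A variant is treated as likely benign when it is seen at or above
    common_threshold in non-human primates. Everything else stays
    unresolved and would be left for a predictive model to score.
    """
    likely_benign: list[str] = []
    unresolved: list[str] = []
    for variant in human_variants:
        # Variants never observed in primates default to a frequency of 0.
        frequency = primate_allele_frequency.get(variant, 0.0)
        if frequency >= common_threshold:
            likely_benign.append(variant)
        else:
            unresolved.append(variant)
    return likely_benign, unresolved


# Made-up variant labels and frequencies, for illustration only.
human_variants = ["variant_A", "variant_B", "variant_C"]
primate_allele_frequency = {"variant_A": 0.12, "variant_C": 0.001}

benign, unresolved = split_by_primate_evidence(human_variants, primate_allele_frequency)
print("likely benign:", benign)
print("left for the model to score:", unresolved)
```

In this toy run only variant_A counts as likely benign. The benign set plays the role of the labelled training data the authors say the field has lacked, while the unresolved variants are the ones a model would then be asked to interpret.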
Why did it take so long to sequence a large sample of non-human primates' genomes?
Simply because the technology has only now advanced sufficiently, says Kuderna.
“Genome sequencing has now become so accessible that it's possible to do these kinds of studies on a scale that has been absolutely unimaginable even a couple of years ago”.
The journey towards accessible sequencing has taken decades. Ten years ago, it cost researchers about €10,000 to sequence a human genome. A few years ago, the figure fell to about €1,000. Today, it sits at around $600 (€557.89), according to Illumina.
And “there's so much more” that scientists will be able to do with “these kinds of datasets,” says Kuderna, adding that “with no doubt,” they will be generated for a vast number of species in the near future.
The study has also produced a catalogue of differences between humans and other primate species: in other words, a unique set of mutations that have defined human beings over the course of evolution.
“This is not only a project to understand diseases in this context, but it's also one that aims to make a much clearer picture of what defines us as humans within the broader context of primates,” he notes.