DeepMind just uncovered the 3D structure of almost every protein known to science

A human protein structure Copyright AFP PHOTO / EMBL-EBI / KAREN ARNOTT
By Luke Hurst

Grasping the complex shapes of proteins is key to understanding life and developing new medicines.


Researchers have deciphered the three-dimensional structures of almost every protein known to science, which could lead to major advances in a wide range of research areas including the treatment of diseases, sustainability and food insecurity.

The findings were announced on Thursday by artificial intelligence (AI) technology company DeepMind, which said the achievement could help quickly find new medicines and treatments and even “unlock the mysteries of how life itself works”.

Together with the European Molecular Biology Laboratory’s European Bioinformatics Institute (EMBL-EBI), DeepMind has been working to uncover one of science’s mysteries - the 3D structure of proteins and how they interact with each other.

Using AI, the researchers had previously built a database of nearly one million protein structures.

Now, they have increased that 200-fold, predicting the structures of more than 200 million proteins, which covers almost every organism on Earth that has had its genome sequenced.

They have also made the database publicly accessible, which they say could “dramatically increase our understanding of biology”.

Uncovering protein structures

Proteins are the building blocks of life, underpinning biological processes in every living thing.

There are around 200 million known proteins, all with different structures - and it is those structures that the organisations have been working to decipher.

Scientists have been trying to uncover these structures for decades, but there have been great strides forward in this work since DeepMind and EMBL-EBI first launched their database last year.

DeepMind - a subsidiary of Alphabet - said it started working on the challenge in 2016, creating an AI system called AlphaFold, which learned to predict the shape of proteins by being shown the structures of 100,000 known proteins.

Those structures took years and a heavy investment of resources to work out. Now with AlphaFold, DeepMind says it can uncover the structure of proteins in mere seconds.

“AlphaFold is the singular and momentous advance in life science that demonstrates the power of AI,” said Eric Topol, Founder and Director of the Scripps Research Translational Institute.

“AlphaFold has already accelerated and enabled massive discoveries, including cracking the structure of the nuclear pore complex. And with this new addition of structures illuminating nearly the entire protein universe, we can expect more biological mysteries to be solved each day”.

What impact has AlphaFold had?

The release of the database, with its first million structures, has already had an impact.

More than 1,000 scientific papers have cited the database already, and more than half a million researchers have accessed it in just over a year.

EMBL-EBI says the database has also already had an impact in areas such as fighting plastic pollution, gaining insight into Parkinson’s disease, boosting the health of honey bees, understanding how ice forms, and exploring human evolution.

“We released AlphaFold in the hopes that other teams could learn from and build on the advances we made, and it has been exciting to see that happen so quickly,” said John Jumper, Research Scientist and AlphaFold Lead at DeepMind.

“Many other AI research organisations have now entered the field and are building on AlphaFold’s advances to create further breakthroughs. This is truly a new era in structural biology, and AI-based methods are going to drive incredible progress,” he added.


Sameer Velankar, team leader at EMBL-EBI’s Protein Data Bank in Europe, said AlphaFold has “sent ripples through the molecular biology community”.

“In the past year alone, there have been over a thousand scientific articles on a broad range of research topics which use AlphaFold structures; I have never seen anything like it…And this is just the impact of one million predictions; imagine the impact of having over 200 million protein structure predictions openly accessible in the AlphaFold Database,” he said.
