Tuesday, September 27

DeepMind’s AlphaFold limitations detailed in MIT study • The Register

Analysis DeepMind’s AlphaFold model has predicted the structure of nearly every protein known to date, although its ability to help scientists discover new drugs remains unproven.

Proteins are complex molecules that organisms produce to carry out the biological functions necessary for life. Typically built from chains of amino acids drawn from a set of 20, they fold in myriad ways, with the final shape determining how they function and how they interact with other molecules.

Determining how a protein folds is not simple. Say you wanted to synthesize a protein or slightly modify how it works: you can’t adjust its amino acids or create a new chain of them and know for sure how the result will fold and function. This is where computers come in.

Advances in AI algorithms and training have led to the development of software, such as AlphaFold, that can accurately predict the 3D shapes of proteins given their amino acid combinations.

AlphaFold is impressive and has now predicted the structures of over 200 million proteins from their amino acid chains. The researchers hoped that building such a large database would allow scientists to develop treatments targeting specific proteins associated with diseases such as cancer or dementia. Finding such drugs may require knowing the physical structure of the protein, which is where programs like AlphaFold come in.

A study by MIT academics in America, however, shows how difficult the task is in practice. Essentially, the AI software is helpful at one stage of the process – structure prediction – but cannot help with other stages, such as modeling how drugs and proteins would physically interact.

“Breakthroughs like AlphaFold expand the possibilities of in silico (computer simulation) drug discovery efforts, but these developments need to be coupled with further advances in other aspects of modeling that are part of drug discovery efforts,” James Collins, lead author of the study published in Molecular Systems Biology and professor of bioengineering at MIT, said in a statement.

“Our study speaks to both the current capabilities and the current limitations of computational platforms for drug discovery.”

Collins and his colleagues used AlphaFold to simulate interactions between bacterial proteins and antibacterial compounds, a task known as molecular docking. The goal was to use molecular docking to rank candidate compounds by how strongly they bind to the target protein. A molecule that binds strongly to a protein is more likely to be an effective drug; it might be better at preventing the protein from performing a pathogenic function, such as driving tumor growth.

The team tested the ability of AlphaFold to model the interactions between 296 essential proteins from E. coli bacteria and 218 antibacterial compounds, including antibiotics such as tetracyclines. The resulting molecular docking simulations were not very accurate.

“Using these standard molecular docking simulations, we got an auROC value of about 0.5, which basically means you’re not doing any better than guessing at random,” Collins said.
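To make the 0.5 figure concrete: auROC can be read as the probability that a randomly chosen true binder gets a better score than a randomly chosen non-binder. The sketch below is not the study's code. It uses made-up labels and scores, a hypothetical `auroc` helper, and assumes a higher score means stronger predicted binding. It shows that scores unrelated to the truth land near 0.5, while scores carrying some signal land well above it.

```python
import random


def auroc(scores, labels):
    """Fraction of (binder, non-binder) pairs in which the binder scores
    higher; ties count as half. Assumes higher score = stronger binding."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one binder and one non-binder")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up data: 200 true interactions among 2,000 protein-compound pairs.
rng = random.Random(0)
labels = [1] * 200 + [0] * 1800

# Scores with no relation to the truth: auROC should sit near 0.5.
random_scores = [rng.random() for _ in labels]

# Scores that are nudged upward for true binders: auROC should be higher.
informative_scores = [rng.random() + 0.5 * y for y in labels]

print("uninformative:", round(auroc(random_scores, labels), 2))
print("informative:  ", round(auroc(informative_scores, labels), 2))
```

A perfect ranking scores 1.0 and a perfectly inverted one scores 0.0, so a value of about 0.5 means the ranking carries essentially no information about which compounds actually bind.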

Not the smartest AI on the block

Other machine learning models were more accurate than AlphaFold for some simulations, according to Felix Wong, co-author of the paper and postdoctoral researcher at MIT.

“Machine learning models not only learn the shapes, but also the chemical and physical properties of known interactions, then use that information to reassess docking predictions,” he said. “We found that if you were to filter interactions using these additional patterns, you can get a higher ratio of true positives to false positives.”
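The filtering idea Wong describes can be illustrated with a toy example. Everything here is hypothetical: the pair names, the `rescoring_prob` values, the 0.6 cutoff, and the ground truth are invented for illustration and do not come from the study. The sketch only shows how discarding docking hits that a second model finds implausible can raise the ratio of true positives to false positives.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    pair: str              # hypothetical protein-compound label
    docking_hit: bool      # docking alone predicts an interaction
    rescoring_prob: float  # hypothetical second-model confidence, 0 to 1
    is_true: bool          # made-up ground truth for the example


def tp_fp_ratio(preds):
    """Ratio of true positives to false positives among predicted hits."""
    tp = sum(1 for p in preds if p.is_true)
    fp = sum(1 for p in preds if not p.is_true)
    return tp / fp if fp else float("inf")


predictions = [
    Prediction("protA-cmpd1", True, 0.91, True),
    Prediction("protA-cmpd2", True, 0.15, False),
    Prediction("protB-cmpd1", True, 0.72, True),
    Prediction("protB-cmpd3", True, 0.40, False),
    Prediction("protC-cmpd2", True, 0.66, False),
    Prediction("protC-cmpd4", True, 0.30, False),
]

# Step 1: keep everything docking calls a hit.
docking_only = [p for p in predictions if p.docking_hit]

# Step 2: keep only the hits the second model also finds plausible.
THRESHOLD = 0.6  # arbitrary cutoff chosen for this example
rescored = [p for p in docking_only if p.rescoring_prob >= THRESHOLD]

print("docking only:", tp_fp_ratio(docking_only))
print("after filter:", tp_fp_ratio(rescored))
```

In this invented data the filter removes three false positives and no true positives. A stricter cutoff would typically discard some real interactions as well, so in practice the threshold is a trade-off.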

Derek Lowe, a longtime drug discovery chemist and science writer, told The Register he wasn’t surprised by the results given that AlphaFold wasn’t really trained on molecular docking simulations. “Docking small molecules in a given protein structure is really a different problem than determining that protein structure in the first place,” he said.

Modeling these types of chemical interactions is an unsolved problem, and no algorithm is perfect. Even when scientists have a good model of a protein, its shape can change in hard-to-predict ways when it interacts with a potential drug candidate.

“Virtual screening has never yet reached the level of ‘works every time’ – sometimes it provides useful information and sometimes it doesn’t, and you never know in advance which regime you’re working in. On top of that, there’s the way that different docking software will give you different answers, and for a given target, one of them might give significantly more useful answers than another – but again, you don’t know in advance which of those it will be,” Lowe said.

“Even with perfect protein structures, some of them will be better suited to a docking and scoring approach than others, and the AlphaFold structures, while impressive, aren’t perfect either. But for me, it’s not so much about AlphaFold as it is about docking technology.”

AlphaFold may prove useful for other parts of the drug discovery pipeline, where comparing protein structures obtained via different methods against model predictions is valuable.

“The biggest problems in drug discovery are the ones that contribute to our approximately 85% failure rate in the clinic. And those are picking the right targets and getting early warnings about toxicity. Neither of these is helped a lot by knowledge of protein structures,” added Lowe. ®