Date: 2024-03-08



New deep learning algorithms can predict the structure of proteins bound to a variety of other molecules, such as drugs, fluorescent dyes, and metals. The work allows researchers to design proteins that bind to specific molecules, potentially useful in areas such as enzyme catalysis.

Because proteins are made up of thousands of atoms, it is computationally infeasible to solve their structures using energy-minimization methods such as density functional theory. Deep learning algorithms such as AlphaFold (developed by Google DeepMind) and RoseTTAFold (developed by the Institute for Protein Design at the University of Washington) instead examine known structures in international protein data banks to learn how to solve unknown ones. This approach has proven very useful. These deep learning methods all work in terms of probability rather than energy, says computational biologist David Baker, director of the Institute for Protein Design.

One complicating factor is that proteins in nature do not exist as isolated chains of amino acids. Some proteins do not fold unless they bind to metals or other cofactors, which may be small molecules, says Nicholas Polizzi of Harvard Medical School in Massachusetts. AlphaFold and RoseTTAFold often account for ligands only implicitly, because they are trained on proteins that were crystallized and had their structures solved with ligands bound. However, this prevents researchers from understanding the effects of ligands or designing proteins that bind to specific ligands.

In the new study, Baker and colleagues developed a modified version of RoseTTAFold, called RoseTTAFold All-Atom, that combines the amino acid chain representation of proteins with atom-based representations of small molecule ligands. They took data on proteins bound to small molecules, protein-metal complexes, and proteins with covalently modified amino acids from the Protein Data Bank and used it to train the algorithm to develop a general predictive capability. The model predicted recently solved structures that were not included in training with a high level of accuracy. The researchers also used it to design and synthesize proteins that bind to three common ligands: the enzyme cofactor heme, the heart disease drug digoxigenin, and the light-harvesting molecule bilin.
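The idea of mixing a residue-level description of the protein with an atom-level description of the ligand can be illustrated with a toy sketch. This is not the RoseTTAFold All-Atom code: the `Token` class, the `build_tokens` and `bond_matrix` functions, and the example inputs are all hypothetical names chosen for illustration, and the only assumption is that each residue and each ligand atom becomes one entry in a shared list, with bonds recorded in a single matrix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """One entry in the combined input (hypothetical illustration)."""
    kind: str   # "residue" for a protein position, "atom" for a ligand atom
    label: str  # one-letter amino acid code or element symbol


def build_tokens(sequence, ligand_atoms):
    """Place residue tokens first, followed by one token per ligand atom."""
    residues = [Token("residue", aa) for aa in sequence]
    atoms = [Token("atom", element) for element in ligand_atoms]
    return residues + atoms


def bond_matrix(n_residues, n_atoms, ligand_bonds):
    """Build a symmetric 0/1 matrix of bonds over all tokens.

    Consecutive residues are linked by peptide bonds; ligand bonds are
    given as pairs of indices local to the ligand and shifted past the
    residues.
    """
    n = n_residues + n_atoms
    matrix = [[0] * n for _ in range(n)]
    for i in range(n_residues - 1):
        matrix[i][i + 1] = matrix[i + 1][i] = 1
    for a, b in ligand_bonds:
        i, j = n_residues + a, n_residues + b
        matrix[i][j] = matrix[j][i] = 1
    return matrix


# Toy input: a three-residue peptide and the heavy atoms of ethanol (C-C-O).
sequence = "GAH"
ligand_atoms = ["C", "C", "O"]
ligand_bonds = [(0, 1), (1, 2)]

tokens = build_tokens(sequence, ligand_atoms)
bonds = bond_matrix(len(sequence), len(ligand_atoms), ligand_bonds)
```

In this sketch the protein and ligand are not bonded to each other, so the matrix has no entries linking the two groups; a covalently modified residue would add one.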

The researchers are looking at several applications, including designing small molecule drugs and sensors. Baker says he's very interested in using this to design catalysts, allowing him to model the transition states of chemical reactions and design proteins to stabilize them.

Polizzi, who was not involved in the project, described it as a milestone. However, he cautions that one of the hurdles to improving the method lies in obtaining more data to feed the algorithm. He points out that billions of protein sequences are available, which is why so much progress has been made in protein structure prediction, but comparable data do not currently exist for small molecules.

Polizzi says the latest version of RoseTTAFold appears to be well suited to predicting the structure of proteins with bound molecules when it is already known that the molecule binds to the protein. But he points out that many people will want to test whether a molecule binds to a protein at all. "I personally plan to try it," he says, "but I'm not going to blindly trust the results."

