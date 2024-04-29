



3D model of the CRISPR-Cas9 gene editing complex from Streptococcus pyogenes. Credit: Indigo Molecular Images/Science Photo Library

In the never-ending quest to discover previously unknown CRISPR gene-editing systems, researchers have scoured microbes in everything from hot springs and peat bogs to poop and even yogurt. Now, thanks to advances in generative artificial intelligence (AI), they might be able to design these systems at the push of a button.

This week, researchers described how they used a generative AI tool called a protein language model, a neural network trained on millions of protein sequences, to design CRISPR gene-editing proteins. They were then able to show that some of these systems work as expected in the laboratory 1.

And in February, another team announced that it had developed a model trained on microbial genomes and used it to design new CRISPR systems. Each system consists of a DNA- or RNA-cutting enzyme and an RNA molecule that tells the molecular scissors where to cut 2.

That really just scratches the surface, says Ali Madani, a machine-learning scientist and chief executive of Profluent, a biotechnology company based in Berkeley, California. The work shows that machine-learning models can be used to design these complex systems, he says. In an April 22 preprint 1 on bioRxiv.org, which has not been peer-reviewed, Madani's team reported the first successful editing of the human genome with proteins designed entirely with machine learning.

Alan Wong, a synthetic biologist at the University of Hong Kong whose team is using machine learning to optimize CRISPR 3, says that naturally occurring gene-editing systems are limited in the range of sequences they can target and the types of changes they can make. For some applications, it can therefore be difficult to find a suitable CRISPR. Leveraging AI to expand the repertoire of editors could help, he says.

Trained on genomes

Whereas chatbots such as ChatGPT are designed to process language after being trained on existing text, the CRISPR-designing AIs are instead trained on vast amounts of biological data in the form of protein or genome sequences. The purpose of this pre-training step is to give the models insight into naturally occurring genetic sequences, such as which amino acids tend to occur together. This information can then be applied to tasks such as creating entirely new sequences.

Madani's team previously devised new antimicrobial proteins using a protein language model it developed, called ProGen. To devise new CRISPRs, his team retrained the latest version of ProGen on examples of millions of diverse CRISPR systems, which bacteria and other single-celled microorganisms called archaea use to defend against viruses.

Because CRISPR gene-editing systems include not only proteins but also RNA molecules that specify their targets, Madani's team developed a separate AI model to design these guide RNAs.

The team then used the neural network to design millions of new CRISPR protein sequences belonging to dozens of different families of such proteins that occur in nature. To see whether the AI-designed CRISPRs were bona fide gene editors, Madani's team synthesized DNA sequences corresponding to more than 200 protein designs belonging to the CRISPR-Cas9 system, which is currently widely used in labs. When the researchers inserted these sequences, the instructions for a Cas9 protein and a guide RNA, into human cells, many of the gene editors were able to precisely cut their intended targets within the genome.

The most promising Cas9 protein, a molecule the team named OpenCRISPR-1, was as efficient at cutting target DNA sequences as a widely used bacterial CRISPR-Cas9 enzyme, and far less likely to cut in the wrong places. The researchers also used OpenCRISPR-1's design to create a base editor, a precision gene-editing tool that changes individual DNA letters, and found that it was as efficient as other base-editing systems and less likely to cause errors.

Another team, led by Brian Hie, a computational biologist at Stanford University in California, and Patrick Hsu, a bioengineer at the Arc Institute in Palo Alto, California, used an AI model that can generate both protein and RNA sequences. Their model, called Evo, was trained on 80,000 genomes from bacteria, archaea and other microorganisms, totalling 300 billion DNA letters. Hie and Hsu's team has not yet tested its designs in the lab. However, the predicted structures of some of the CRISPR-Cas9 systems it designed resemble those of natural proteins. The work is described in a preprint 2 posted to bioRxiv.org and has not been peer-reviewed.

Precision medicine

This is amazing, says Noelia Ferruz Capapey, a computational biologist at the Molecular Biology Institute of Barcelona in Spain. She is impressed by the fact that researchers can use the OpenCRISPR-1 molecule without restrictions, unlike some patented gene-editing tools. The ProGen2 model and the atlas of CRISPR sequences used to fine-tune it are also freely available.

Madani says he is hopeful that AI-designed gene-editing tools could be better suited to medical applications than existing CRISPRs. He adds that Profluent is looking to partner with companies developing gene-editing therapies to test AI-generated CRISPRs. It really requires precision and a bespoke design, and that cannot be achieved just by copying and pasting from naturally occurring CRISPR systems, he says.

