AI plus gene editing promises to accelerate biotechnology

During her 2018 Nobel Prize in Chemistry lecture, Frances Arnold said, “Today we can for all practical purposes read, write, and edit any sequence of DNA, but we cannot compose it.” That is no longer true.

Since then, science and technology have advanced so much that artificial intelligence has learned to compose DNA, and with genetically modified bacteria, scientists are on their way to designing and creating customized proteins.

The hope is that, with the design talents of AI and the engineering capabilities of gene editing, scientists can modify bacteria to act as mini-factories producing new proteins that reduce greenhouse gases, digest plastics or act as species-specific pesticides.

As a chemistry professor and computational chemist who studies molecular sciences and environmental chemistry, I believe advances in AI and gene editing make this a realistic possibility.

Gene sequencing – reading the recipes of life

All living things contain genetic materials – DNA and RNA – that provide the hereditary information needed to reproduce themselves and make proteins. Proteins make up 75% of the dry weight of humans. They form muscles, enzymes, hormones, blood, hair and cartilage. Understanding proteins means understanding a large part of biology. The order of nucleotide bases in DNA, or RNA in some viruses, encodes this information, and genomic sequencing technologies identify the order of these bases.

The Human Genome Project was an international effort that mapped the entire human genome from 1990 to 2003. Thanks to rapidly improving technologies, it took seven years to sequence the first 1% of the genome and another seven years for the remaining 99%. By 2003, scientists had the complete sequence of the 3 billion nucleotide base pairs that code for 20,000 to 25,000 genes in the human genome.

However, understanding the functions of most proteins and correcting their malfunctions remained a challenge.

AI learns proteins

The shape of any protein is crucial to its function and is determined by the sequence of its amino acids, which in turn is determined by the nucleotide sequence of the gene. Misfolded proteins have the wrong shape and can cause illnesses such as neurodegenerative diseases, cystic fibrosis and type 2 diabetes. Understanding these diseases and developing treatments requires knowledge of protein shapes.
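To make the link between a gene's nucleotide sequence and a protein's amino acid sequence concrete, here is a minimal Python sketch that reads DNA three bases at a time and looks up each triplet in the standard genetic code. The function name `translate` and the short example sequences are illustrative assumptions, not taken from any real gene, and the sketch ignores details such as introns, reading-frame selection and non-standard codes.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G base order.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence into one-letter amino acid codes.

    Reads from the first base in steps of three and stops at the first
    stop codon. Raises KeyError on characters other than A, C, G, T.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical 12-base sequence: ATG GCC AAA TAA
original = translate("ATGGCCAAATAA")
# Same sequence with one base changed: ATG GCC GAA TAA
mutated = translate("ATGGCCGAATAA")
print(original, mutated)
```

Changing a single base swaps one amino acid (K for E in this toy case), which is the kind of small change in a gene that can alter how the resulting protein folds.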

Before 2016, the only way to determine the shape of a protein was through X-ray crystallography, a laboratory technique that uses the diffraction of X-rays by single crystals to determine the precise three-dimensional arrangement of atoms in a molecule. At the time, the structures of about 200,000 proteins had been determined by crystallography, at a cost of billions of dollars.

AlphaFold, a machine learning program, used these crystal structures as a training set to predict the shapes of proteins from their nucleotide sequences. In less than a year, the program calculated the protein structures of all 214 million genes that had been sequenced and published. The protein structures predicted by AlphaFold have all been released in a freely available database.

To effectively tackle non-infectious diseases and design new drugs, scientists need more detailed knowledge of how proteins, especially enzymes, bind small molecules. Enzymes are protein catalysts that enable and regulate biochemical reactions.

AlphaFold3, released on May 8, 2024, can predict protein shapes and the locations where small molecules can bind to these proteins. In rational drug design, drugs are designed to bind proteins involved in a pathway related to the disease being treated. The small molecule drugs bind to the protein binding site and modulate its activity, influencing the disease pathway. By being able to predict protein binding sites, AlphaFold3 will expand researchers’ drug development capabilities.

AI + CRISPR = composing new proteins

Around 2015, the development of CRISPR technology revolutionized gene editing. CRISPR can be used to find a specific part of a gene, change or delete it, make the cell express more or less of its gene product, or even insert a completely foreign gene in its place.
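The editing operations listed above (find, change, delete, insert) can be pictured with a toy Python sketch that treats a stretch of DNA as a plain string. This is only an analogy for what the edits accomplish, not a model of how CRISPR works in a cell; the function names `find_site` and `replace_site` and all sequences are hypothetical.

```python
def find_site(genome, target):
    """Return the index of the first occurrence of target, or -1 if absent."""
    return genome.find(target)


def replace_site(genome, target, replacement=""):
    """Replace the first occurrence of target with replacement.

    An empty replacement deletes the target; a longer replacement
    stands in for inserting a foreign sequence in its place.
    """
    start = genome.find(target)
    if start == -1:
        raise ValueError("target sequence not found")
    return genome[:start] + replacement + genome[start + len(target):]


genome = "ATGGCCAAATAA"  # made-up sequence for illustration

print(find_site(genome, "GCC"))               # locate the site
print(replace_site(genome, "GCC", "GAA"))     # change it
print(replace_site(genome, "GCC"))            # delete it
print(replace_site(genome, "GCC", "TTTGGG"))  # insert a foreign sequence in its place
```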

In 2020, Jennifer Doudna and Emmanuelle Charpentier received the Nobel Prize in Chemistry “for the development of a method [CRISPR] for genome editing.” With CRISPR, gene editing that once took years and was species-specific, expensive and labor-intensive can now be done in days and at a fraction of the cost.

AI and genetic engineering are developing rapidly. What was once complicated and expensive is now routine. Looking ahead, the dream is of tailor-made proteins designed and produced by a combination of machine learning and CRISPR-modified bacteria. AI would design the proteins, and bacteria altered using CRISPR would produce the proteins. Enzymes produced in this way could potentially inhale carbon dioxide and methane while exhaling organic raw materials, or break down plastics into replacements for concrete.

I believe these ambitions are not unrealistic, considering that genetically modified organisms already account for 2% of the US economy in the agricultural and pharmaceutical sectors.

Two groups have made functioning enzymes from scratch, each designed by a different AI system. David Baker’s Institute for Protein Design at the University of Washington came up with a new deep learning-based protein design strategy called “family-wide hallucination,” which they used to create a unique light-emitting enzyme. Meanwhile, biotech startup Profluent has used an AI trained on the sum of all CRISPR-Cas knowledge to design new functioning genome editors.

If AI can learn to create new CRISPR systems, as well as bioluminescent enzymes that work and have never been seen on Earth, there is hope that the combination of CRISPR with AI could be used to design other new custom enzymes. Although the CRISPR-AI combination is still in its infancy, once it matures it will likely be very useful and could even help the world tackle climate change.

However, it is important to remember that the more powerful a technology is, the greater the risks it poses. Furthermore, humans have not been very successful at manipulating nature due to the complexity and interconnectedness of natural systems, which often leads to unintended consequences.

This article is republished from The Conversation, an independent nonprofit organization providing facts and analysis to help you understand our complex world.

It was written by: Marc Zimmer, Connecticut College.

Marc Zimmer does not work for, consult with, own stock in, or receive funding from any company or organization that would benefit from this article, and has disclosed no relevant affiliations beyond their academic appointment.