Revolutionizing Protein Engineering: Unleashing the Power of Artificial Intelligence

TL;DR:

  • Proteins are essential in molecular biology, carrying out diverse functions in the body.
  • Artificial intelligence (AI) diffusion models, like ProteinSGM, are now being used to design proteins.
  • ProteinSGM is a score-based generative model that designs new proteins by emulating characteristics of original training data.
  • The Kim Lab at the University of Toronto is at the forefront of protein engineering research.
  • ProteinSGM has the potential to design novel proteins with specific binding targets, including therapeutic antibodies.
  • AI-designed proteins structurally resemble their natural counterparts and offer greater diversity in structures.
  • ProteinSGM aims to overcome challenges in modeling antibody structures, enabling the design of therapeutic antibodies.
  • Antibody-based therapies can be used to treat various conditions, and ProteinSGM shows promise in enhancing their effectiveness.
  • AI diffusion models like ProteinSGM have a significantly higher success rate in designing functional antibodies compared to previous methods.

Main AI News:

In the realm of molecular biology, proteins hold immense significance. Encoded by DNA, these complex macromolecules are composed of amino acid sequences that ultimately fold into intricate three-dimensional structures. These structures enable proteins to perform a vast array of functions, ranging from metabolizing the food we consume to triggering immune responses and facilitating gene synthesis.

The remarkable diversity observed in proteins can be attributed to evolution, as natural selection optimizes their function and structure over time. In 2023, however, a new player is making waves in the field of protein generation: artificial intelligence (AI) diffusion models. One well-known system built on this approach is DALL-E 2, which generates lifelike images from textual descriptions. Now, the same class of models is being applied to protein design.

At the forefront of this research is the Kim Lab at the University of Toronto, which is developing ProteinSGM, a score-based generative model (SGM) capable of producing entirely new proteins. The Varsity spoke with Michael Lee, the developer of ProteinSGM, and his supervisor, Philip Kim, a professor at U of T’s Donnelly Centre for Cellular and Biomolecular Research. Kim explained how Lee devised a method for designing protein structures that was inspired by AI-driven image generation.

ProteinSGM is a score-based generative model. In this family of models, the “score” is the gradient of the log-probability density of the data: a direction that points from a given sample toward more plausible ones. The model learns to estimate this score from training samples, the real-life examples it is meant to emulate, and then follows it to produce new samples that share the defining characteristics of the training data. In the case of ProteinSGM, the training data comprises image-like representations of protein structures that are fed into the system.
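
A toy sketch may help make the idea of a score concrete. It is not ProteinSGM's code: it assumes a one-dimensional Gaussian, whose score (the gradient of its log-density) has a simple closed form, and it uses Langevin dynamics, one common way of sampling with a score function. The names `gaussian_score` and `langevin_sample` are hypothetical and chosen only for this illustration.

```python
import math
import random


def gaussian_score(x, mean=0.0, std=1.0):
    """Score of a 1D Gaussian: d/dx log p(x) = -(x - mean) / std**2."""
    return -(x - mean) / (std ** 2)


def langevin_sample(score_fn, steps=1000, step_size=0.01, seed=0):
    """Draw one approximate sample by following the score plus random noise."""
    rng = random.Random(seed)
    x = rng.uniform(-5.0, 5.0)  # arbitrary starting point
    for _ in range(steps):
        noise = rng.gauss(0.0, 1.0)
        # Langevin update: drift along the score, then add scaled Gaussian noise.
        x = x + step_size * score_fn(x) + math.sqrt(2.0 * step_size) * noise
    return x


if __name__ == "__main__":
    # Assumed toy target: a Gaussian with mean 3.0 and standard deviation 0.5.
    def target_score(x):
        return gaussian_score(x, 3.0, 0.5)

    samples = [langevin_sample(target_score, steps=500, seed=i) for i in range(200)]
    print("sample mean:", sum(samples) / len(samples))
```

In a real score-based model the closed-form `gaussian_score` would be replaced by a neural network trained to approximate the score of the data, and the samples would be high-dimensional structure representations rather than single numbers.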

Prior to the advent of SGMs, protein engineering progressed at a glacial pace, with incremental advances that pale in comparison to the strides made in the past five years. Today, the Kim Lab is fully focused on protein design, building on recent breakthroughs from the machine learning community.

While Kim is driven by the vision of moving AI-designed therapeutic antibodies from theoretical models into clinical use, he says the scientist in him is also drawn to the fundamentals. Modeling the protein structures found in nature in order to understand them better remains a central goal for him.

Lee’s interest in AI-powered computational models for generative biology led him to molecular design. Early in his PhD, he began studying AI diffusion models, which learn to generate new data resembling their training data. These models share a core principle: training data is “corrupted” with noise, and the model learns to reverse that corruption and so recreate data similar to the original set. Lee realized that the models used for image generation could therefore be applied to protein design, with image-like representations of protein structures taking the place of pictures. After training, the model can generate novel structures akin to those it has seen.
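
The corrupt-then-restore principle can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the Kim Lab's implementation: a short list of numbers stands in for a protein representation, the corruption is plain Gaussian noise at a single scale `sigma`, and the helper names (`corrupt`, `denoising_target`, `denoise`) are hypothetical.

```python
import random


def corrupt(x0, sigma, rng):
    """Forward step: add Gaussian noise of scale sigma to every coordinate."""
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    x_noisy = [xi + sigma * ni for xi, ni in zip(x0, noise)]
    return x_noisy, noise


def denoising_target(x0, x_noisy, sigma):
    """Training target in denoising score matching: (x0 - x_noisy) / sigma**2.

    A model is trained to predict this quantity from x_noisy alone.
    """
    return [(a - b) / sigma ** 2 for a, b in zip(x0, x_noisy)]


def denoise(x_noisy, score, sigma):
    """Undo the corruption given a score estimate: x_noisy + sigma**2 * score."""
    return [b + sigma ** 2 * s for b, s in zip(x_noisy, score)]


if __name__ == "__main__":
    rng = random.Random(0)
    clean = [1.0, -2.0, 0.5]  # toy stand-in for a structure representation
    noisy, _ = corrupt(clean, sigma=0.8, rng=rng)
    # Here the exact target is used; a trained network would supply an estimate.
    target = denoising_target(clean, noisy, sigma=0.8)
    print("noisy:   ", noisy)
    print("restored:", denoise(noisy, target, sigma=0.8))
```

Because the sketch uses the exact target, the restored values match the clean input up to rounding. A trained model only has access to the noisy input, so its reconstruction is approximate, and practical diffusion models repeat the process over many noise levels rather than a single one.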

The significance of ProteinSGM lies in its ability to design entirely novel proteins with precise binding targets—molecules that elicit specific reactions upon interacting with proteins. Lee emphasizes that AI-designed proteins possess structural resemblances to their natural counterparts in nearly every aspect, rendering them ideal for functional purposes. Furthermore, AI-generated amino acid sequences obviate the need for the selective pressures imposed by natural selection to develop functional structures, granting ProteinSGM the capacity to generate a greater diversity of structures.

Conventionally, the immune system’s white blood cells produce specific antibodies—a class of proteins that effectively attack and neutralize invading pathogens and foreign substances. White blood cells undergo various mutations to generate a wide range of antibodies, with those exhibiting high affinities for particular foreign substances being selected for proliferation.

However, modeling the intricate structures of antibodies, which are vital for effective binding to foreign substances, has proven to be a formidable challenge for AI. Kim and Lee hope that ProteinSGM can help solve this problem. Success would allow researchers to design and model therapeutic antibodies computationally, circumventing the time-consuming conventional approach of raising antibodies within animals’ immune systems and then harvesting them.

Antibody-based therapies have demonstrated immense potential in treating a diverse range of conditions, including cancer, autoimmune disorders, and infectious diseases. However, the efficacy of these therapies hinges on binding dynamics and sequence diversity. Herein lies the niche that ProteinSGM aims to occupy: experimental validation reveals that diffusion models like ProteinSGM currently boast a success rate in designing functional antibodies that surpasses previous computational and screening methods by a factor of 50. Lee envisions further advancements that will enhance this success rate with time.

Conclusion:

The integration of artificial intelligence into protein engineering through models like ProteinSGM represents a major advance for the field. It opens up new possibilities for designing novel proteins with specific binding targets, particularly therapeutic antibodies. The ability to generate proteins that structurally resemble their natural counterparts, coupled with the potential for greater structural diversity, holds considerable commercial promise. It could improve antibody-based therapies, offering more effective treatment options for conditions ranging from cancer to infectious diseases. The higher success rate reported for AI diffusion models like ProteinSGM marks a leap forward in computational protein design and paves the way for further innovation.
