Proteins are large, very complex, and naturally occurring molecules that can be found in all living organisms. These unique substances, which consist of amino acids linked together by peptide bonds to form long chains, can have different functions and properties.
The specific order in which different amino acids are arranged to form a protein ultimately determines the 3D structure of the protein, its physicochemical properties, and its molecular function. While scientists have been studying proteins for decades, designing proteins that cause specific chemical reactions has so far proved extremely challenging.
Researchers from Biomatter Designs, Vilnius University in Lithuania and Chalmers University of Technology in Sweden have recently developed ProteinGAN, a generative adversarial network (GAN) that can process and “learn”
“Proteins are long sequences of amino acids that make processes happen in all living systems, including humans,” Aleksej Zelezniak, an associate professor at Chalmers University of Technology who led the study, told Phys.org. “Proteins are often used in our daily lives and are included in countless products, from washing powders to cancer and coronavirus therapies. They are made up of 20 amino acids, which are arranged in different sequences, and their order determines the function of the protein.”
Creating functional protein sequences is very challenging, as even a slight change in a sequence can make the protein non-functional. Non-functional proteins can have harmful side effects, such as causing humans or animals to develop cancer or other diseases.
“If someone wants to make proteins tailored to human needs, they must get the order of the amino acids right, and given the astronomical number of possibilities for making these proteins, that is not a minor task,” said Zelezniak. “Inspired by the latest developments in AI, especially realistic photo and video generation, we were tempted to ask whether current AI technology is ready to produce the most complex molecules known to humans: proteins.”
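To give a sense of the "astronomical number" mentioned above, the count of possible sequences can be worked out directly: with 20 amino acids available at each position, a chain of length L has 20^L possible sequences. The short sketch below is only an illustration; the function name and the example lengths are assumptions chosen for this note, not figures from the study.

```python
def sequence_space_size(length, alphabet_size=20):
    """Number of distinct sequences of a given length over an alphabet."""
    return alphabet_size ** length


# Illustrative chain lengths (assumed for this example, not taken from the study).
for length in (10, 100, 300):
    n = sequence_space_size(length)
    # len(str(n)) - 1 is the order of magnitude of n in base 10.
    print(f"length {length}: on the order of 10^{len(str(n)) - 1} sequences")
```

Even at a length of 100, the count far exceeds anything that could be tested exhaustively in a laboratory, which is why a model that proposes promising candidates is attractive.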
ProteinGAN, the model developed by Zelezniak and his colleagues, is based on a well-known machine learning approach known as adversarial training. Adversarial learning can be seen as a game “played” by two or more artificial neural networks. The first of these networks, known as the ‘generator’, creates a specific type of data (e.g., an image, a text or, in the case of ProteinGAN, a protein sequence). The second network, known as the ‘discriminator’, attempts to distinguish between the artificial data created by the generator (such as a generated protein sequence) and authentic, real data.
The generator then uses the feedback provided by the discriminator (i.e., the characteristics that allow it to tell generated data from real data) to produce new data. The generator never processes or analyzes the real data itself. Its training therefore relies solely on the results of the analyses carried out by the discriminator.
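The generator and discriminator game described above can be sketched in a few lines. The example below is a deliberately tiny, one-dimensional toy and not the ProteinGAN architecture: the "real data" are numbers drawn from a normal distribution, the generator is a linear function of random noise, and the discriminator is a single logistic unit. All names (`train_toy_gan`, `sigmoid`) and settings (learning rate, step count, the non-saturating generator objective) are assumptions made for illustration.

```python
import math
import random


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def train_toy_gan(steps=2000, lr=0.01, seed=0):
    """Alternate discriminator and generator updates on 1-D toy data."""
    rng = random.Random(seed)
    a, b = 1.0, 0.0   # generator: fake = a * z + b, with z ~ N(0, 1)
    w, c = 0.0, 0.0   # discriminator: D(x) = sigmoid(w * x + c)
    for _ in range(steps):
        real = rng.gauss(4.0, 1.0)   # stand-in for an authentic sample
        z = rng.gauss(0.0, 1.0)
        fake = a * z + b

        # Discriminator step: gradient ascent on log D(real) + log(1 - D(fake)).
        d_real = sigmoid(w * real + c)
        d_fake = sigmoid(w * fake + c)
        w += lr * ((1.0 - d_real) * real - d_fake * fake)
        c += lr * ((1.0 - d_real) - d_fake)

        # Generator step: gradient ascent on log D(fake).
        # The generator never reads `real`; its only training signal is
        # the gradient that flows back through the discriminator.
        d_fake = sigmoid(w * fake + c)
        grad_fake = (1.0 - d_fake) * w   # d/d(fake) of log D(fake)
        a += lr * grad_fake * z
        b += lr * grad_fake
    return a, b, w, c


if __name__ == "__main__":
    a, b, w, c = train_toy_gan()
    print(f"generator: fake = {a:.3f} * z + {b:.3f}")
    print(f"discriminator: D(x) = sigmoid({w:.3f} * x + {c:.3f})")
```

Toy setups like this one can oscillate rather than settle, so the printed parameters should not be read as a converged solution. The point is only the structure of the loop: the discriminator learns from both real and generated samples, while the generator learns exclusively from the discriminator's feedback.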
“By repeating this process iteratively, both networks get better at what they do, until the generated sequences cannot be distinguished from the real ones,” Zelezniak said. “Using the artificial intelligence tool we developed, we were able to generate functional proteins that were active but do not exist in nature or have not yet been discovered.”
In the initial experiments conducted by the researchers, ProteinGAN generated new and highly diverse protein sequences with physical properties similar to those of natural protein sequences. Using malate dehydrogenase (MDH) as a template enzyme, Zelezniak and colleagues showed that many of the sequences generated by ProteinGAN are soluble and exhibit MDH catalytic activity, meaning that they may have interesting applications in medical and research settings. In the future, ProteinGAN could be used to discover new protein sequences with different properties that may prove valuable for a range of technological and scientific applications.
“Our research laboratory focuses on AI-based technologies for synthetic biology applications,” Zelezniak said. “We are currently working to address emerging issues such as plastic pollution, and I believe that AI will help build better organisms suited to this particular problem.”
Expanding functional protein sequence spaces using generative adversarial networks. Nature Machine Intelligence (2021). DOI: 10.1038/s42256-021-00310-5.
© 2021 Science X Network
Citation: ProteinGAN: A generative adversarial network that generates functional protein sequences (2021, April 2), retrieved on April 4, 2021 from https://phys.org/news/2021-04-proteingan-adversarial-network-functional-protein.html
This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without written permission. The content is provided for information purposes only.