Unlocking the Secrets of Proteins With Cutting-Edge AI

DeepGO-SE, an AI tool created by KAUST researchers, revolutionizes the prediction of unknown protein functions using logical entailment and advanced language models, showing significant potential for scientific research and biotechnological applications. Credit: SciTechDaily.com

KAUST’s DeepGO-SE AI tool excels in predicting functions of unknown proteins, offering promising applications in biotechnology and research.

A new artificial intelligence (AI) tool that draws logical inferences about the function of unknown proteins promises to help scientists unravel the inner workings of the cell.

Developed by KAUST bioinformatics researcher Maxat Kulmanov and colleagues, the tool outperforms existing analytical methods for forecasting protein functions and is even able to analyze proteins with no clear matches in existing datasets.

Advancements in Protein Function Analysis

The model, termed DeepGO-SE, takes advantage of large language models similar to those used by generative AI tools such as ChatGPT. It then employs logical entailment to draw meaningful conclusions about molecular functions based on general biological principles about the way proteins work.

In essence, the approach lets computers reason logically about outcomes: it constructs models of part of the world (in this case, protein function) and infers the most plausible scenario based on common sense and reasoning about what should happen in these world models.
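To make the idea of entailment over world models concrete, here is a minimal Python sketch. It is a toy illustration only, not the DeepGO-SE implementation: the mini-ontology, the scores, and the function names are all hypothetical, and taking the minimum across models is just one simple way to approximate "true in every model." The published method may aggregate differently.

```python
# Toy illustration only: hypothetical names and scores, not DeepGO-SE's code.

# Hypothetical mini-ontology: each function term maps to its parent terms.
PARENTS = {
    "kinase activity": ["catalytic activity"],
    "catalytic activity": ["molecular function"],
    "binding": ["molecular function"],
    "molecular function": [],
}


def ancestors(term):
    """Return every term reachable by following parent links."""
    seen = set()
    stack = list(PARENTS[term])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def close_under_axioms(scores):
    """Make one world model respect the ontology: an ancestor term
    scores at least as high as any of its descendants."""
    closed = dict(scores)
    for term, score in scores.items():
        for anc in ancestors(term):
            closed[anc] = max(closed.get(anc, 0.0), score)
    return closed


def approx_entailment(models):
    """A statement is entailed only if it holds in every model, so this
    sketch takes the minimum score across the sampled models."""
    closed = [close_under_axioms(m) for m in models]
    return {t: min(c.get(t, 0.0) for c in closed) for t in PARENTS}


# Three hypothetical world models scoring the same uncharacterized protein.
models = [
    {"kinase activity": 0.9, "binding": 0.2},
    {"kinase activity": 0.8, "binding": 0.7},
    {"kinase activity": 0.85, "binding": 0.1},
]
result = approx_entailment(models)
for term, score in result.items():
    print(f"{term}: {score}")
```

In this sketch, "kinase activity" keeps a high score because all three models agree on it, and that support carries over to its parent terms through the ontology. "Binding" is supported by only one model, so it ends up with a low score.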

A new artificial intelligence (AI) tool that draws logical inferences about the function of unknown proteins promises to help scientists unravel the inner workings of the cell. Credit: © 2024 KAUST; Ivan Gromicho

Collaborative Research and Applications

“This method has many applications,” says Robert Hoehndorf, head of the KAUST Bio-Ontology Research Group, who supervised this research, “especially when it is necessary to reason over data and hypotheses generated by a neural network or another machine learning model.”

Kulmanov and Hoehndorf collaborated with KAUST’s Stefan Arold, as well as researchers at the Swiss Institute of Bioinformatics, to assess the model’s ability to decipher the functions of proteins whose roles in the body are unknown.

The tool successfully used the amino acid sequence of a poorly understood protein, together with its known interactions with other proteins, to precisely predict its molecular functions. The model was so accurate that DeepGO-SE ranked in the top 20 of more than 1,600 algorithms in an international competition of function prediction tools.

Impact and Future Directions

The KAUST team is now using the tool to investigate the functions of enigmatic proteins discovered in plants that thrive in the extreme environment of the Saudi Arabian desert. They hope that the findings will be useful for identifying novel proteins for biotechnological applications and would like other researchers to embrace the tool.

As Kulmanov explains: “DeepGO-SE’s ability to analyze uncharacterized proteins can facilitate tasks such as drug discovery, metabolic pathway analysis, disease associations, protein engineering, screening for specific proteins of interest, and more.”

Reference: “Protein function prediction as approximate semantic entailment” 14 February 2024, Nature Machine Intelligence.
DOI: 10.1038/s42256-024-00795-w