Big tech companies such as Netflix, Facebook and Amazon use machine learning software to predict what users will do next. For example, Netflix looks at what you have been watching to predict what you would like to watch next. Similarly, Facebook analyses your current friends and activity on the platform to suggest new connections and pages to ‘like’.
Scientists from the University of Cambridge believe this sort of artificial intelligence (AI) could be used to “train a large-scale language model” to examine what happens when something “goes wrong with proteins” to cause disease.
Proteins are required for the structure, function and regulation of the body’s tissues and organs.
Diseases such as Alzheimer’s occur when proteins go “rogue” and form clumps which kill healthy nerve cells in the brain.
A healthy brain has the ability to dispose of these clumps, known as aggregates, but experts believe some disordered proteins also form droplets of proteins called condensates.
Condensates merge freely with other clumps and, unlike aggregates, which are permanent, can form and reform, according to the research published in the journal PNAS.
Professor Tuomas Knowles, lead author of the paper and a Fellow at St John’s College, said: “Protein condensates have recently attracted a lot of attention in the scientific world because they control key events in the cell such as gene expression – how our DNA is converted into proteins – and protein synthesis – how the cells make proteins.
“Any defects connected with these protein droplets can lead to diseases such as cancer.
“This is why bringing natural language processing technology into research into the molecular origins of protein malfunction is vital if we want to be able to correct the grammatical mistakes inside cells that cause disease.”
According to the researchers, the AI could be used to make more advanced discoveries than humans could.
Prof Knowles added: “Bringing machine-learning technology into research into neurodegenerative diseases and cancer is an absolute game-changer.
“Ultimately, the aim will be to use artificial intelligence to develop targeted drugs to dramatically ease symptoms or to prevent dementia happening at all.”
Dr Kadi Liis Saar, first author of the paper and a Research Fellow at St John’s College, said: “The human body is home to thousands and thousands of proteins and scientists don’t yet know the function of many of them.
“We asked a neural network based language model to learn the language of proteins.
“We specifically asked the programme to learn the language of shapeshifting biomolecular condensates – droplets of proteins found in cells – that scientists really need to understand to crack the language of biological function and malfunction that cause cancer and neurodegenerative diseases like Alzheimer’s.
“We found it could learn, without being explicitly told, what scientists have already discovered about the language of proteins over decades of research.
“We fed the algorithm all of the data held on the known proteins so it could learn and predict the language of proteins in the same way these models learn about human language and how WhatsApp knows how to suggest words for you to use.
“Then we were able to ask it about the specific grammar that leads only some proteins to form condensates inside cells.
“It is a very challenging problem and unlocking it will help us learn the rules of the language of disease.”