Proteins are ubiquitous in all known life forms. They play instrumental roles in many biological processes without which life would not exist. Proteins are chains of amino acids that fold into many different three-dimensional shapes, and each shape is closely tied to the function the protein performs. Any abnormality in the way a protein folds can therefore render it pathogenic or even toxic. Misfolded proteins are at the root of numerous diseases and allergies. Understanding how proteins fold has huge implications for our understanding of life processes and for medicine. This knowledge is vital for developing new drugs to treat or cure diseases in the future.
Ongoing attempts to model or decipher the rules behind protein folding rely on costly computational resources, and have given rise to large-scale distributed computing projects such as [email protected]. But the technical hurdles under current computing paradigms are enormous. New computing approaches are urgently needed to speed up progress in this field.
The potential of machine learning
Thanks to the fast-evolving field of machine learning, protein folding research may be on the verge of a paradigm shift. Researchers at the University of Toronto developed a method to determine the actual 3D shape of proteins by combining advanced microscopy (electron cryomicroscopy) with machine learning algorithms. An extremely powerful microscope takes thousands of pictures of a frozen protein sample, whose molecules are smaller than the wavelength of visible light. The images are then fed into the machine learning software, which uses them to work out the protein’s 3D structure.
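To give a feel for what the software has to work with, here is a toy sketch in Python. It assumes a heavily simplified picture of the imaging step: each picture is treated as a noisy 2D projection of the protein seen from an unknown angle. The point-atom model, the grid size, and the function names random_rotation and project are all illustrative assumptions, not part of the Toronto team's software.

```python
import math
import random


def random_rotation(rng):
    """Build a 3x3 rotation matrix from three random Euler angles.

    The angles are drawn independently, so orientations are not perfectly
    uniform; that is acceptable for a toy illustration.
    """
    a, b, c = (rng.uniform(0.0, 2.0 * math.pi) for _ in range(3))
    ca, sa = math.cos(a), math.sin(a)
    cb, sb = math.cos(b), math.sin(b)
    cc, sc = math.cos(c), math.sin(c)
    # R = Rz(a) * Ry(b) * Rx(c)
    return [
        [ca * cb, ca * sb * sc - sa * cc, ca * sb * cc + sa * sc],
        [sa * cb, sa * sb * sc + ca * cc, sa * sb * cc - ca * sc],
        [-sb, cb * sc, cb * cc],
    ]


def project(points, rotation, size=16, noise=0.0, rng=None):
    """Rotate 3D points, discard depth, and bin them onto a size x size image."""
    rng = rng or random.Random()
    half = size / 2
    image = [[0.0] * size for _ in range(size)]
    for x, y, z in points:
        # Rotate, then keep only the in-plane coordinates (depth is lost).
        px = rotation[0][0] * x + rotation[0][1] * y + rotation[0][2] * z
        py = rotation[1][0] * x + rotation[1][1] * y + rotation[1][2] * z
        col = math.floor(px + half)
        row = math.floor(py + half)
        if 0 <= row < size and 0 <= col < size:
            image[row][col] += 1.0
    if noise > 0:
        # Hypothetical noise model: independent Gaussian noise per pixel.
        image = [[v + rng.gauss(0.0, noise) for v in r] for r in image]
    return image
```

Calling project repeatedly with a fresh random_rotation each time yields a stack of flat, noisy views of the same object. The reconstruction problem is to recover the 3D arrangement from such a stack without being told the viewing angles.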
One of the greatest advantages of employing machine learning in this manner is that determining the correct protein structure doesn’t require prior knowledge about the object being modeled. The puzzle is solved autonomously. The researchers claim that this method can solve the 3D shape of a protein in a matter of minutes on a single computer. This is significantly faster than conventional methods, which take days or even weeks to do the same on large clusters of computers and still rely on human input of prior knowledge for accuracy.
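The idea of solving the puzzle without a starting template can be illustrated on a much simpler problem. The Python sketch below is a one-dimensional toy, not the researchers' algorithm: it recovers a short signal from noisy copies that have each been shifted by an unknown amount, starting from a purely random guess. It uses a plain align-and-average loop, and every name in it (make_observations, reconstruct, similarity) is a hypothetical helper introduced for illustration.

```python
import random


def shift(signal, k):
    """Circularly shift a list k places to the right."""
    n = len(signal)
    return [signal[(i - k) % n] for i in range(n)]


def make_observations(truth, count, noise, rng):
    """Create noisy copies of truth, each with an unknown random shift."""
    n = len(truth)
    return [
        [v + rng.gauss(0.0, noise) for v in shift(truth, rng.randrange(n))]
        for _ in range(count)
    ]


def best_shift(observation, reference):
    """Return the shift of observation that correlates best with reference."""
    def score(k):
        return sum(a * b for a, b in zip(shift(observation, k), reference))
    return max(range(len(observation)), key=score)


def reconstruct(observations, iterations=10, rng=None):
    """Align every observation to the current estimate, average, and repeat."""
    rng = rng or random.Random()
    n = len(observations[0])
    estimate = [rng.gauss(0.0, 1.0) for _ in range(n)]  # random start, no template
    for _ in range(iterations):
        aligned = [shift(obs, best_shift(obs, estimate)) for obs in observations]
        estimate = [sum(column) / len(aligned) for column in zip(*aligned)]
    return estimate


def similarity(estimate, truth):
    """Best cosine similarity between estimate and any circular shift of truth."""
    norm_e = sum(v * v for v in estimate) ** 0.5
    norm_t = sum(v * v for v in truth) ** 0.5
    return max(
        sum(a * b for a, b in zip(estimate, shift(truth, k))) / (norm_e * norm_t)
        for k in range(len(truth))
    )
```

Because the shifts are never revealed, the result can only match the original up to an overall shift, which is why similarity compares against every shifted version of the truth. The 3D problem is far harder, but the spirit is the same: orientation and structure are estimated together from the data alone.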
Faster and more precise medicine
The team has made the machine learning algorithms available as software for academic or commercial use, under the name cryoSPARC. This breakthrough could save a great deal of time and money in research fields that study life processes at the atomic or molecular level – cell biology, genetics, biochemistry, immunology and pharmacology. The greatest impact on people’s lives would be the faster development of new pharmaceuticals that could cure previously incurable diseases, as well as more efficient drugs that cause few or no side effects. A faster way of discovering protein structures therefore has far-reaching implications for medicine.