AI reveals previously unknown biology – we may not know half of what’s inside our cells

UC San Diego researchers are introducing Multi-Scale Integrated Cell (MuSIC), a technique that combines microscopy, biochemistry and artificial intelligence, revealing previously unknown cell components that could provide new clues to human development and disease. (Artist’s conceptual rendering.) Credit: UC San Diego Health Sciences

Artificial intelligence-based technique reveals previously unknown cell components that may provide new clues to human development and disease.

Most human diseases can be traced to malfunctioning parts of a cell: a tumor can grow because a gene was not accurately translated into a particular protein, for example, or a metabolic disease can arise because mitochondria are not firing properly. But to understand which parts of a cell can go wrong in a disease, scientists first need a complete list of parts.

By combining microscopy, biochemical techniques and artificial intelligence, researchers at the University of California San Diego School of Medicine and collaborators have taken what they believe could be a major leap forward in understanding human cells.

The technique, known as Multi-Scale Integrated Cell (MuSIC), will be described on November 24, 2021 in Nature.

“If you picture a cell, you probably see the colorful diagram in your cell biology textbook, showing mitochondria, endoplasmic reticulum, and nucleus. But is that the whole story? Absolutely not,” said Trey Ideker, PhD, a professor at the UC San Diego School of Medicine and Moores Cancer Center. “Scientists have long realized there’s more we don’t know than we know, but now we finally have a way to look deeper.”

Ideker led the study with Emma Lundberg, PhD, of KTH Royal Institute of Technology in Stockholm, Sweden and Stanford University.

Classic cell versus MuSIC

Left: Classic textbook cell diagrams imply that all parts are clearly visible and defined. (Credit: OpenStax/Wikimedia). Right: A new cell map generated by the MuSIC technique reveals many new components. Gold nodes represent known cell components; purple nodes represent new components. The size of each node reflects the number of distinct proteins in that component. Credit: UC San Diego Health Sciences

In the pilot study, MuSIC revealed about 70 components in a human kidney cell line, half of which had never been seen before. In one example, the researchers saw a group of proteins that formed an unknown structure. Working with UC San Diego colleague Gene Yeo, PhD, they eventually determined that the structure is a novel complex of proteins that bind RNA. The complex is likely involved in splicing, an important cellular event that allows for gene-to-protein translation and helps determine which genes are activated at what time.

The insides of cells — and the many proteins found there — are usually studied using one of two techniques: microscope imaging or biophysical association. With imaging, researchers add fluorescent tags of different colors to proteins of interest and track their movements and associations across the field of view of the microscope. To look at biophysical associations, researchers can use an antibody specific to a protein to pull it out of the cell and see what else is attached to it.

The team has been interested in mapping the internal workings of cells for years. What is different about MuSIC is its use of deep learning to map the cell directly from cellular microscopy images.

“The combination of these technologies is unique and powerful because it is the first time that measurements at vastly different scales have been brought together,” said the study’s first author, Yue Qin, a bioinformatics and systems biology graduate student in Ideker’s lab.

Microscopes allow scientists to see down to the level of a single micron, about the size of some organelles, such as mitochondria. Smaller elements, such as individual proteins and protein complexes, cannot be seen through a microscope. Biochemical techniques, which start with a single protein, allow scientists to get down to the nanometer scale. (A nanometer is one billionth of a meter, or one thousandth of a micron.)
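To make the size of that gap concrete, here is a minimal arithmetic sketch. The function names and the example sizes (a 1-micron organelle, a 5-nanometer protein) are illustrative assumptions, not figures from the study.

```python
# Illustrative unit arithmetic only; the example sizes below are assumptions.
NM_PER_MICRON = 1_000          # 1 micron = 1,000 nanometers
NM_PER_METER = 1_000_000_000   # 1 meter = one billion nanometers


def microns_to_nm(microns: float) -> float:
    """Convert a length in microns to nanometers."""
    return microns * NM_PER_MICRON


def scale_gap(larger_microns: float, smaller_nm: float) -> float:
    """Return how many times larger the first length is than the second."""
    return microns_to_nm(larger_microns) / smaller_nm


# Hypothetical round numbers: a 1-micron organelle versus a 5-nanometer protein.
organelle_microns = 1.0
protein_nm = 5.0
print(scale_gap(organelle_microns, protein_nm))  # 200.0
```

With these assumed sizes, the organelle is 200 times wider than the protein, which gives a rough sense of the range that imaging and biochemical measurements have to span between them.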

“But how do you bridge that gap from nanometer to micron scale? That has long been a major hurdle in the biological sciences,” said Ideker, who is also founder of the UC Cancer Cell Map Initiative and the UC San Diego Center for Computational Biology and Bioinformatics. “It turns out you can do it with artificial intelligence — looking at data from multiple sources and asking the system to piece it together into a model of a cell.”

The team trained the MuSIC artificial intelligence platform to look at all the data and construct a model of the cell. The system does not yet map the cell contents to specific locations, as a textbook diagram does, in part because those locations are not necessarily fixed. Instead, the locations of the components are fluid and change depending on the cell type and situation.

Ideker noted that this was a pilot study to test MuSIC. The researchers looked at only 661 proteins and one cell type.

“The obvious next step is to blast through the whole human cell,” Ideker said, “and then move on to different cell types, people, and species. Ultimately, we may be able to better understand the molecular basis of many diseases by comparing what’s different between healthy and diseased cells.”

Reference: “A multi-scale map of cell structure that fuses protein images and interactions” by Yue Qin, Edward L. Huttlin, Casper F. Winsnes, Maya L. Gosztyla, Ludivine Wacheul, Marcus R. Kelly, Steven M. Blue, Fan Zheng, Michael Chen, Leah V. Schaffer, Katherine Licon, Anna Bäckström, Laura Pontano Vaites, John J. Lee, Wei Ouyang, Sophie N. Liu, Tian Zhang, Erica Silva, Jisoo Park, Adriana Pitea, Jason F. Kreisberg, Steven P. Gygi, Jianzhu Ma, J. Wade Harper, Gene W. Yeo, Denis LJ Lafontaine, Emma Lundberg and Trey Ideker, November 24, 2021, Nature.
DOI: 10.1038/s41586-021-04115-9

Co-authors include: Maya L. Gosztyla, Marcus R. Kelly, Steven M. Blue, Fan Zheng, Michael Chen, Leah V. Schaffer, Katherine Licon, John J. Lee, Sophie N. Liu, Erica Silva, Jisoo Park, Adriana Pitea, Jason F. Kreisberg, UC San Diego; Edward L. Huttlin, Laura Pontano Vaites, Tian Zhang, Steven P. Gygi, J. Wade Harper, Harvard Medical School; Casper F. Winsnes, Anna Bäckström, Wei Ouyang, KTH Royal Institute of Technology; Ludivine Wacheul, Denis LJ Lafontaine, Université Libre de Bruxelles; and Jianzhu Ma, Peking University.

Funding for this study came in part from the National Institutes of Health (grants U54CA209891, U01MH115747, F99CA264422, P41GM103504, R01HG009979, U24HG006673, U41HG009889, R01HL137223, R01HG004659, R50CA243885), Google-Persson Foundation2016 (grant). Council (grant 2017-05327), Fonds de la Recherche Scientifique (Belgium), Université Libre de Bruxelles, European Joint Program on Rare Diseases, Région Wallonne, Internationale Brachet Stiftung and Epitran COST action (grant CA16120).

Disclosures: Trey Ideker is a co-founder of, serves on the scientific advisory board of, and has an equity interest in Data4Cure, Inc. Ideker also serves on the scientific advisory board of, has an equity interest in, and receives sponsored research funding from Ideaya BioSciences, Inc. Gene Yeo is a co-founder of, a member of the board of directors and scientific advisory board of, a shareholder in, and a paid consultant for Locanabio and Eclipse BioInnovations. Yeo is also a visiting professor at the National University of Singapore. The terms of these arrangements have been reviewed and approved by the University of California San Diego in accordance with its conflict of interest policy. Emma Lundberg serves on the scientific advisory boards of and has equity interests in Cartography Biosciences, Nautilus Biotechnology and Interline Therapeutics. J. Wade Harper is a co-founder of, serves on the scientific advisory board of, and has an equity interest in Caraway Therapeutics. Harper is also a founding scientific advisor to Interline Therapeutics.
