If you're looking for a needle in a haystack, it's best to know what hay looks like. An international team of researchers has applied this idea to the search for new pharmaceuticals, developing a technique that reduces the chances of simply rediscovering known compounds.
In an article published today in the journal Nature Communications, researchers from Carnegie Mellon University; the University of California, San Diego; and St. Petersburg State University in Russia describe a new means of searching vast repositories of compounds produced by microbes. By analyzing the mass spectra of the compounds, they were able to identify known compounds within the repository and eliminate them from further analysis, focusing instead on the unknown variants -- the needles within the haystack -- that might prove to be better or more efficient antibiotics, anticancer drugs or other pharmaceuticals.
Running on 100 computers, the algorithm, called Dereplicator+, took just a week to sort through a billion mass spectra in the Global Natural Products Social molecular network at UC San Diego and identify more than 5,000 promising, unknown compounds that merit further investigation, said Hosein Mohimani, assistant professor in CMU's Computational Biology Department and first author on the article.
The algorithm that powers this molecular search engine is now available for use by any investigator to study additional repositories.
In the past, mass spectrometry data repositories have been underused because they were difficult to search and because search efforts to date have been plagued by high rates of rediscovery of known compounds.
"It's unbelievable how many times people have rediscovered penicillin," Mohimani said.
Analyzing the compounds' mass spectra -- essentially, a measurement of the masses within a sample that has been ionized -- is a relatively inexpensive way of identifying possible new pharmaceuticals. But existing techniques were largely limited to peptides, which have simple structures such as chains and loops.
"We were only looking at the tip of the iceberg," Mohimani said.
To analyze the larger number of complex compounds that have entangled structures and numerous loops and branches, the researchers developed a method for predicting how a mass spectrometer would break apart the molecules. Beginning with the weakest rings, the method simulated what would happen as the molecules came apart. Using 5,000 known compounds and their mass spectra, they trained a computer model that could then be used to predict how other compounds would break down.
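The idea of simulating how a molecule comes apart and comparing the result with a measured spectrum can be illustrated with a small Python sketch. This is not the Dereplicator+ algorithm: the toy molecule, the integer atom masses, the two-bond cut limit, the 0.02 mass tolerance and the function names are all assumptions made for illustration, and the sketch ignores ion charge, rearrangements and the trained model described above.

```python
from itertools import combinations


def components(atoms, bonds):
    """Return the connected components (sets of atom ids) of a molecular graph."""
    adj = {a: set() for a in atoms}
    for u, v in bonds:
        adj[u].add(v)
        adj[v].add(u)
    seen, comps = set(), []
    for start in atoms:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adj[node] - comp)
        seen |= comp
        comps.append(comp)
    return comps


def fragment_masses(atom_mass, bonds, max_cuts=2):
    """Masses of all pieces obtained by breaking up to `max_cuts` bonds.

    One cut is enough to split a chain; a ring needs two cuts, which is
    why cyclic and branched molecules are harder than simple chains.
    """
    atoms = list(atom_mass)
    masses = set()
    for k in range(1, max_cuts + 1):
        for cut in combinations(bonds, k):
            remaining = [b for b in bonds if b not in cut]
            comps = components(atoms, remaining)
            if len(comps) > 1:  # the cut actually separated the molecule
                for comp in comps:
                    masses.add(round(sum(atom_mass[a] for a in comp), 3))
    return masses


def shared_peaks(predicted, observed, tol=0.02):
    """Count observed peaks that lie within `tol` of some predicted mass."""
    return sum(any(abs(p - o) <= tol for p in predicted) for o in observed)


# Hypothetical toy molecule: a four-atom ring (atoms 0-3) with a one-atom tail (atom 4).
atom_mass = {0: 12.0, 1: 14.0, 2: 16.0, 3: 12.0, 4: 15.0}
bonds = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 4)]

predicted = fragment_masses(atom_mass, bonds)
observed = [15.01, 54.0, 100.0]  # made-up peak list, not real data
print(shared_peaks(predicted, observed))  # 2
```

A candidate compound whose predicted fragments explain many peaks in a spectrum would be treated as a likely match under this simplified scoring; a spectrum that no known compound explains well would be a candidate for further study.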
Mohimani said Dereplicator+ can not only identify known compounds that don't need to be investigated further, but also find less common variants of known compounds that would likely go undetected within a sample.
In addition to Mohimani, investigators included Alexey Gurevich, Alexander Slemov, Alla Mikheenko, Anton Korobeynikov and Egor Shcherbin of St. Petersburg; Louis-Felix Nothias, Pieter C. Dorrestein and Pavel A. Pevzner of UC San Diego; and Liu Cao of CMU's Computational Biology Department.
The National Institutes of Health, Carnegie Mellon and the Russian Science Foundation supported this research.
Source: Carnegie Mellon University