Quantum statistical approach quiets big, noisy data

April 3, 2025

By Louis DiPietro

Big data got too big.

A research team with statisticians from Cornell has developed a data representation method inspired by quantum mechanics that handles large data sets more efficiently than traditional methods by simplifying them and filtering out noise.

This method could spur innovation in data-rich but statistically intimidating fields, like health care and epigenetics, where traditional data methods have thus far proved insufficient.

“Physicists and allied scientists have developed quantum mechanics-based tools that offer concise mathematical representations of complex data,” said Martin Wells, the Charles A. Alexander Professor of Statistical Sciences in the Cornell Ann S. Bowers College of Computing and Information Science and the ILR School. He is a co-author of “Robust Estimation of the Intrinsic Dimension of Data Sets with Quantum Cognition Machine Learning,” which was published in Scientific Reports on February 26. “We're borrowing and using their mathematical structure from quantum mechanics to understand the structure of data.”

Before spinning data into innovation or medical breakthroughs, data scientists must first get a sense of the data's complexity. To do this, scholars – particularly those working in areas like network analysis and health sciences – have traditionally turned to a technique called intrinsic dimension estimation, which helps them get the gist of a massive data set without analyzing every detail. The problem is that intrinsic dimension estimation can get thrown off by noise and complexity, and real-world data is often both noisy and complex, researchers said.

“When you use these intrinsic dimension estimation techniques, they very often get the wrong answer by quite a big margin, and they disagree with each other,” said Luca Candelori, lead author and director of research at Qognitive, an artificial intelligence startup. “It's very hard to apply them on real data sets and to get an actual estimate.”
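To make the noise problem concrete, here is a minimal sketch of one conventional intrinsic dimension estimator, a maximum-likelihood form of the two-nearest-neighbor idea. It is not the team's quantum cognition method, which the article does not detail. The helper names (`two_nn_dimension`, `noisy_circle`), the circle data, and the noise level are all illustrative assumptions.

```python
import math
import random


def two_nn_dimension(points):
    """Estimate intrinsic dimension from nearest-neighbor distance ratios.

    For each point, take r1 and r2, the distances to its first and second
    nearest neighbors. Under a locally uniform density, log(r2 / r1) is
    exponentially distributed with rate d, so the maximum-likelihood
    estimate of d is N / sum(log(r2 / r1)).
    """
    log_ratios = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        r1, r2 = dists[0], dists[1]
        if r1 > 0:  # skip exact duplicates, which would divide by zero
            log_ratios.append(math.log(r2 / r1))
    return len(log_ratios) / sum(log_ratios)


def noisy_circle(n, noise, ambient_dim=10, seed=0):
    """Sample n points from a circle (a 1-D shape) embedded in ambient_dim
    dimensions, adding Gaussian noise of the given scale to every coordinate.
    Hypothetical test data, chosen only for illustration."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        theta = rng.uniform(0.0, 2.0 * math.pi)
        base = [math.cos(theta), math.sin(theta)] + [0.0] * (ambient_dim - 2)
        points.append([x + rng.gauss(0.0, noise) for x in base])
    return points


if __name__ == "__main__":
    for noise in (0.0, 0.1):
        estimate = two_nn_dimension(noisy_circle(400, noise))
        print(f"noise={noise}: estimated dimension {estimate:.2f}")
```

On clean data the estimate should land near the circle's true dimension of 1. Once noise is added in all ten coordinates, the same estimator should report a much larger number, even though the underlying shape has not changed. That drift is the kind of instability the researchers describe.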

The team’s AI-powered model represents a finely tuned version of the intrinsic dimension estimation technique, making it more accurate, less susceptible to noise, and thus better suited to handle today’s complicated data sets. In tests of both real-world data and artificial data sets, which were intentionally made noisier, the team’s model maintained consistent estimates, researchers said.

The team’s method builds on “quantum cognition machine learning,” an approach to AI training developed by Qognitive that is modeled on the flexible, nuanced ways humans think rather than on traditional probability theory, which is standard practice today. Using these traditional methods to train state-of-the-art tools, like large language models, costs too much and saps too much energy, Candelori said.

“A lot of the motivation for developing quantum cognition machine learning is to try to find a more economical way of representing data and the distribution of data,” Candelori said.

Researchers note that while quantum cognition machine learning uses quantum mathematics, it does not require powerful and pricey quantum computing hardware; it can be run on standard laptops.

“This quantum aspect is a game-changer,” Wells said. “It provides access to mathematical and statistical tools that weren’t available just three years ago.”

Along with Candelori and Wells, the paper's authors are: Cameron Hogan, a doctoral student in the field of statistics; Alexander Abanov, professor in the Department of Physics and Astronomy at Stony Brook University; Mengjia Xu, assistant professor of data science at the New Jersey Institute of Technology; and Kharen Musaelian, Jeffrey Berger, Vahagn Kirakosyan, Ryan Samson, James Smith, and Dario Villani, all of Qognitive.

This research was supported by the National Institutes of Health, the U.S. Department of Energy, the U.S. Air Force Office of Scientific Research, and the Department of Mathematics at King’s College London.

Louis DiPietro is a writer for the Cornell Ann S. Bowers College of Computing and Information Science.
