MIT, Stanford Project Protects Security of Genomic Data for Open Research

In a paper appearing in the journal Nature Biotechnology, researchers from MIT and Stanford University have described a new system they've developed for protecting the privacy of people who contribute their genomic data to large-scale biomedical studies. These studies are intended to uncover links between genetic variations and diseases, helping to identify their causes.

As the researchers explained, most sequenced genomes are currently kept in "strict access-controlled repositories." Making the data freely available for "association studies" could speed up the research. Yet people concerned about the privacy of their genetic makeup may refrain from contributing their genomes to scientific studies. For example, one expert has claimed to be able to determine the shape of a person's face from raw genomic data, and researchers have shown how to combine genomic information with other data to identify an individual.

The protocol, developed by MIT's Hyunghoon Cho and Bonnie Berger and Stanford's David Wu, is intended to help make currently restricted data available to the scientific community, potentially enabling secure genome crowdsourcing while still making sure individuals can contribute their genomes to a study without compromising their privacy.

The heart of the technique is to distribute sensitive data among multiple servers. As an MIT article on the topic explained, to store the number x, the system might send a random number, r, to one server, and x-r to another. Neither server would be able to calculate x on its own. But together, they could "still perform useful operations." If a cybercriminal wanted to figure out what x was, he or she would need to break into both servers — or as many servers as were involved. As servers are added to the setup, the cryptography approach becomes more complicated.
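The splitting idea described above can be illustrated with a minimal Python sketch of additive secret sharing. This is not the researchers' implementation. The modulus, the function names (`share`, `reconstruct`, `add_shared`) and the use of modular arithmetic are assumptions made here for illustration; the article only says that one server receives r and another receives x-r.

```python
import secrets

# Assumed modulus for illustration; the article does not specify one.
MODULUS = 2**61 - 1


def share(x, n_servers=2, modulus=MODULUS):
    """Split x into n_servers random-looking shares that sum to x (mod modulus)."""
    shares = [secrets.randbelow(modulus) for _ in range(n_servers - 1)]
    # The last share plays the role of "x - r" in the two-server case.
    shares.append((x - sum(shares)) % modulus)
    return shares


def reconstruct(shares, modulus=MODULUS):
    """Recover the secret; this needs every share, i.e. every server."""
    return sum(shares) % modulus


def add_shared(a_shares, b_shares, modulus=MODULUS):
    """Each server adds its own two shares locally, without seeing the secrets."""
    return [(a + b) % modulus for a, b in zip(a_shares, b_shares)]


if __name__ == "__main__":
    x_shares = share(5)
    y_shares = share(7)
    total_shares = add_shared(x_shares, y_shares)
    print(reconstruct(total_shares))
```

Addition is the simplest of the "useful operations" servers can perform on shares: each server works only on its own pieces, and the sum appears only when all shares are combined. Multiplication and the other steps of an association study need additional protocol machinery that this sketch does not attempt.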

Association studies involve a massive table — or matrix — that maps the genomes in the database against the locations of genetic variations. These variations typically number about a million, requiring a million-by-million matrix, making security a complicated affair and the research effort time-consuming.

But Cho, Berger and Wu have developed techniques to simplify the security calculations and speed up the processing of their system. Based on those techniques, the system accurately reproduced three published genome-wide association studies involving up to 23,000 individual genomes. The approach could feasibly scale to a million individuals, they predict.

"As biomedical researchers, we're frustrated by the lack of data and by the access-controlled repositories," said Berger, a professor of math. "We anticipate a future with a landscape of massively distributed genomic data, where private individuals take ownership of their own personal genomes, and institutes as well as hospitals build their own private genomic databases. Our work provides a roadmap for pooling together this vast amount of genomic data to enable scientific progress."

The paper is available behind a registration wall at Nature Biotechnology.

About the Author

Dian Schaffhauser is a former senior contributing editor for 1105 Media's education publications THE Journal, Campus Technology and Spaces4Learning.
