MIT Research Aims to Build AI-Enabled Materials Recipe Database

Research led by a professor at MIT aims to use artificial intelligence to help materials scientists find recipes for particular materials in a sea of data.

"Computational materials scientists have made a lot of progress in the 'what' to make — what material to design based on desired properties," said Elsa Olivetti, the Atlantic Richfield Assistant Professor of Energy Studies in MIT's Department of Materials Science and Engineering (DMSE) and lead researcher on the project, in a prepared statement. "But because of that success, the bottleneck has shifted to, 'Okay, now how do I make it?'"

Olivetti and her team, which includes members from the University of Massachusetts Amherst and the University of California, Berkeley, hope their research might eventually help to develop a database of materials recipes extracted from millions of papers and searchable by name, property, precursor materials or any other characteristic.

The team has created a machine learning process that can analyze papers, figure out which paragraphs hold materials recipes and classify the words of those paragraphs depending upon their role in the recipe, such as the names of target materials, pieces of equipment, numeric quantities, descriptors or operating conditions.

Because this is a new area of research without many annotated research papers, the team had to annotate papers themselves and ended up with about 100 samples to train their algorithm on.

"By machine-learning standards, that's a pretty small data set," according to a news release. "To improve it, they used an algorithm developed at Google called Word2vec. Word2vec looks at the contexts in which words occur — the words' syntactic roles within sentences and the other words around them — and groups together words that tend to have similar contexts. So, for instance, if one paper contained the sentence 'We heated the titanium tetrachloride to 500 C,' and another contained the sentence 'The sodium hydroxide was heated to 500 C,' Word2vec would group 'titanium tetrachloride' and 'sodium hydroxide' together."
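The grouping idea in that example can be illustrated with a small sketch. This is not the researchers' code and it is not Word2vec itself, which learns dense word vectors with a shallow neural network. It is a simplified, count-based stand-in that only shows how words sharing similar neighbors end up looking alike. The third sentence, the underscore-joined chemical names, the window size and the function names are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt

# Toy corpus: the two sentences quoted in the news release, plus one
# unrelated sentence added here for contrast (an assumption).
# Multi-word chemical names are pre-joined with underscores so that a
# plain split() treats each as a single token.
SENTENCES = [
    "we heated the titanium_tetrachloride to 500 c",
    "the sodium_hydroxide was heated to 500 c",
    "the furnace was cleaned after each run",
]


def context_vector(target, sentences, window=3):
    """Count the words found within `window` positions of `target`."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        for i, token in enumerate(tokens):
            if token != target:
                continue
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    counts[tokens[j]] += 1
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


ticl4 = context_vector("titanium_tetrachloride", SENTENCES)
naoh = context_vector("sodium_hydroxide", SENTENCES)
furnace = context_vector("furnace", SENTENCES)

# The two chemicals share neighbors such as "heated" and "to";
# "furnace" shares only "the" with titanium tetrachloride.
sim_chemicals = cosine(ticl4, naoh)
sim_unrelated = cosine(ticl4, furnace)
print(sim_chemicals > sim_unrelated)
```

On this toy corpus the two chemical names score as more similar to each other than either does to "furnace," because they appear next to the same words. A trained Word2vec model captures the same kind of signal at far larger scale.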

Using this technique, the researchers were able to expand their training set from about 100 papers to about 64,000. After training and testing, their system was able to identify paragraphs with recipes accurately 99 percent of the time and to accurately categorize the words within them 86 percent of the time.

"This is landmark work," said Ram Seshadri, the Fred and Linda R. Wudl Professor of Materials Science at the University of California at Santa Barbara, in a prepared statement. "The authors have taken on the difficult and ambitious challenge of capturing, through AI methods, strategies employed for the preparation of new materials. The work demonstrates the power of machine learning, but it would be accurate to say that the eventual judge of success or failure would require convincing practitioners that the utility of such methods can enable them to abandon their more instinctual approaches."

About the Author

Joshua Bolkan is contributing editor for Campus Technology, THE Journal and STEAM Universe.
