MIT Research Aims to Build AI-Enabled Materials Recipe Database

Research led by a professor at MIT aims to use artificial intelligence to help materials scientists find recipes for particular materials in a sea of data.

"Computational materials scientists have made a lot of progress in the 'what' to make — what material to design based on desired properties," said Elsa Olivetti, the Atlantic Richfield Assistant Professor of Energy Studies in MIT's Department of Materials Science and Engineering (DMSE) and lead researcher on the project, in a prepared statement. "But because of that success, the bottleneck has shifted to, 'Okay, now how do I make it?'"

Olivetti and her team, which includes members from the University of Massachusetts Amherst and the University of California, Berkeley, hope their research might eventually help develop a database of materials recipes extracted from millions of papers and searchable by name, property, precursor materials or any other characteristic.

The team has created a machine learning process that can analyze papers, figure out which paragraphs hold materials recipes and classify the words of those paragraphs depending upon their role in the recipe, such as the names of target materials, pieces of equipment, numeric quantities, descriptors or operating conditions.
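To make the word-level labeling concrete, here is a toy sketch in Python of what tagged output might look like. It is not the team's method: their system learns word roles from annotated papers, while this example looks words up in a small hand-written table. The lexicon entries, the role names and the tag_tokens function are all assumptions made for illustration.

```python
import re

# Hypothetical mini-lexicon for illustration only; the system described in
# the article learns roles from annotated papers instead of a lookup table.
ROLE_LEXICON = {
    "tio2": "target_material",
    "furnace": "equipment",
    "autoclave": "equipment",
}
NUMBER = re.compile(r"^\d+(\.\d+)?$")


def tag_tokens(sentence):
    """Return (token, role) pairs using a simple lookup, not a trained model."""
    tagged = []
    for token in sentence.split():
        key = token.lower().strip(".,;")
        if NUMBER.match(key):
            role = "numeric_quantity"
        else:
            role = ROLE_LEXICON.get(key, "other")
        tagged.append((token, role))
    return tagged


tagged = tag_tokens("TiO2 powder was heated in a furnace at 500 C.")
for token, role in tagged:
    print(f"{token:10s} {role}")
```

A learned classifier would replace the lookup table, but the output would have the same shape: each word in a recipe paragraph paired with a role.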

Because this is a new area of research without many annotated research papers, the team had to annotate papers themselves and ended up with about 100 samples to train their algorithm on.

"By machine-learning standards, that's a pretty small data set," according to a news release. "To improve it, they used an algorithm developed at Google called Word2vec. Word2vec looks at the contexts in which words occur — the words' syntactic roles within sentences and the other words around them — and groups together words that tend to have similar contexts. So, for instance, if one paper contained the sentence 'We heated the titanium tetrachloride to 500 C,' and another contained the sentence 'The sodium hydroxide was heated to 500 C,' Word2vec would group 'titanium tetrachloride' and 'sodium hydroxide' together."
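The idea behind that grouping can be sketched in a few lines of Python. This is not Word2vec itself, which trains a neural network to produce dense word vectors. It is a simplified stand-in that counts neighboring words and compares the counts with cosine similarity. The function names, the window size of 3, the underscore-joined chemical names and the third "furnace" sentence are assumptions added for illustration.

```python
from collections import Counter
from math import sqrt


def context_vectors(sentences, window=3):
    """For each token, count the tokens found within `window` positions of it."""
    vectors = {}
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            ctx = vectors.setdefault(tok, Counter())
            for j in range(lo, hi):
                if j != i:
                    ctx[tokens[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Assumption: multi-word chemical names are pre-joined into single tokens.
sentences = [
    "we heated the titanium_tetrachloride to 500 c".split(),
    "the sodium_hydroxide was heated to 500 c".split(),
    "the furnace was cleaned after use".split(),  # invented contrast sentence
]

vecs = context_vectors(sentences, window=3)
sim_chem = cosine(vecs["titanium_tetrachloride"], vecs["sodium_hydroxide"])
sim_other = cosine(vecs["titanium_tetrachloride"], vecs["furnace"])
print(f"titanium_tetrachloride vs sodium_hydroxide: {sim_chem:.3f}")
print(f"titanium_tetrachloride vs furnace:          {sim_other:.3f}")
```

Because the two chemical names share neighbors such as "heated" and "to," they score as more similar to each other than either does to "furnace," which is the kind of grouping the news release describes.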

Using this technique, the researchers were able to expand their training set from about 100 papers to about 64,000. After training and testing, their system was able to identify paragraphs with recipes accurately 99 percent of the time and to accurately categorize the words within them 86 percent of the time.

"This is landmark work," said Ram Seshadri, the Fred and Linda R. Wudl Professor of Materials Science at the University of California at Santa Barbara, in a prepared statement. "The authors have taken on the difficult and ambitious challenge of capturing, through AI methods, strategies employed for the preparation of new materials. The work demonstrates the power of machine learning, but it would be accurate to say that the eventual judge of success or failure would require convincing practitioners that the utility of such methods can enable them to abandon their more instinctual approaches."

About the Author

Joshua Bolkan is contributing editor for Campus Technology, THE Journal and STEAM Universe.
