MIT Researchers Aim to Bring Neural Nets to Smartphones

Neural networks have been behind many recent advances in AI, underpinning systems that recognize speech or individual faces, among other tasks. But neural nets are also large and power-hungry, making them poor candidates for personal devices such as smartphones and forcing apps that rely on them to upload data to a server for processing. Researchers at MIT are working on a new kind of computer chip that could change that.

The new chip makes neural-network computation three to seven times faster and reduces energy consumption by 94 to 95 percent, according to the research team.

Processors generally keep memory in one part of the chip and processing in another, and as computations occur, data is moved back and forth between the two.

"Since these machine learning algorithms need so many computations, this transferring back and forth of data is the dominant portion of the energy consumption," said Avishek Biswas, an MIT graduate student in electrical engineering and computer science and leader of the team developing the chip, in a prepared statement. "But the computation these algorithms do can be simplified to one specific operation, called the dot product. Our approach was, can we implement this dot-product functionality inside the memory so that you don't need to transfer this data back and forth?"

Most neural nets are arranged in layers of nodes, with nodes in one layer accepting input from the layer below and passing their output up to the layer above. Each connection between nodes has a weight assigned to it, which determines how much the data traveling over that connection contributes to the next round of computation.

"A node receiving data from multiple nodes in the layer below will multiply each input by the weight of the corresponding connection and sum the results," according to an MIT news release. "That operation — the summation of multiplications — is the definition of a dot product. If the dot product exceeds some threshold value, the node will transmit it to nodes in the next layer, over connections with their own weights."

In the new chip, input values are converted into voltages and multiplied by the appropriate weights, and the resulting voltages are added together. Only then is the combined voltage converted back into digital data for storage and further processing. This approach lets the chip compute the dot products for multiple nodes in a single step, eliminating the need to shuttle data back and forth repeatedly; the prototype handles 16 nodes at a time.
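The chip performs that multiply-and-sum in the analog voltage domain, but the payoff can be sketched digitally: computing the dot products for all 16 nodes at once is a single matrix-vector product rather than 16 separate trips through memory. The NumPy sketch below is only an analogy for that batching, and the array shapes are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

inputs = rng.standard_normal(16)         # one input value per incoming connection
weights = rng.standard_normal((16, 16))  # one row of weights per node

# Node by node: 16 separate dot products, analogous to repeatedly
# moving data between memory and processor for each computation.
one_at_a_time = np.array([weights[i] @ inputs for i in range(16)])

# All at once: a single matrix-vector product yields the dot products
# for all 16 nodes in one step, as the chip does in the voltage domain.
all_at_once = weights @ inputs

assert np.allclose(one_at_a_time, all_at_once)
```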

"This is a promising real-world demonstration of SRAM-based in-memory analog computing for deep-learning applications," said Dario Gil, vice president of artificial intelligence at IBM, according to information released by MIT. "The results show impressive specifications for the energy-efficient implementation of convolution operations with memory arrays. It certainly will open the possibility to employ more complex convolutional neural networks for image and video classifications in IoT [the internet of things] in the future."

About the Author

Joshua Bolkan is contributing editor for Campus Technology, THE Journal and STEAM Universe. He can be reached at [email protected].