MIT Researchers Improve Datacenter Cache Efficiency with Flash Memory
Researchers from the Massachusetts Institute of Technology have found a way to reduce data center energy usage.
The team, from MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL), reported that it can reduce data centers' energy requirements by using flash memory — the same kind used in smartphones — for data center caching.
A data center for a major service might have up to 1,000 servers devoted just to caching the most frequent database queries for faster access.
"Per gigabyte of memory, flash consumes about 5 percent as much energy as RAM and costs about one-tenth as much," according to information released by MIT. "It also has about 100 times the storage density, meaning that more data can be crammed into a smaller space. In addition to costing less and consuming less power, a flash caching system could dramatically reduce the number of cache servers required by a data center."
But flash is also much slower than RAM.
"That's where the disbelief comes in," said MIT's Arvind, senior author on the conference paper, in a prepared statement. "People say, 'Really? You can do this with flash memory?' Access time in flash is 10,000 times longer than in DRAM [dynamic RAM]."
But the researchers say that doesn't matter: the difference between getting a result from traditional servers (0.0002 seconds) and from flash-based ones (0.0004 seconds) is too small for users to notice.
What is a challenge, however, is keeping up with all the queries. The system created by Arvind and his team, BlueCache, uses a variety of techniques to keep up.
First, BlueCache uses pipelining, meaning it begins working on subsequent queries before earlier ones have finished. It can handle up to 10,000 queries at once; although the first result takes 200 microseconds to return, subsequent results follow at 0.02-microsecond intervals.
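Those two figures explain why pipelining matters so much. The back-of-the-envelope arithmetic below is a sketch based only on the numbers in this article (200 microseconds for the first result, 0.02 microseconds between subsequent results, 10,000 queries in flight); it is not taken from the paper itself.

```python
# Illustrative arithmetic from the article's figures: once the pipeline is
# full, only the first query pays the full flash latency.

FIRST_RESULT_US = 200.0   # latency of the first result (microseconds)
INTERVAL_US = 0.02        # spacing between subsequent results
IN_FLIGHT = 10_000        # queries the pipeline can hold at once

def pipelined_total_us(n_queries: int) -> float:
    """Time to answer n_queries with a full pipeline."""
    return FIRST_RESULT_US + (n_queries - 1) * INTERVAL_US

def serial_total_us(n_queries: int) -> float:
    """Time if every query paid the full flash latency."""
    return n_queries * FIRST_RESULT_US

print(f"pipelined: {pipelined_total_us(IN_FLIGHT):.0f} us")  # about 400 us
print(f"serial:    {serial_total_us(IN_FLIGHT):.0f} us")     # 2,000,000 us
```

With the pipeline full, 10,000 queries finish in roughly 400 microseconds rather than the 2 seconds a strictly serial design would need.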
The system also uses a small amount of DRAM for each query. The DRAM is not used to make cache lookups faster, but to detect more efficiently when requested data has not yet been added to the cache.
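The idea can be sketched as keeping only the keys in a small in-DRAM index, so a cache miss is detected without ever touching the much slower flash. This is a minimal illustration of the general technique, not BlueCache's actual design; the class and method names are invented for the example.

```python
# Hypothetical sketch: a flash-backed cache with a DRAM-resident key index
# used solely to answer "is this in the cache at all?" quickly.

class FlashBackedCache:
    def __init__(self):
        self._dram_index = set()   # keys only: small DRAM footprint
        self._flash = {}           # dict stands in for the flash store

    def put(self, key, value):
        self._dram_index.add(key)
        self._flash[key] = value   # in hardware, a flash write

    def get(self, key):
        # A miss is resolved entirely in DRAM; no flash access needed.
        if key not in self._dram_index:
            return None
        return self._flash[key]    # only hits pay the flash latency

cache = FlashBackedCache()
cache.put("user:42", b"profile-bytes")
assert cache.get("user:42") == b"profile-bytes"
assert cache.get("user:99") is None   # miss detected without flash
```

Since misses never touch flash, the slow medium is reserved for lookups that will actually return data.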
Most cache systems use software to read from the cache, write to the cache or delete something from the cache. In BlueCache, these operations are performed by dedicated hardware circuits, developed by team member Shuotao Xu, an MIT graduate student in electrical engineering and first author on the paper, which improves both speed and energy usage.
The system also uses bandwidth as efficiently as possible by amassing enough queries to fill the channel's full capacity before sending them to memory.
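A simple way to picture this batching is an accumulator that holds incoming queries until they fill one full-bandwidth transfer, then dispatches them together. The sketch below is illustrative only; the 8 KB threshold is borrowed from the flash retrieval size mentioned in this article, and the class and parameter names are invented.

```python
# Hypothetical sketch: gather queries until a batch fills the transfer
# width, then dispatch the whole batch at once.

BATCH_BYTES = 8 * 1024   # illustrative full-bandwidth transfer size

class QueryBatcher:
    def __init__(self, dispatch):
        self._pending = []
        self._pending_bytes = 0
        self._dispatch = dispatch   # called with a full batch of queries

    def submit(self, query: bytes):
        self._pending.append(query)
        self._pending_bytes += len(query)
        if self._pending_bytes >= BATCH_BYTES:
            self.flush()

    def flush(self):
        # Send whatever has accumulated, even a partial batch.
        if self._pending:
            self._dispatch(self._pending)
            self._pending, self._pending_bytes = [], 0

batches = []
batcher = QueryBatcher(batches.append)
for _ in range(1024):
    batcher.submit(b"x" * 16)   # 1024 x 16 B = 16 KB -> two full batches
assert len(batches) == 2
```

Batching trades a little per-query waiting time for transfers that always use the memory channel at full width.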
Altogether, the techniques enable the system to read and write as efficiently as a DRAM-based system, as long as the data retrieved is at least 8 KB — the minimum amount of data flash can retrieve at once — while consuming only 4 percent as much power.
"The flash-based KV store architecture developed by Arvind and his MIT team resolves many of the issues that limit the ability of today's enterprise systems to harness the full potential of flash," said Vijay Balakrishnan, director of the Data Center Performance and Ecosystem program at Samsung Semiconductor's Memory Solutions Lab, according to an MIT news release. "The viability of this type of system extends beyond caching, since many data-intensive applications use a KV-based software stack, which the MIT team has proven can now be eliminated. By integrating programmable chips with flash and rewriting the software stack, they have demonstrated that a fully scalable, performance-enhancing storage technology, like the one described in the paper, can greatly improve upon prevailing architectures."
About the Author
Joshua Bolkan is contributing editor for Campus Technology, THE Journal and STEAM Universe. He can be reached at [email protected].