Research Project Speeds Process for Converting Programs to Multicore
- By Dian Schaffhauser
- 11/10/16
A team of researchers has come up with a technique that lets people describe what their single-core programs are supposed to do in somewhat general terms, and then it automatically generates a version of the program optimized to run on multicore chips. The benefit is that the new version of the program will generate the same results but faster.
"Bellmania," as the technique is called, is described in a paper recently presented at the ACM SPLASH 2016 event, the Association of Computing Machinery's conference on Systems, Programming, Languages and Applications: Software for Humanity. The researchers come from MIT's Computer Science and Artificial Intelligence Laboratory and Stony Brook University.
Multicore computing speeds up programming by splitting up tasks and handing them off to multiple CPUs or cores for processing, storing and reusing results of computations rather than having to re-compute them each time they're needed. Traditionally, the splitting up is done by "hand"; computer scientists manually figure out how to "parallelize" programs.
Bellmania's basic concept is to break a given problem into smaller and smaller subproblems with "recursive divide and conquer." The researcher who needs to run the problem or algorithm describes the first step of the process — how the problem is to be divided in a first go-around and what procedures are to be applied to the resulting segments — and Bellmania figures out how to continue the division in a way that optimizes memory usage.
At each level of the recursion Bellmania will perform an operation on some segment of the matrix and then "outsource" the rest of the work to the subroutines, which can run in parallel. Each subroutine will in turn perform an operation on a segment of the data and hand off remaining work to further subroutines.
"The goal is to arrange the memory accesses such that when you read a cell [of the matrix], you do as much computation as you can with it, so that you will not have to read it again later," said Shachar Itzhaky, first author on the paper and a post-doctoral student in electrical engineering and computer science at MIT, in an article about the project.
Bellmania's magic is in determining how best to manage that handoff by figuring out how much data should be processed at each level and which subroutines will be needed as the program progresses. That parallelization process takes about 15 minutes per algorithm — which is much faster than any human programmer could do the work. As the paper noted, "The resulting implementations are 1.4–46 [times] faster than the original versions and within 2 [percent] to 60 percent of a high-performance hand-crafted implementation."
About the Author
Dian Schaffhauser is a former senior contributing editor for 1105 Media's education publications THE Journal, Campus Technology and Spaces4Learning.