Research Project Speeds Process for Converting Programs to Multicore
        
        
        
			- By Dian Schaffhauser
 - 11/10/16
 
		
        A team of researchers has come up with a technique that lets  people describe what their single-core programs are supposed to do in somewhat  general terms, and then it automatically generates a version of the program  optimized to run on multicore chips. The benefit is that the new version of the  program will generate the same results but faster.
"Bellmania," as the technique is called, is  described in a  paper recently presented at the ACM  SPLASH 2016 event, the Association of Computing Machinery's conference on  Systems, Programming, Languages and Applications: Software for Humanity. The  researchers come from MIT's Computer Science  and Artificial Intelligence Laboratory and Stony Brook University.
Multicore computing speeds up programming by splitting up  tasks and handing them off to multiple CPUs or cores for processing, storing  and reusing results of computations rather than having to re-compute them each  time they're needed. Traditionally, the splitting up is done by  "hand"; computer scientists manually figure out how to  "parallelize" programs.
Bellmania's basic concept is to break a given problem into  smaller and smaller subproblems with "recursive divide and conquer."  The researcher who needs to run the problem or algorithm describes the first  step of the process — how the problem is to be divided in a first go-around and  what procedures are to be applied to the resulting segments — and Bellmania  figures out how to continue the division in a way that optimizes memory usage.
At each level of the recursion Bellmania will perform an  operation on some segment of the matrix and then "outsource" the rest  of the work to the subroutines, which can run in parallel. Each subroutine will  in turn perform an operation on a segment of the data and hand off remaining  work to further subroutines.
"The goal is to arrange the memory accesses such that  when you read a cell [of the matrix], you do as much computation as you can  with it, so that you will not have to read it again later," said Shachar  Itzhaky, first author on the paper and a post-doctoral student in electrical  engineering and computer science at MIT, in an article about the project.
Bellmania's magic is in determining how best to manage that  handoff by figuring out how much data should be processed at each level and  which subroutines will be needed as the program progresses. That parallelization  process takes about 15 minutes per algorithm — which is much faster than any  human programmer could do the work. As the paper noted, "The resulting  implementations are 1.4–46 [times] faster than the original versions and within  2 [percent] to 60 percent of a high-performance hand-crafted  implementation."
        
        
        
        
        
        
        
        
        
        
        
        
            
        
        
                
                    About the Author
                    
                
                    
                    Dian Schaffhauser is a former senior contributing editor for 1105 Media's education publications THE Journal, Campus Technology and Spaces4Learning.