New MIT Tool Speeds Database Transactions
- By Dian Schaffhauser
- 09/20/12
Detecting a lag in that online transaction you're trying to finish? A project underway at MIT and Cornell University heralds performance improvements in database applications.
As the researchers noted in a recent paper, "Automatic Partitioning of Database Applications," "Database-backed applications are nearly ubiquitous in our daily lives." Those include transactions performed by a user doing online shopping, searching for an airline schedule, or just looking up names of colleagues in an online directory; in each of these transactions, applications make many small accesses to a database in order to feed results to the Web page being viewed by the user. But those multiple round trips between the server hosting the data and the server hosting the Web application cost both time (latency) and bandwidth. Each transaction can consume a few hundred milliseconds to execute application logic, retrieve query results, and generate HTML. Also, each query puts a lock on the data to retrieve stable results, which can limit throughput for highly concurrent systems.
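The cost of those round trips can be pictured with a toy model. The sketch below is not taken from the paper: `FakeDatabase`, its price table, and both checkout methods are hypothetical names invented for illustration, and every call into the fake database is simply counted as one network round trip.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: FakeDatabase is a hypothetical stand-in for a remote database,
// and each method call on it counts as one network round trip.
class RoundTripDemo {
    static class FakeDatabase {
        private final Map<Integer, Long> priceInCents = new HashMap<>();
        int roundTrips = 0;

        FakeDatabase() {
            priceInCents.put(1, 999L);
            priceInCents.put(2, 2450L);
            priceInCents.put(3, 525L);
        }

        // One small access: a single price lookup costs one round trip.
        long priceOf(int itemId) {
            roundTrips++;
            return priceInCents.get(itemId);
        }

        // The summing logic runs next to the data: one round trip for the whole cart.
        long totalOf(List<Integer> itemIds) {
            roundTrips++;
            long total = 0;
            for (int id : itemIds) {
                total += priceInCents.get(id);
            }
            return total;
        }
    }

    // The loop runs on the application server, so every item triggers a round trip.
    static long totalWithLoopOnAppServer(FakeDatabase db, List<Integer> cart) {
        long total = 0;
        for (int id : cart) {
            total += db.priceOf(id);
        }
        return total;
    }

    // The loop runs on the database server, so the cart is sent once.
    static long totalWithLoopOnDbServer(FakeDatabase db, List<Integer> cart) {
        return db.totalOf(cart);
    }

    public static void main(String[] args) {
        List<Integer> cart = Arrays.asList(1, 2, 3);
        FakeDatabase chatty = new FakeDatabase();
        FakeDatabase batched = new FakeDatabase();
        System.out.println(totalWithLoopOnAppServer(chatty, cart)
                + " cents in " + chatty.roundTrips + " round trips");
        System.out.println(totalWithLoopOnDbServer(batched, cart)
                + " cents in " + batched.roundTrips + " round trip");
    }
}
```

Both versions compute the same total; the only difference is where the loop runs, and therefore how many times the two servers have to talk to each other.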
A common technique for improving performance for these types of computing activities is to convert pieces of the application into stored procedures, which run on the server hosting the database. That conversion is difficult for several reasons, the paper reported.
To address the latency and resource consumption problems, the researchers have developed Pyxis, a new "program partitioner" for database applications. The technology takes source code written in Java, a language commonly used by Web developers, and figures out how to split up the code with the goal of minimizing both the number of round trips and the amount of data transferred between the two servers.
According to the research paper, Pyxis profiles the application and server loads, analyzes the code's dependencies, and "produces a partitioning that minimizes the number of control transfers as well as the amount of data sent during each transfer." In other words, parts of the application logic run on the database server instead.
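As a rough illustration of what choosing such a partitioning involves, the sketch below treats four statements of a small program as nodes in a graph and searches for the placement with the cheapest cut. It is a minimal sketch under stated assumptions, not the paper's algorithm: the statements, the edge weights, the pinning of some statements to one server, and the brute-force search are all made up for the example, and each weight lumps together whatever it costs to cross between servers at that point.

```java
import java.util.Arrays;

// Toy partitioner: brute force over placements, which is only viable for tiny graphs.
// This is a stand-in for illustration, not the solver Pyxis uses.
class PartitionSketch {
    static final int APP = 0;   // statement runs on the application server
    static final int DB = 1;    // statement runs on the database server
    static final int FREE = -1; // statement may run on either server

    // Each edge is {from, to, costIfSplit}; the cost is paid only when the
    // two statements end up on different servers.
    static int cutCost(int[] placement, int[][] edges) {
        int cost = 0;
        for (int[] e : edges) {
            if (placement[e[0]] != placement[e[1]]) {
                cost += e[2];
            }
        }
        return cost;
    }

    // Enumerate every placement that respects the pinned statements and keep the cheapest.
    static int[] cheapestPlacement(int[] pinned, int[][] edges) {
        int n = pinned.length;
        int[] best = null;
        int bestCost = Integer.MAX_VALUE;
        for (int mask = 0; mask < (1 << n); mask++) {
            int[] placement = new int[n];
            boolean valid = true;
            for (int i = 0; i < n; i++) {
                placement[i] = (mask >> i) & 1;
                if (pinned[i] != FREE && pinned[i] != placement[i]) {
                    valid = false;
                }
            }
            if (!valid) {
                continue;
            }
            int cost = cutCost(placement, edges);
            if (cost < bestCost) {
                bestCost = cost;
                best = placement;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical program: 0 = read request, 1 = run query,
        // 2 = filter rows, 3 = render HTML.
        int[] pinned = {APP, DB, FREE, APP};
        int[][] edges = {
            {0, 1, 10},   // request parameters flow to the query
            {1, 2, 5000}, // the full result set flows to the filter
            {2, 3, 200}   // the filtered rows flow to the renderer
        };
        int[] best = cheapestPlacement(pinned, edges);
        System.out.println(Arrays.toString(best) + " cost=" + cutCost(best, edges));
    }
}
```

In this toy graph the only free statement is the filter, and placing it on the database server avoids shipping the full result set across the network, which is the kind of decision the article describes.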
In benchmarking tests, Pyxis generated partitions that reduce latency by up to a factor of three and speed up the shuttle between servers by a factor of 1.7 compared with traditional non-partitioned implementations. It also shows performance comparable to that of a custom-written stored procedure implementation.
The researchers also pointed out that similar speedups otherwise require the use of special-purpose programming languages, whereas Pyxis works with a language already favored by Web developers.
The technique, shared recently during the 38th International Conference on Very Large Data Bases, was developed by first author Alvin Cheung, a graduate student in MIT's Department of Electrical Engineering and Computer Science (EECS); EECS professor Sam Madden; and Owen Arden and Andrew Myers of Cornell's Department of Computer Science.
Pyxis automatically partitions a program between application server and database server in a way that can be mathematically proven not to disrupt the operation of the program. It also monitors the CPU load on the database server, giving it more or less application logic to execute depending on its available capacity.
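The load-dependent behavior described above can be sketched as a simple switch between two pre-computed partitionings. Everything in this example is an assumption made for illustration: the article does not say how Pyxis measures load, how many partitionings it keeps, or what cutoff it uses, so the two-way choice and the 0.80 threshold are hypothetical.

```java
// Toy load-aware selector; the names and the threshold are illustrative assumptions.
class LoadAwareSwitch {
    enum Partitioning { MORE_LOGIC_ON_DB, MORE_LOGIC_ON_APP }

    // Hypothetical cutoff: above this CPU load the database server is treated as busy.
    static final double BUSY_THRESHOLD = 0.80;

    // dbCpuLoad is the fraction of database-server CPU in use, from 0.0 to 1.0.
    static Partitioning choose(double dbCpuLoad) {
        if (dbCpuLoad < 0.0 || dbCpuLoad > 1.0) {
            throw new IllegalArgumentException("load must be between 0.0 and 1.0");
        }
        // A busy database server gets less application logic; an idle one gets more.
        return dbCpuLoad >= BUSY_THRESHOLD
                ? Partitioning.MORE_LOGIC_ON_APP
                : Partitioning.MORE_LOGIC_ON_DB;
    }

    public static void main(String[] args) {
        System.out.println(choose(0.20));
        System.out.println(choose(0.95));
    }
}
```

The trade-off is that pushing logic to the database server saves round trips but spends that server's CPU, so a placement that is best on an idle server can hurt a loaded one.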
At the moment, Pyxis works with programs written in Java. But the researchers said they believe it could be adapted to other popular languages by revising only the code that translates programs into partition graphs, which represent information about the program's dependencies, such as the amount of data that one instruction in the code passes to the next.
Development of Pyxis is part of a larger project called StatusQuo, a new programming system for developing database applications in Java. With StatusQuo, developers no longer have to write anything in the database query language SQL, figure out which computation logic could reasonably be shifted to the database server, write stored procedures, or worry about security measures that differ between the database server and the application server, differences that can introduce vulnerabilities and lead to data breaches.
About the Author
Dian Schaffhauser is a former senior contributing editor for 1105 Media's education publications THE Journal, Campus Technology and Spaces4Learning.