IBM Unveils Latest Tool for Real-Time Data Analysis
By Jeffrey Schwartz
IBM has taken the wraps off a technology developed by its research labs that it believes will raise the threshold for analysis of massive amounts of data in real time.
Under development for six years through the IBM Research organization, System S is designed for what the company calls "perpetual analytics."
According to IBM, System S can capture thousands of real-time data streams--such as continuously changing stock prices--that individuals can analyze. "In the broad sense, it's an event streaming processing architecture with caching and the ability to query data in motion," said Forrester Research analyst James Kobielus.
System S has many characteristics found in complex event processing tools offered by the likes of Aleri (which recently acquired Coral8), Progress Software, StreamBase, Tibco and others, Kobielus said. It should particularly appeal to customers already running IBM's software stack. Over time, it has the potential to integrate live streaming with IBM's data warehousing products and with the dashboard and analytics tools the company obtained through its Cognos acquisition last year.
While IBM has indicated such integration is planned, company officials declined to elaborate. In its initial release, System S can link to the company's DB2 database server, WebSphere Front Office and solidDB caching infrastructure and will run on Linux-based blade servers.
Key to System S is its underlying programming language, called SPADE (Stream Processing Application Declarative Engine). SPADE is designed to support the programming of streaming applications and to map their streams onto multiple hardware targets, including multicore platforms and specialized high-end systems such as IBM's own Cell processor.
Rather than a monolithic procedural programming language, or even a rules-based declarative model, System S offers an incremental step-by-step pipeline approach, said Nagui Halim, IBM's director of stream computing.
"It's almost like an assembly line but the difference here is you are not adding work at each point in the assembly line but you are rather examining the data and transforming it and passing it along to the next stage," Halim said.
Any skilled programmer can learn the SPADE language without a steep learning curve, he added. "We have lots of cases where people have picked it up quickly and become productive fairly rapidly," Halim said.
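The assembly-line idea Halim describes can be illustrated with a short sketch. This is plain Python, not SPADE syntax, and every name in it (Tick, only_symbol, moving_average, run_pipeline) is hypothetical, chosen only for illustration. Each stage examines the items flowing through it, transforms them, and passes them along to the next stage, here using stock prices as the stream.

```python
from collections import deque
from typing import Iterable, Iterator, List, NamedTuple


class Tick(NamedTuple):
    """One price update in the stream (hypothetical record type)."""
    symbol: str
    price: float


def only_symbol(ticks: Iterable[Tick], symbol: str) -> Iterator[Tick]:
    """Stage 1: examine each tick and pass along only the chosen symbol."""
    for tick in ticks:
        if tick.symbol == symbol:
            yield tick


def moving_average(ticks: Iterable[Tick], window: int) -> Iterator[Tick]:
    """Stage 2: replace each price with the mean of the last `window` prices."""
    recent = deque(maxlen=window)  # oldest price drops out automatically
    for tick in ticks:
        recent.append(tick.price)
        yield Tick(tick.symbol, sum(recent) / len(recent))


def run_pipeline(ticks: Iterable[Tick]) -> List[Tick]:
    """Chain the stages; each tick flows through one stage at a time."""
    return list(moving_average(only_symbol(ticks, "IBM"), window=2))


# A tiny in-memory sample standing in for a live feed.
sample = [
    Tick("IBM", 100.0),
    Tick("XYZ", 50.0),
    Tick("IBM", 102.0),
    Tick("IBM", 104.0),
]
results = run_pipeline(sample)
for tick in results:
    print(tick.symbol, tick.price)
```

Because the stages are generators, no stage waits for the whole data set; each item is handled as it arrives, which is the property that distinguishes this style from batch processing. A production stream engine would add distribution across machines, which this sketch does not attempt.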
IBM is now making trial code, along with adapters and programming and testing tools, available free of charge to developers looking to evaluate the software.
The company counts TD Securities as a customer that has deployed the software for a high-speed automated options trading system.
Among the other selected customers IBM has worked with are the Marine Institute of Ireland, which monitors large volumes of underwater acoustic data and processes it in real time, and the University of Ontario Institute of Technology (UOIT), which monitors biomedical data including heart rates and respiration rates.
Starting configurations will be priced at $400,000, with a developer system pegged at $100,000.