U California Researchers Release Beta for Big Data Management -- Campus Technology

By Dian Schaffhauser
06/10/13

A team of California universities has released a beta version of a system for managing big data along with more traditional forms of data. Researchers from the University of California campuses at Irvine, Riverside, and San Diego have banded together to create AsterixDB, a Java-based "big data management system" (BDMS).

The work began in 2009 with funding from the National Science Foundation and, eventually, the state of California and others. The goal was to create a set of new technologies for "ingesting, storing, managing, indexing, querying, and analyzing vast quantities of semi-structured information." The researchers pulled ideas from three areas — semi-structured data, parallel databases, and data-intensive computing — to create a "next generation" open source application that could run on large clusters of commodity computers.

At the heart of the system, the AsterixDB engine operates on a "shared nothing" architecture. Each computer in the cluster runs independently and is self-sufficient.
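
As a rough illustration of the shared-nothing idea, the sketch below routes each record to exactly one node by hashing its key, lets every node scan only its own private storage, and merges the partial answers at the end. It is not AsterixDB code and does not reflect its API: the `Node` and `Cluster` classes, the CRC32 routing rule, and the sample records are all hypothetical and chosen only to keep the example small.

```python
from dataclasses import dataclass, field
import zlib


@dataclass
class Node:
    """One hypothetical cluster node with its own private storage."""
    name: str
    storage: dict = field(default_factory=dict)

    def put(self, key, record):
        self.storage[key] = record

    def local_count(self, predicate):
        # Each node scans only its own data; no disk or memory is shared.
        return sum(1 for r in self.storage.values() if predicate(r))


class Cluster:
    """Hypothetical coordinator that partitions records across nodes."""

    def __init__(self, node_names):
        self.nodes = [Node(n) for n in node_names]

    def owner(self, key):
        # A stable hash, so the same key always routes to the same node.
        h = zlib.crc32(str(key).encode("utf-8"))
        return self.nodes[h % len(self.nodes)]

    def insert(self, key, record):
        self.owner(key).put(key, record)

    def count(self, predicate):
        # Scatter the query to every node, then merge the partial counts.
        return sum(node.local_count(predicate) for node in self.nodes)


cluster = Cluster(["node-a", "node-b", "node-c"])
for i in range(1000):
    cluster.insert(i, {"id": i, "kind": "blog" if i % 2 == 0 else "sensor"})

total_blogs = cluster.count(lambda r: r["kind"] == "blog")
```

Because no node depends on another node's storage, adding machines in a design like this mainly means re-partitioning the data rather than contending for a shared disk.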

"We're providing a next-generation platform for storing, managing, coordinating, and making use of Big Data," said Michael Carey, a UC Irvine professor leading the work. Big data is, of course, the output generated moment by moment by numerous online sources, including blogs, micro-blogging sites, transactions, sensors, status updates, and other computing activities. The challenge of managing that data with traditional database management technologies is that it is generated increasingly faster, takes multiple forms, and isn't easily categorized for rapid analysis.

According to an overview posted on the AsterixDB site, the work has targeted a range of usage scenarios, from cases where information is well-typed and highly regular (and predictably so) to situations where the content is textual, irregular, and therefore "hard to anticipate up front." Technical work has focused on highly scalable data storage and indexing, query processing of semi-structured data on very large clusters, and the merging of techniques from parallel database processing and data-intensive computing.

"Big Data crosses a lot of domains, from government to health care to business," noted Carey. "It's hard for us to imagine an area where AsterixDB can't contribute."

Now the authors of the system are hoping to extend real-world testing by finding partners that can use the platform in various domains generating big data. Those environments may currently be using data management schemes based on Apache projects Hadoop, Pig, Hive, and HBase as well as MongoDB, among others.

"We're putting AsterixDB out in an unrestricted open-source form," Carey explained. "Users can do whatever they want with it, and we can learn from what they do and further improve our platform based on their needs."

About the Author

Dian Schaffhauser is a former senior contributing editor for 1105 Media's education publications THE Journal, Campus Technology and Spaces4Learning.
