U California Researchers Release Beta for Big Data Management -- Campus Technology

By Dian Schaffhauser
06/10/13

A team of California universities has released a beta version of a system for managing big data along with more traditional forms of data. Researchers from the University of California campuses at Irvine, Riverside, and San Diego have banded together to create AsterixDB, a Java-based "big data management system" (BDMS).

The work began in 2009 with funding from the National Science Foundation and, eventually, the state of California and others. The goal was to create a set of new technologies for "ingesting, storing, managing, indexing, querying, and analyzing vast quantities of semi-structured information." The researchers pulled ideas from three areas — semi-structured data, parallel databases, and data-intensive computing — to create a "next generation" open source application that could run on large clusters of commodity computers.

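At the heart of the system, the AsterixDB engine operates on a "shared nothing" architecture: each computer in the cluster runs independently and is self-sufficient, relying on its own processor, memory, and storage rather than on resources shared with other machines.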
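The shared-nothing idea can be illustrated with a small Python sketch. This is not AsterixDB code or its API. The `Node`, `Cluster`, and `owner_of` names are hypothetical, and hash partitioning is assumed here only as one common way to spread records across independent machines. Each node keeps private storage, and a query runs on every node separately before the partial results are combined.

```python
import hashlib


class Node:
    """A hypothetical cluster node that owns its storage exclusively."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.records = {}  # private to this node; no other node reads it

    def put(self, key, record):
        self.records[key] = record

    def local_count(self, predicate):
        # Work is done against local data only.
        return sum(1 for r in self.records.values() if predicate(r))


def owner_of(key, num_nodes):
    """Pick the owning node with a stable hash (assumed partitioning scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


class Cluster:
    """Routes each record to one node and merges per-node query results."""

    def __init__(self, num_nodes):
        self.nodes = [Node(i) for i in range(num_nodes)]

    def insert(self, key, record):
        self.nodes[owner_of(key, len(self.nodes))].put(key, record)

    def count(self, predicate):
        # Each node answers independently; only the small counts are merged.
        return sum(node.local_count(predicate) for node in self.nodes)


cluster = Cluster(num_nodes=4)
for i in range(1000):
    cluster.insert(f"user-{i}", {"id": i, "lang": "en" if i % 2 == 0 else "es"})

total_en = cluster.count(lambda r: r["lang"] == "en")
print(total_en)  # 500
```

Because no node depends on another's memory or disk, adding capacity in a design like this means adding machines and repartitioning the data, which is why the approach suits large clusters of commodity computers.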

"We're providing a next-generation platform for storing, managing, coordinating, and making use of Big Data," said Michael Carey, a UC Irvine professor leading the work. Big data is, of course, the output generated moment by moment by numerous online sources, including blogs, micro-blogging sites, transactions, sensors, status updates, and other computing activities. The challenge of managing that data with traditional database management technologies is that it is generated at an ever-increasing rate, takes multiple forms, and isn't easily categorized for rapid analysis.

According to an overview posted on the AsterixDB site, the work has targeted a range of usage scenarios, from cases where information is well-typed and highly regular (and predictably so) to situations where the content is textual, irregular, and therefore "hard to anticipate up front." Technical work has focused on highly scalable data storage and indexing, query processing of semi-structured data on very large clusters, and the merging of techniques from parallel database processing and data-intensive computing.

"Big Data crosses a lot of domains, from government to health care to business," noted Carey. "It's hard for us to imagine an area where AsterixDB can't contribute."

Now the authors of the system are hoping to extend real-world testing by finding partners that can use the platform in various domains generating big data. Those environments may currently be using data management schemes based on Apache projects Hadoop, Pig, Hive, and HBase as well as MongoDB, among others.

"We're putting AsterixDB out in an unrestricted open-source form," Carey explained. "Users can do whatever they want with it, and we can learn from what they do and further improve our platform based on their needs."

About the Author

Dian Schaffhauser is a former senior contributing editor for 1105 Media's education publications THE Journal, Campus Technology and Spaces4Learning.
