
Defining 'big data' depends on who's doing the defining
At a recent Big Data and High Performance Computing Summit in Boston hosted by Amazon Web Services, data scientist John Rauser mentioned a simple definition: Any amount of data that's too big to be handled by one computer.
The most hyped technology since the last one
"Big data has to be one of the most hyped technologies since, well, the last most hyped technology, and when that happens, definitions become muddled," says Jeffrey Breen of Atmosphere Technology Group.
The lack of a standard definition points to the immaturity of the market, says Dan Vesset, program vice president of IDC's business analytics division. He isn't quite buying the definition floated at the AWS event, however. "I'd like to see something that talks about the data instead of the infrastructure needed to process it," he says.
"It may not be all-inclusive, but I think by and large that's right," says Jeff Kelly, a big data analytics analyst at the Wikibon project. Part of the idea of big data is that it's so big that analyzing it must be spread across multiple machines, hence AWS's definition. "When you're hitting the limits of your resources, that's when data gets big," Kelly says.
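The split-the-work idea behind that definition can be illustrated with a partial-aggregation sketch: each "worker" summarizes only a chunk it could actually hold, and a final step merges the summaries. This is a minimal, single-process illustration in Python; the function names and the stand-in dataset are hypothetical, not any AWS or Hadoop API.

```python
# Minimal split-apply-combine sketch: each "worker" summarizes only its
# own chunk, and a final step merges the partial results -- the pattern
# used when data is too big for one computer.

def chunk(data, n_workers):
    """Partition the dataset into one slice per worker."""
    size = (len(data) + n_workers - 1) // n_workers
    return [data[i:i + size] for i in range(0, len(data), size)]

def partial_stats(slice_):
    """Each worker computes a small summary of its slice."""
    return {"count": len(slice_), "total": sum(slice_)}

def combine(partials):
    """Merge the worker summaries into one global answer (the mean)."""
    count = sum(p["count"] for p in partials)
    total = sum(p["total"] for p in partials)
    return total / count

readings = list(range(1, 1001))  # stand-in for a dataset too big for one machine
partials = [partial_stats(c) for c in chunk(readings, 4)]
print(combine(partials))  # 500.5
```

The point of the pattern is that `partial_stats` never needs the whole dataset in one place, which is why the analysis can scale out across machines.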
The cloud provides instant scalability
"The cloud provides instant scalability and elasticity and lets you focus on analytics instead of infrastructure," Amazon spokesperson Tera Randall wrote in an e-mail. "It enhances your ability and capability to ask interesting questions about your data and get rapid, meaningful answers." Rauser's big data definition is not an official AWS definition of the term, Randall says, but one used to describe the challenges businesses face in managing big data.
Big data analytics in the cloud is still an emerging market, though, Kelly says. Google, for instance, recently released BigQuery, the company's cloud-based data analytics tool. IBM, meanwhile, says information is "becoming the petroleum of the 21st century," fueling business decisions across a variety of industries.
Big data is already a big market, though, IDC says. IBM cites IDC estimates that enterprises will invest more than $120 billion by 2015 to capture the business impact of analytics, across hardware, software and services. The big data market is growing seven times faster than the overall IT and communications business, IDC says.
Role to play
"Both have a role to play," he says, and most large organizations will likely use each. Relational databases impose a structured approach to the data, suited to organizations with large amounts of data subject to compliance or security requirements, for instance. Large-scale, ad hoc collection of data is more unstructured and would take advantage of Hadoop computing clusters, he believes.
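The Hadoop-style processing described above boils down to the map/reduce pattern: a mapper runs independently over each unstructured record (so the work can be spread across a cluster), and a reducer merges the intermediate results. Below is a toy single-process sketch in Python; the sample records and function names are illustrative, not a real Hadoop job.

```python
from collections import Counter
from itertools import chain

# Toy map/reduce word count over unstructured text records. Each mapper
# call touches only one record, which is what lets a cluster run the
# map phase in parallel before a reduce phase merges the counts.

def mapper(record):
    """Emit (word, 1) pairs from one unstructured text record."""
    return [(word.lower(), 1) for word in record.split()]

def reducer(pairs):
    """Sum the emitted counts for each word."""
    counts = Counter()
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

logs = ["big data gets big", "data beats hype"]  # stand-in records
result = reducer(chain.from_iterable(mapper(r) for r in logs))
print(result["big"], result["data"])  # 2 2
```

A relational database would instead require loading these records into a predefined schema before querying, which is the structured/unstructured trade-off the analyst is pointing at.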
Network World staff writer Brandon Butler covers cloud computing and social collaboration. He can be reached at BButler@nww.com and found on Twitter at @BButlerNWW.
