With a topic like big data, the old “what’s in a name?” adage is worth taking seriously. To communicate effectively and avoid confusion, it’s important to use terminology precisely. In his article “Defining Big Data for the Public CIO,” Bob Gourley warns that if we’re not careful, the term “big data” will follow in the footsteps of terms like “SOA” and “cloud computing,” which lost their precise meaning as they entered mainstream use.
Enterprise IT professionals, for example, have a very specific definition for big data. Contrary to popular usage, moving large amounts of data, plotting data, or storing a lot of information does not make a company a “big data” company. Instead, Gourley suggests that Wikipedia’s definition gets it right: “Big Data implies the need for a strategy for dealing with large quantities of data. The term is also used to describe the new platform of tools required to successfully handle sense-making over large quantities of data, as in the Apache Hadoop Big Data Platform.” By this definition, strategy and sense-making platforms should be the key elements of any discussion about big data.
Gourley argues that what is really big about big data is the “distributed processing of large data sets over clusters of computers enabled by the Hadoop framework.” If he is right, the future-focused IT professional will want to build Hadoop skills to stay competitive in the big-data job market.
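For readers new to the idea, the split-process-combine pattern behind that quote can be sketched in a few lines of Python. This is only an illustration of the map/reduce style of processing that Hadoop popularized, not Hadoop’s actual API and not code from Gourley’s article. The function names (map_phase, shuffle, reduce_phase, word_count) and the sample text are hypothetical, and the “cluster” here is simply a list of text chunks handled one after another on a single machine.

```python
from collections import defaultdict

# Illustrative only: these function names are made up for this sketch
# and are not part of Hadoop. Each chunk stands in for the slice of
# data that one machine in a cluster would process.

def map_phase(chunk):
    """Emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]

def shuffle(mapped_chunks):
    """Group the emitted values by word across all chunks."""
    grouped = defaultdict(list)
    for pairs in mapped_chunks:
        for word, count in pairs:
            grouped[word].append(count)
    return grouped

def reduce_phase(grouped):
    """Combine each word's values into a single total."""
    return {word: sum(counts) for word, counts in grouped.items()}

def word_count(chunks):
    """Run map, shuffle, and reduce over a list of text chunks."""
    mapped = [map_phase(chunk) for chunk in chunks]
    return reduce_phase(shuffle(mapped))

chunks = ["big data needs strategy", "big clusters process big data"]
counts = word_count(chunks)
print(counts["big"], counts["data"])  # prints: 3 2
```

In a real cluster, the map step would run on many machines at once, each working on its own share of the data, and the framework would take care of moving the grouped results to the machines that combine them.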
What definitions have you seen for big data? Do they include the Apache Hadoop platform? For more on defining big data, read Gourley’s article at SmartDataCollective.com.