Every Big Data study we have read has struggled with defining exactly what constitutes 'Big Data'. Is a chain retailer that crunches petabytes of structured transactional data from its point-of-sale systems processing 'Big Data'? What about a consumer products company that tries to discern consumer sentiment from the unstructured comments of millions of Twitter followers and Facebook fans?
Trying to define what is and is not Big Data is a slippery slope. A major research report from the McKinsey Global Institute in 2011 addressed the problem of determining what is 'big'. It said the term Big Data is "intentionally subjective and incorporates a moving definition of how big a dataset needs to be to be considered 'Big Data.'"

Thomas H. Davenport, a Babson College professor and acknowledged Big Data expert who has written prolifically on business analytics, views today's Big Data activities and technologies as the evolution of technologies and techniques for analyzing computerized information that began in the mid-1950s. Davenport calls the period from this genesis to about 2009 'Analytics 1.0', and notes that it was characterized by a small number of data sources, structured data that came from within the company, and analytics activities that reported on prior events. He says the 'Analytics 2.0' period began in 2010 (when the term Big Data was coined) and differed from the prior era in several fundamental ways: companies used far more external data, the volumes of that data were much larger, and much of it was unstructured (not fitting neatly into the columns and rows of a database). Just three years later, he believes, companies are moving into what he calls 'Analytics 3.0' – a period in which they use huge amounts of structured and unstructured data, sourced both internally and externally, especially to generate predictive insights.
We concluded that deciding what is and is not 'Big Data' is highly subjective. As a result, we let the 1,217 survey respondents decide whether the initiatives they conducted in processing digitized data were 'Big Data' or not. We did, however, offer a definition against which they could judge whether they qualified to take the survey:

"The collection, processing and usage of large volumes of digitized data to improve how companies make important decisions and operate the business." Based on this definition, nearly half the survey takers (47%) dropped out. Our research focused on the 53% of companies that remained – 643 in four regions around the world – that said they had Big Data initiatives.