Just like the question about where a company should focus its Big Data investments, the question about which data it should collect depends on the company and the problems it wants to solve. Again, no one-size-fits-all approach exists.
Nonetheless, two inescapable trends in Big Data provide guidelines for the overall types of data that companies should be looking to gather:
- Unstructured data: Many great insights to be derived from Big Data are likely to come from such sources as digitized video and audio, sensor data, email, documents, and the unformed text that fills Facebook, Twitter and other social media websites. Digital data from sensors and other remote devices attached to products – a GE aircraft engine or a Xerox copier, for example – enable companies to track their products long after they have been delivered to customers. This provides a major opportunity for companies that can collect unstructured sensor and other data from their products.
Consider the airline industry. An estimated $284 billion is wasted annually through inefficient fuel management, unscheduled aircraft maintenance, flight delays and other issues. A company like GE (whose engines are bolted onto 53% of the world’s wide-bodied planes¹) that can help customers cut their fuel and maintenance costs will no doubt boost its market share and increase revenue.
- External data: This data exists outside a company’s own information systems. It’s in the hands of customers, third-party data providers, suppliers, social media sites and other sources. Getting a fuller picture of customers requires companies to collect external data.
Imagine a chain retailer that can get real-time data on customers who are in motion (on their cellphones) and within five miles of its stores (the telcos have this data); who are about to make a major purchase (such as replacing a refrigerator, the records of which are owned by the manufacturer); and who are most influenced by reviews in Consumer Reports (which are, of course, in Consumer Reports’ online archives). For that chain retailer to get such a customer to visit the nearest store and make a $2,000 purchase, it needs to tap external data held by the telco, the appliance maker, and Consumer Reports, and it must do so within minutes or perhaps even seconds, before the customer moves on to another store. The retailer’s internal data is not nearly enough to influence the consumer’s purchasing decision.
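The retailer scenario above amounts to matching one customer across three external sources. The following is a minimal sketch of that matching step, not a description of any real system: the record layout, the customer IDs, the sample values, and the 10-year appliance-age threshold (used here as a stand-in for "about to make a major purchase") are all hypothetical assumptions made for illustration.

```python
from dataclasses import dataclass


# Hypothetical record shape for location data a telco might supply.
@dataclass(frozen=True)
class TelcoPing:
    customer_id: str
    miles_from_store: float
    in_motion: bool


def find_targets(telco_pings, appliance_ages, review_readers,
                 max_miles=5.0, min_appliance_age_years=10):
    """Return IDs of customers who meet all three external-data conditions.

    telco_pings:     iterable of TelcoPing (assumed telco feed)
    appliance_ages:  dict of customer_id -> appliance age in years
                     (assumed manufacturer records)
    review_readers:  set of customer_ids influenced by product reviews
                     (assumed review-publisher data)
    """
    targets = []
    for ping in telco_pings:
        # Condition 1: in motion and within range of a store.
        nearby = ping.in_motion and ping.miles_from_store <= max_miles
        # Condition 2: appliance old enough that a replacement is plausible.
        age = appliance_ages.get(ping.customer_id)
        likely_buyer = age is not None and age >= min_appliance_age_years
        # Condition 3: known to be influenced by reviews.
        reads_reviews = ping.customer_id in review_readers
        if nearby and likely_buyer and reads_reviews:
            targets.append(ping.customer_id)
    return targets


# Made-up sample data, one small collection per external source.
pings = [
    TelcoPing("c1", 3.2, True),
    TelcoPing("c2", 4.0, True),
    TelcoPing("c3", 12.5, True),
    TelcoPing("c4", 1.0, False),
]
appliance_ages = {"c1": 12, "c2": 2, "c3": 15, "c4": 11}
review_readers = {"c1", "c3", "c4"}

targets = find_targets(pings, appliance_ages, review_readers)
print(targets)
```

None of the three inputs would exist in the retailer's own systems, which is the point of the example: drop any one source and the filter can no longer single out the customer worth contacting.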
The leaders that we identified from our survey respondents – the companies with the greatest expected ROI in 2012 on their Big Data initiatives – are far more likely to recognize the value of both unstructured and external data than are the laggards. (See Exhibits VII-8 and VII-9.) On the dimension of structure, 55% of leaders’ data is unstructured or semi-structured vs. 45% of laggards’ data. On the dimension of source, 37% of leaders’ data is external rather than internal; for laggards, that share is 26%.
Exhibit VII-8: How Leaders Differ From Laggards in Usage of Structured and Unstructured Data
Q8: Mean Estimated % of Structured, Unstructured and Semi-Structured Data, Across All of the Company’s Big Data initiatives, Leaders V Laggards – IT/Big Data
Exhibit VII-9: How Leaders Differ From Laggards in Usage of Internal and External Data
Q9: Mean Estimated Percentage of Data that Come from Internal or External Sources, Across All of the Company’s Big Data initiatives, Leaders V Laggards – IT/Big Data
These differences are also reflected in the importance that leaders and laggards place on internal vs. external and structured vs. unstructured data. (See Exhibit VII-10.)
Exhibit VII-10: The Importance of Types of Data for Leaders and Laggards