Ever since Man could count we have used data to make sense of the world around us, measuring phenomena through the correcting lens of statistics and facts.
The amount of data we’ve traditionally been able to collect and store has been comparatively small, and data became notoriously difficult to handle once we were swamped with too much of the stuff, which is why population censuses are still conducted only rarely. A traditional data analyst’s stock-in-trade was the classic database table, with its neat rows and columns, from which one could extrapolate meaningful insight, albeit from a comparatively small sample that was sorta representative.
Jump cut to 2015 and data’s “small”, “neat” and “sorta” have been supplanted by “big”, “messy” and “spot-on”. And as the era of the ‘datasphere’ dawns, the definition of data has had to be rewritten and a new set of (super-sized) rules issued. We are learning to play the ‘data’ game afresh, in the realization that it has made its way from back-room activity to front-page news.
As with many things born out of the Digital Age, we have once more been wrong-footed by something thought to be immune to re-invention. In this case, it’s data. Its rapid change in identity has brought with it new possibilities for producing, collecting and analysing information that are proving to be nothing short of transformational. And the game-changer in all this is volume. Hence the word “Big”. In the same way that ‘social’ has become shorthand for describing how we interact with each other, “Big” describes the relationship we now have with data.
To give Big some context, 90% of the data in the world today has been created in the last 2 years alone. That’s not just Big, that’s awesome!
Google alone processes 24 petabytes of data (a petabyte = 1,000,000 gigabytes) from 3 billion daily searches, which is thousands of times the quantity of all printed material stored in the U.S. Library of Congress. In fact, if all the world’s current stored data were turned into books, they would cover the entire surface of the United States some 52 layers thick.
Data has gone from being a comparatively small, static puddle of information to an ever-expanding ocean of facts within a very short space of time. We are also about to be hit by another tsunami-like wave of unstructured data as the “always on” smart world envisaged by the Internet of Things becomes more and more of a daily reality.
In Part 2 of “Why is Data Now called Big?” we’ll take a look at big data’s game-changing attributes of Volume, Velocity and Variety, and see how these characteristics are helping us to better predict the actions of our colleagues and customers.