Big Data wouldn't exist if we had perfect tools. It is generally defined as 'datasets whose size is beyond the ability of our tools to process them within acceptable timeframes'.
Yet our tools are imperfect, so Big Data has become the new selling-ground for technology vendors. It's a little ironic that the imperfection of their tools lets them sell yet more tools.
Nonetheless, Big Data is very real. Social media creates huge volumes of new content. Sensor systems capture ever more data about our traffic networks, our electricity grids, our industrial processes, and so on. By one widely quoted estimate, ninety per cent of the data ever created has been generated in the past two years.
Volume, velocity, variety
It's not just about size, however.
Big Data can be measured against three dimensions: volume, velocity and variety. Demands for faster response to events and queries increase the velocity with which we must handle data, and the variety of data sources and formats is constantly growing.
There are real benefits attached to this growth, such as the potential to segment customers more effectively, and hence tune products to their needs. It also gives us the ability to sense shifting customer sentiment and respond accordingly, increasing satisfaction and pre-empting competition.
Big Data also creates scope for new services. Rolls-Royce, for example, has remade itself - it no longer sells jet engines so much as hours of power - through the way in which it uses engine-management data to tune the operations of its customers' fleets of aircraft.
New tools, new strategies
So a host of new tools has appeared on the horizon. Hadoop is the poster child.
It allows us to distribute processing across many computers, thus making Big Data tractable. The supporting cast includes Mahout, Hive, Pig and ZooKeeper, each of which handles a specific aspect of this processing.
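To make the idea of distributing processing concrete, here is a minimal sketch of the map, shuffle and reduce pattern behind MapReduce, Hadoop's best-known processing model. It is an illustration only, not Hadoop's actual API: it runs on one machine, uses threads as stand-ins for the computers in a cluster, and every function name in it (map_chunk, shuffle, reduce_group, word_count) is made up for this example.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_chunk(chunk):
    """Map step: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(mapped_chunks):
    """Shuffle step: group all emitted values by their key (the word)."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_group(key, values):
    """Reduce step: combine all values for one key into a single total."""
    return key, sum(values)


def word_count(chunks, workers=4):
    # Assumption for illustration: each thread plays the role of one
    # machine in a cluster, working on its own chunk of the data.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_chunk, chunks))
    groups = shuffle(mapped)
    return dict(reduce_group(key, values) for key, values in groups.items())


if __name__ == "__main__":
    # Hypothetical input: two small chunks standing in for a large dataset.
    print(word_count(["big data is big", "data about data"]))
```

The point of the pattern is that each chunk can be mapped independently, so adding more machines lets you handle more data in the same time. Only the shuffle step needs to bring results together.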
Nonetheless, the real challenge in all this isn't about tools; it is to do with shifting our organisations to address the opportunities Big Data creates.
To handle Big Data, you'll need a bunch of technology - that's a given. With Cloud and the Hadoop stack, the basic tools are there. To succeed, you will also need to address:
- Skills - As well as business specialists (asking the right questions) and technologists (building systems), you'll need data scientists. These are the people who understand the algorithms for exploiting Big Data. They're in short supply.
- Attitude - The cycle for Big Data isn't 'plan then do', it's 'experiment, learn and evolve'. That's a research mindset, but one that is carefully linked to business objectives.
- Fragmentation - Most organisational data is spread across many disjointed databases. Relevant skills are often spread across multiple organisational units. You will need to integrate operations across these silos to get maximum value from Big Data.
- Valuation - You will get resources to do this only if you can demonstrate the value of the opportunity. Yet few organisations know how to value even the data they currently hold.
Graham Oakes is a technology consultant. He can be contacted via www.grahamoakes.co.uk or email@example.com.
His book Project Reviews, Assurance and Governance is published by Gower.