In his second article about running a data warehouse, Clive Humby looks
at what is worth storing and how best to use it.
Computer power has increased enormously over the past few years, while
the cost has tumbled. But that does not mean that it is wise, possible
or cost-effective to store everything. Managers responsible for
investing in information for the business face a number of key questions
- issues which will affect what data is stored and what technology is
used. The main questions are:
* Is all the data necessary and relevant?
* How will specific data be applied to the business?
* Can each study pull off only that data relevant to the current study?
Having a data warehouse will only be of benefit if the data in it is
converted into information, and then used to create knowledge for key
decision-makers. It is of little value having the most advanced data
warehouse in the industry if you don’t have the correct people at the
decision-making level, both to utilise the information generated and to
support the warehouse’s maintenance and development.
Often it is not the detailed transaction data that is of interest, but
patterns in transactions, such as an increasing balance over time or the
range of products purchased over a period of time.
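As a minimal sketch of what turning transactions into patterns might look like, the Python below reduces raw rows to two summary flags per customer: whether the balance is rising, and how many different products were bought. The record layout, the sample rows and the name summarise_patterns are all assumptions made for illustration, not part of any particular warehouse product.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transaction rows: (customer_id, date, product, balance_after)
transactions = [
    ("C1", date(1996, 1, 5), "current account", 120.0),
    ("C1", date(1996, 2, 5), "savings", 180.0),
    ("C1", date(1996, 3, 5), "savings", 260.0),
    ("C2", date(1996, 1, 9), "current account", 300.0),
    ("C2", date(1996, 3, 9), "current account", 150.0),
]


def summarise_patterns(rows):
    """Reduce raw transactions to one pattern summary per customer."""
    by_customer = defaultdict(list)
    for customer_id, when, product, balance in rows:
        by_customer[customer_id].append((when, product, balance))

    summary = {}
    for customer_id, items in by_customer.items():
        items.sort(key=lambda item: item[0])  # oldest transaction first
        balances = [balance for _, _, balance in items]
        summary[customer_id] = {
            # True only if there are at least two balances and each one
            # is higher than the one before it
            "balance_rising": len(balances) > 1
            and all(a < b for a, b in zip(balances, balances[1:])),
            # Number of distinct products bought in the period
            "product_range": len({product for _, product, _ in items}),
        }
    return summary


patterns = summarise_patterns(transactions)
```

Storing summaries like these alongside, or instead of, every individual transaction is one way to keep the warehouse focused on what the business will actually use.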
Before building your data warehouse, it is essential that your company
agrees at a senior level what the objectives and business needs of the
system will be. If this is not done, it is unlikely the systems people
will come up with what you want. Unfortunately, many organisations take
short cuts and end up with systems which are limited in scope and
difficult to build on.
A common problem involves merging marketing and IT expertise. The
marketing people know their stuff and there are plenty of computer
experts around, but very few people are well schooled in both. The
system has to be built to enable you to understand your customers.
People too often think of a database as just a list of names and
addresses. But when you add relevant information to it, a list becomes a
database. A marketing database is relational in style, and running one
is a lot more involved than keeping a mailing list clean. A database
should imply something active, if not interactive.
The value of a database is that it enables you to start calculating real
customer value, and identify how much your customers are giving you in
terms of margins, longevity and loyalty.
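One way to picture these three measures is the rough Python sketch below, which derives simple indicators of margin, longevity and loyalty from a customer summary record. The field names, the CustomerRecord type and the definition of loyalty as the share of periods with a purchase are assumptions chosen for illustration; a real business would agree its own definitions.

```python
from dataclasses import dataclass


@dataclass
class CustomerRecord:
    """Hypothetical per-customer summary held in the database."""
    customer_id: str
    total_margin: float         # margin earned from this customer to date
    years_active: float         # longevity: years since first purchase
    periods_with_purchase: int  # periods in which the customer bought
    periods_observed: int       # periods the customer has been on file


def value_indicators(record):
    """Return simple margin, longevity and loyalty indicators."""
    margin_per_year = (
        record.total_margin / record.years_active
        if record.years_active > 0 else 0.0
    )
    # Loyalty is assumed here to be the share of periods with a purchase
    loyalty = (
        record.periods_with_purchase / record.periods_observed
        if record.periods_observed > 0 else 0.0
    )
    return {
        "margin_per_year": margin_per_year,
        "years_active": record.years_active,
        "loyalty": loyalty,
    }


example = value_indicators(CustomerRecord("C1", 600.0, 3.0, 9, 12))
```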
Many companies do not go further than simple forms of analysis. To
understand what drives people to do the things they do, you have to
enter data-modelling territory and build predictive, rather than just
descriptive, models.
The ‘front end’ of your data warehouse must be easy
to use and intuitive for data exploration. The data and its integrity
are vital, but the interface and its ease of use will determine how
central the warehouse becomes to the organisation’s decision-making. Will
it be used all day every day or will it be jealously guarded by men in
The most obvious manifestation for the user will be the tools used to
access the data and generate the required information. These tools are
found, in various guises, in executive information systems (EIS), neural
networks, decision support systems (DSS), geographic information systems
(GIS), data mining tools, artificial intelligence (AI) and knowledge-
based systems. The key attributes required from them are:
* Intuitive interface
* Transparent access to data
* Support for a catalogue of information (metadata)
* Multiple query and analysis methods
* Multiple presentation styles for information.
So how do you start data mining with a data warehouse? You need first to
focus on the basics - cleaning up the raw ingredients, the data. This
will include: formatting names and addresses so all the files have the
same information in the same place, verification of addresses using Post
Office address files, and de-duplication of data within and across
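The formatting and de-duplication steps can be sketched in a few lines of Python. This is an illustration only: the record fields (name, postcode), the helper names and the exact-match rule are assumptions, and it does not attempt address verification against Post Office files. Production systems generally use fuzzier matching than this.

```python
import re


def normalise(text):
    """Lower-case, strip punctuation and collapse repeated whitespace."""
    text = re.sub(r"[^\w\s]", "", text.lower())
    return " ".join(text.split())


def match_key(record):
    """Build a comparison key from the normalised name and postcode."""
    name = normalise(record["name"])
    postcode = normalise(record["postcode"]).replace(" ", "")
    return (name, postcode)


def deduplicate(*files):
    """Merge any number of record lists, keeping the first record per key."""
    seen = {}
    for records in files:
        for record in records:
            seen.setdefault(match_key(record), record)
    return list(seen.values())


# Hypothetical input files holding the same customer in different formats
mailing_list = [
    {"name": "Mr. John  Smith", "postcode": "SW1A 1AA"},
    {"name": "mr john smith", "postcode": "sw1a1aa"},
]
order_file = [
    {"name": "MR JOHN SMITH", "postcode": "SW1A 1AA"},
    {"name": "Jane Doe", "postcode": "M1 1AE"},
]

unique_customers = deduplicate(mailing_list, order_file)
```

Because the key is built from normalised values, the three differently formatted entries for the same customer collapse into one record, while the original formatting of the first occurrence is preserved.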
The next stage is to consider adding external data, as long as it fits
the business objectives for the database. Consumer names can be matched
to lifestyle data, geodemographic data or census information. You
might also consider utilising available market research information.
All these sources can help to complete the picture of your customers,
but it would be dangerous to rely too heavily on them. You want to get
to know your individual customer. Both high- and low-value customers
live in the same areas and have similar lifestyles. Only your own data
can tell them apart.
The next question is: what sort of hardware or engine would be most
appropriate for analysing such vast amounts of data? Clearly, some
powerful analytical tools are required.
Massively Parallel Processing (MPP) is an appropriate hardware solution
where there are large volumes of data with great depth which need
processing in a limited time frame. But in some situations, alternatives
such as Symmetric Multiprocessing (SMP) or bit-mapped hashing indexes
will give high-performance solutions.
Mainframe-based systems are limited by their processing power. Very
effective relational databases can exist on mainframe systems, but
complex queries can take days to write and execute. However, mainframes
can be very effective in managing databases which have large volumes but
limited depth of data.
All types of decision support can be provided using ‘drill down’ or data
mining techniques. Data mining allows you to reveal layers of
information about markets (or subsets of markets) in ever increasing
detail, enabling customer or prospect profiles to be built and the
identification of segments onto which marketing activities can be
MPP is an excellent tool for data mining because, unlike conventional
PC-based or mainframe-based technology, it allows millions of rows of
information to be scanned in minutes.
So what is the payback from a properly designed warehouse with windows?
It allows retail chains, for example, to fine tune the trading strategy
of individual stores, gaining a vastly improved insight into the number
and types of lines stocked and the services provided, the best store
layout, and which offers should be directed at which customers and how
they should be communicated.
The potential for direct marketing also increases enormously, with a
profound effect on advertising strategies. In theory, heavy buyers of
particular products can be identified and targeted with attractive
offers. Big spenders can be offered more generous incentives, while
those who stop shopping at a store - perhaps having been tempted away by
a rival - can be targeted to win them back.
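The targeting rules described here could be expressed, in very simplified form, as the Python sketch below. The function name, the segment labels and both thresholds (spend of 1,000 and 12 weeks without a visit) are hypothetical values chosen for illustration; in practice they would come from analysis of your own data.

```python
def assign_segment(total_spend, weeks_since_last_visit,
                   big_spender_threshold=1000.0, lapsed_after_weeks=12):
    """Pick a targeting treatment from spend and recency.

    Both thresholds are illustrative assumptions, not recommended values.
    """
    # Customers who have stopped shopping are targeted to win them back,
    # whatever they used to spend
    if weeks_since_last_visit >= lapsed_after_weeks:
        return "win-back"
    # Active big spenders are offered more generous incentives
    if total_spend >= big_spender_threshold:
        return "generous incentive"
    return "standard offer"


# Hypothetical customers: (total spend, weeks since last visit)
customers = {
    "C1": (2500.0, 1),
    "C2": (400.0, 3),
    "C3": (1800.0, 20),
}

segments = {
    customer_id: assign_segment(spend, weeks)
    for customer_id, (spend, weeks) in customers.items()
}
```

Checking recency before spend is a deliberate choice in this sketch: a lapsed big spender is treated as a win-back case rather than being sent an offer designed for active customers.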
Targeting does not have to involve discounts. It could simply mean
making certain categories of customers aware of services or products
relevant to them, or inviting them to special events such as wine
A key aspect of the data warehouse is that the information is related to
customers, hence its importance in relationship marketing. Once the full
scope of the customers’ relationship with the organisation is
understood - such as what products or services they buy, how much and
how often - the business focus and marketing strategies become obvious.
With this comes confidence that the marketing activities being carried
out are the best possible for obtaining the highest sales over time.
Business objectives might include:
* Maximising the life-time value of your customers
* Using performance indicators as the basis for decision-making
* Identifying which customers might respond well to different campaigns
* Assessing the impact of competitor launches either nationally,
regionally or locally
* Monitoring sales force productivity
* Tracking actual, rather than claimed, behaviour
* Producing the most cost-effective solutions
* Responding rapidly to your competitors’ initiatives
* Maximising the opportunity of cross-selling to improve customer
* Using data warehouse information to drive positioning strategies.
Using the data, you can create virtuous circles by increasing spend on
successful sectors and reducing it or modifying campaigns in those which
are less successful.
Through segmentation, it is possible to answer the age-old question: how
much does the marketing department actually contribute?
Those who do not adopt these techniques not only cannot answer such
questions; they also face a considerable threat from those competitors which
Clive Humby is strategic director at Dunn Humby Associates. The first
article in this two-part series appeared in the Choosing and Using
Direct Marketing supplement, Marketing, September 19.