Q: We're a telecoms company looking to conduct regular acquisition mailing campaigns. However, our address management system, which uses the Postcode Address File (PAF) as its reference point, has identified more than one million records as being currently unmailable, because they do not match PAF. Can an improved address management system increase the size of our mailable universe?
Steve Clarke replies: The first thing to remember is that although PAF is a very important address dataset, it is by no means the be-all and end-all of good address management.
There are many sources from which prospect lists are drawn. These might be bought-in lists, your historical customer file, or a list of respondents to previous campaigns. These sources may include records that do not match PAF, such as flats or named houses, but mail sent to them will still reach its destination through the postal system.
Larger mailers that are losing mailable records because those records do not match the PAF standard could have a bureau build them a bespoke address management system. This would amalgamate all the company's available data sources, including PAF.
Software can then apply business rules through a model which automatically assesses the quality of an address (using house number, post town, postcode, and so on) and whether it is likely to be deliverable by a postal worker.
The assessment rules will vary depending on the various data sources that are being brought together to form the bespoke system.
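As a rough illustration of such business rules, the Python sketch below scores an address on a 0-100 scale and treats it as mailable above a cut-off. Everything in it is an assumption made for illustration: the field names (`premise`, `post_town`, `postcode`), the weights, the threshold of 70, the simplified postcode pattern and the `paf_keys` lookup set are hypothetical and would need tuning for each data source.

```python
import re

# Hypothetical rule weights (they sum to 100); a real system would tune
# these for each data source being brought together.
RULE_WEIGHTS = {
    "has_premise": 30,     # house number, house name or flat
    "has_post_town": 20,
    "valid_postcode": 30,
    "matches_paf": 20,
}

# Simplified check on the general shape of a UK postcode, not a full validator.
POSTCODE_SHAPE = re.compile(r"^[A-Z]{1,2}[0-9][A-Z0-9]? ?[0-9][A-Z]{2}$")


def score_address(record, paf_keys):
    """Return a 0-100 deliverability score for one address record.

    `paf_keys` is an assumed set of (PREMISE, POSTCODE) pairs taken from PAF.
    """
    premise = (record.get("premise") or "").strip()
    post_town = (record.get("post_town") or "").strip()
    postcode = (record.get("postcode") or "").strip().upper()
    checks = {
        "has_premise": bool(premise),
        "has_post_town": bool(post_town),
        "valid_postcode": bool(POSTCODE_SHAPE.match(postcode)),
        "matches_paf": (premise.upper(), postcode) in paf_keys,
    }
    return sum(RULE_WEIGHTS[name] for name, passed in checks.items() if passed)


def is_mailable(record, paf_keys, threshold=70):
    """Treat an address as mailable when its score reaches the threshold."""
    return score_address(record, paf_keys) >= threshold
```

With these illustrative weights, a flat that is absent from PAF but has a premise, a post town and a well-formed postcode scores 80 and stays in the mailable universe, while a record with only a house name scores 30 and does not.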
Q: I'm in charge of customer retention for a utility company. How can I identify which of my customers are most likely to defect, and use this information to help better plan advertising activity?
John Evans replies: Managing customer churn is a massive problem within the utility sector. However, today's data techniques are more than capable of predicting which customers are most at risk of leaving. First, depending on how much data you already hold on each customer, you may have to consider investing in some data enhancement to fill the gaps. Ultimately, it's about finding the key pieces of information about an individual that help to predict future behaviour.
Running a basic lifestyle profile is often a simple way of determining which pieces of information are important. It also produces extremely interesting pen portraits that describe your customers in plain, non-technical language.
While doing this, you should also identify your most valuable customers, so that any incentives you offer are in proportion to the return you can expect from them.
Finally, as utilities is an industry highly influenced by geography, spatial analysis would quickly establish whether there is any regional pattern in the losses you are suffering. If there is, the analysis would also provide a fantastic way of beginning the media planning process, helping to identify those areas that need a bit more support (whether at TV region, town or postcode sector level).
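One simple way to read such a profile is to index each attribute value among lost customers against the customer base as a whole, where 100 means "in line with the base". The Python sketch below shows the arithmetic only. The record layout, with a `churned` flag and a hypothetical attribute such as `tenure`, is an assumption made for illustration.

```python
from collections import Counter


def profile_index(customers, attribute):
    """Index each attribute value among churners against the whole base.

    100 means the value is as common among churners as in the base;
    200 means it is twice as common among churners.
    """
    base = Counter(c[attribute] for c in customers)
    churners = Counter(c[attribute] for c in customers if c["churned"])
    n_base = sum(base.values())
    n_churn = sum(churners.values())
    if n_churn == 0:
        return {}  # nothing to profile
    return {
        value: round(100 * (churners[value] / n_churn) / (count / n_base))
        for value, count in base.items()
    }
```

Attributes whose values index well above or below 100 are the ones most likely to be worth carrying forward into a predictive model.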
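A first pass at this kind of regional check can be as simple as a churn rate per area. The Python sketch below groups customers by postcode sector and ranks sectors by the share of customers lost. The record layout (`postcode`, `churned`) and the `min_customers` cut-off are assumptions for illustration, and a fuller spatial analysis would go well beyond a simple rate table.

```python
from collections import defaultdict


def postcode_sector(postcode):
    """'LS1 4AB' -> 'LS1 4': outward code plus first digit of the inward code."""
    compact = postcode.replace(" ", "").upper()
    return f"{compact[:-3]} {compact[-3]}"


def churn_by_area(customers, area_of=postcode_sector, min_customers=1):
    """Return (area, churn rate) pairs, highest churn rate first.

    `area_of` maps a postcode to an area label, so the same function can
    be reused at town or TV-region level with a different mapping.
    """
    totals = defaultdict(int)
    lost = defaultdict(int)
    for customer in customers:
        area = area_of(customer["postcode"])
        totals[area] += 1
        if customer["churned"]:
            lost[area] += 1
    rates = {
        area: lost[area] / totals[area]
        for area in totals
        if totals[area] >= min_customers  # ignore areas too small to judge
    }
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)
```

Areas at the top of the list are candidates for extra media support, provided each holds enough customers for the rate to be meaningful.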
Q: We have a large transactional database of approximately 1.6 million records collected from reader offers in our consumer magazines. Even though the database is relational, we have only collected name and address, product categories and amount spent. We recently ran a survey that drew more than 100,000 responses to a series of 'lifestyle' questions such as age, income and occupation. We now want to select data by key demographics. What is the best and most cost-effective way of obtaining this key demographic data for all the remaining records?
Paul McCarthy replies: This is a common problem where a database was set up to fulfil a particular requirement so that only certain fields of information were originally captured. Now the database is needed for another use but only some of the records have the data you want to select on. Also, not all of those records have all the data fields fully populated.
You could initiate a communications programme to collect the missing data, but even if you decide to do this in future, it will take considerable time and expense and will not give you data for immediate use.
Creating a fully populated database quickly will involve a statistical solution: replacing the missing values with plausible estimates of their true values, a process called (missing value) imputation.
Off-the-shelf solutions from lifestyle database companies allow you to tag data across to your file for those individuals or households that are common to both databases. These databases have been pre-modelled from the known data available, so you can append the variables you want.
The other route is for a statistician to create a bespoke solution for your database. He or she will use the known values in the 100,000 survey responders to impute the rest of the data. There are a number of imputation methods available to statisticians, ranging from the simple to the complex.
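At the simple end of that range, one approach is to fill each missing value with the most common answer given by survey responders who look similar on the fields every record already has. The Python sketch below does this by product category and spend band. It is only an illustration of one basic method, not the approach a statistician would necessarily choose, and the field names (`category`, `spent`, `age_band`) and spend cut-offs are hypothetical.

```python
from collections import Counter, defaultdict


def spend_band(amount):
    """Hypothetical spend bands; the cut-offs are for illustration only."""
    if amount < 25:
        return "low"
    if amount < 100:
        return "mid"
    return "high"


def impute_field(records, field):
    """Return copies of `records` with missing `field` values filled in.

    A missing value takes the most common observed value among records with
    the same product category and spend band; if no such record has a known
    value, the most common value overall is used instead.
    """
    def cell(record):
        return (record["category"], spend_band(record["spent"]))

    by_cell = defaultdict(Counter)
    overall = Counter()
    for record in records:
        value = record.get(field)
        if value is not None:
            by_cell[cell(record)][value] += 1
            overall[value] += 1
    if not overall:
        raise ValueError(f"no known values for {field!r} to impute from")
    fallback = overall.most_common(1)[0][0]

    completed = []
    for record in records:
        filled = dict(record)  # leave the original record untouched
        was_missing = filled.get(field) is None
        if was_missing:
            counts = by_cell.get(cell(record))
            filled[field] = counts.most_common(1)[0][0] if counts else fallback
        # Flag estimates so they are never mistaken for survey answers.
        filled[field + "_imputed"] = was_missing
        completed.append(filled)
    return completed
```

Keeping the `_imputed` flag alongside each value means selections can still distinguish real survey answers from estimates.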
The method that is adopted is a compromise between commercial constraints (time and money), theoretical rigour, and the realities and imperfections of actual data.
Given the size of your database and the fact that you have a recent survey of real data, this could be the most cost-effective solution.
- To have our experts solve your data problems, send your questions to firstname.lastname@example.org.
The questions we publish will be anonymised.
Steve Clarke has more than 20 years' experience in the DM industry. He is commercial manager at CDMS.
John Evans is head of intelligence at Clark McKay and Walpole. He previously worked for Colleagues Direct Marketing.
Paul McCarthy is managing director of McCarthy O'Connor and MOC proMISS, which he helped found 13 years ago.