Auto-clean your way to quality data

As a publisher, you’re not alone in suffering data quality headaches – in fact, any busy organization capturing large volumes of customer data is likely to experience similar issues. Without careful maintenance, inconsistencies and inaccuracies inevitably creep in over time, reducing the value of one of your organization’s most important assets.

The prospect of tackling the problem for thousands or even millions of records can be daunting. Few organizations have the resources or expertise to address data quality issues in-house, and a manual approach is simply not practical at that scale. But what if you could automatically correct common errors in your data on a regular basis, without any manual intervention?

The solution, we think, is to outsource the work to specialists who are able to put in place automated, intelligent processes to auto-clean your data.

Hands free

The value of having a set of automated rules in place to undertake tasks such as validating emails and correcting common typos should not be underestimated. Each email address corrected is a customer you can contact, and each fixed typo is a potential embarrassment avoided – you’d be surprised how many different versions of “University” exist in publisher customer data!

Clearly, not all problems can be fixed automatically – for example missing names or fictional but well-formatted email addresses – and some issues can be tricky to spot without human intervention. However, there is a long list of things that can be fixed – such as tidying case, splitting a full name into first/last name fields, and moving institution names from address fields into an institution field – which will significantly improve the overall quality and value of your data.
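To make the idea concrete, here is a minimal sketch of what a handful of such rules might look like in Python. It is illustrative only and is not how any particular product implements them: the field names (full_name, institution, email), the typo table and the loose email pattern are all assumptions made for the example.

```python
import re

# Hypothetical typo table; a real rule set would be far larger.
KNOWN_TYPOS = {
    "univeristy": "University",
    "universty": "University",
    "unversity": "University",
}

# Deliberately loose syntax check: it cannot tell whether a mailbox exists.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def fix_typos(text):
    """Replace known misspellings word by word, ignoring case."""
    return " ".join(KNOWN_TYPOS.get(word.lower(), word) for word in text.split())


def tidy_case(text):
    """Title-case values that are entirely upper or lower case."""
    if text.isupper() or text.islower():
        return text.title()
    return text


def split_full_name(full_name):
    """Split on the last space: everything before it is the first name."""
    parts = full_name.strip().rsplit(" ", 1)
    if len(parts) == 1:
        return parts[0], ""
    return parts[0], parts[1]


def clean_record(record):
    """Return a cleaned copy of the record plus the names of changed fields."""
    cleaned = dict(record)
    first, last = split_full_name(tidy_case(record.get("full_name", "")))
    cleaned["first_name"], cleaned["last_name"] = first, last
    cleaned["institution"] = fix_typos(record.get("institution", ""))
    cleaned["email_valid"] = bool(EMAIL_PATTERN.match(record.get("email", "")))
    changes = [key for key in cleaned if cleaned[key] != record.get(key)]
    return cleaned, changes
```

Even this toy version shows the limits mentioned above: the email check only catches malformed addresses, not fictional ones, and simple title-casing will mangle names such as "McDonald" when they arrive in capitals, which is why expert-devised rules matter.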

No waiting

Automatically cleaning your data also means you can roll out the benefits quickly, as rules can be applied to multiple data sources simultaneously. Compare this to relying on staff having the time or inclination to spot-check and correct issues in each source system separately.

MasterVision DQ

We now offer exactly this approach as part of our data quality module, MasterVision DQ. In addition to enabling you to audit and track the quality of your source data over time, the module now automatically cleans data from multiple sources and reports back on the changes made, so that corrections can be applied in bulk to your own source systems.
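As a rough illustration of what reporting back on changes across several sources could involve, the sketch below applies one rule to every record and logs each correction. This is not MasterVision DQ's actual interface: the function names, the source names and the report layout are all hypothetical.

```python
def normalise_email(field, value):
    """Example rule (an assumption): trim and lowercase the email field only."""
    if field == "email":
        return value.strip().lower()
    return value


def build_change_report(sources, rule):
    """Apply one rule to every record in every source and log each change.

    `sources` maps a source name to a dict of record id -> record (a dict of
    field -> value). The originals are left untouched; only the report is built.
    """
    report = []
    for source_name, records in sources.items():
        for record_id, record in records.items():
            for field, old_value in record.items():
                new_value = rule(field, old_value)
                if new_value != old_value:
                    report.append({
                        "source": source_name,
                        "record_id": record_id,
                        "field": field,
                        "old": old_value,
                        "new": new_value,
                    })
    return report


# Two hypothetical source systems sharing the same rule.
sources = {
    "crm": {"1": {"email": " Jane@Example.com "}},
    "subscriptions": {"7": {"email": "bob@example.com"}},
}
report = build_change_report(sources, normalise_email)
```

Because the report records the source, record and field for every correction, the same list could in principle be fed back to each source system as a bulk update.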

We think that this type of automated data cleansing, using established rules devised by experts with experience of your customer data, is key to improving the quality of your data and, ultimately, the productivity of those responsible for it. It doesn’t have to be expensive or overly complex, and – best of all – you don’t have to do it yourself!
