One of the realities of working with data in large quantities and from different systems, as we do at DataSalon, is that it’s often not as complete or as accurate as one might wish. Fortunately we’ve got lots of ‘magic tricks’ up our sleeves to help our clients to address these data issues.
Of course, we can’t really conjure up data out of thin air, but we have developed lots of ways of filling in the gaps from the data that is there. These include:
- Using email addresses to add in country data (from the country code top-level domain) or affiliation (where organizational email addresses are supplied).
- Inferring a country field from the text in another field – usually the institution name, which may contain helpful clues to its location.
- Linking up data into a single customer view – for example, a user may enter very limited information when registering for alerts, but it might be possible to plug those gaps by matching their email address up against a fuller source such as sales data.
- Matching against a reference dataset – if customer data can be automatically matched (‘automatched’) against datasets such as Ringgold or ROR, then these can be used to fill in missing location information.
- Making unwieldy fields usable – this might be splitting a full name field into first and last names (while dealing with special cases such as double-barrelled names) or separating an address field into institution name, city, state, etc.
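To make the first of these techniques concrete, here is a minimal Python sketch of inferring a country from the country code top-level domain of an email address. It is an illustration only, not DataSalon's actual implementation: the function name `country_from_email` and the tiny `CCTLD_TO_COUNTRY` lookup table are assumptions made for this example, and a real lookup would need to cover every country code.

```python
# Hypothetical, deliberately tiny lookup table for illustration;
# a real one would cover all country code top-level domains.
CCTLD_TO_COUNTRY = {
    "uk": "United Kingdom",
    "de": "Germany",
    "fr": "France",
    "jp": "Japan",
}


def country_from_email(email):
    """Return a country inferred from the email's ccTLD, or None if unknown."""
    # Split on the last "@" so that only the domain part is inspected.
    _, sep, domain = email.strip().lower().rpartition("@")
    if not sep or "." not in domain:
        return None  # not a usable email address
    # The top-level domain is whatever follows the final dot.
    tld = domain.rsplit(".", 1)[1]
    # Generic domains such as .com or .org carry no country information.
    return CCTLD_TO_COUNTRY.get(tld)
```

Returning `None` for generic domains, rather than guessing, leaves the gap open for one of the other techniques to fill.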
These aren’t disparate tools but are employed in conjunction with one another, making them even more powerful. For example, adding missing country information will enable more accurate automatching, by removing ambiguity between institutions of the same name in different countries. And automatching can in turn correct errors in location fields, by using the curated data from Ringgold or ROR.
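The way a filled-in country removes ambiguity can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the `REFERENCE` records, their identifiers, and the `automatch` function are all invented for the example and do not reflect the real Ringgold or ROR data or our matching logic, which is considerably more sophisticated than an exact name comparison.

```python
# Hypothetical reference records standing in for a curated dataset.
REFERENCE = [
    {"id": "ref-1", "name": "Example College", "country": "Ireland"},
    {"id": "ref-2", "name": "Example College", "country": "United States"},
    {"id": "ref-3", "name": "Example Institute", "country": "France"},
]


def automatch(name, country=None):
    """Return the single matching reference record, or None if there is
    no match or the match is ambiguous."""
    wanted = name.strip().lower()
    candidates = [r for r in REFERENCE if r["name"].lower() == wanted]
    # A known country narrows the candidates to institutions in that country.
    if country is not None:
        candidates = [r for r in candidates if r["country"] == country]
    # Only accept an unambiguous result.
    return candidates[0] if len(candidates) == 1 else None
```

In this sketch, `automatch("Example College")` finds two candidates and so returns nothing, whereas supplying the country narrows the field to a single record, whose curated fields could then be used to correct or complete the customer's location data.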
These tools can significantly enhance the quality of a client’s data, and their set-up and use come as standard with both our MasterVision and our PaperStack services. Why not get in touch to find out more?