If your customer base includes organisations as well as individuals – for example, academic institutions, hospitals, companies, public or government bodies – there are some special challenges when it comes to integrating data to achieve a single ‘master’ view. Here we list the key issues and describe some of the possible solutions.
1. Identification is difficult. With individuals, there is fairly reliable information, such as email address and personal name, to use in identifying contacts and linking separate data sources together. Organisation names, by contrast, often vary between different systems. For example, in one database we might have “University of Oxford” as the customer’s name, but in another it is abbreviated to “Oxford Univ.”, and in another we have “Bodleian Library, Oxford University”. A related problem is that other useful ‘key’ values – such as customer or subscriber IDs – may differ between systems for what is really the same organisation.
2. Organisations can be related to one another. Many of the organisations within your customer list will have affiliations. For example, university departments and faculties ‘belong’ to an academic institution, hospitals can be associated with universities, large companies often have a global HQ plus branch offices in countries around the world, and governments have departments. For just one organisation, a complex hierarchy of ‘parent’ and ‘child’ relationships can exist, and in this scenario it can be very important to understand exactly who you are talking to or selling to.
3. Individual contacts may have organisational affiliations. Part of achieving a single ‘master’ view is knowing whether any of your individual contacts are affiliated to larger organisations. But how do you reliably infer these connections when individuals have provided inconsistent information about their organisation, or none at all?
Some approaches that can help in addressing these issues:
1. Reference data. Having a central reference point for the identification and naming of organisations, and for defining the relationships between them, is clearly an important step. The Identify database from Ringgold is the largest and best-known reference data source of this kind (and we recently announced a strategic partnership with Ringgold for this reason). Other related initiatives are the WorldCat Registry, a web-based directory for libraries, and NISO’s I2 (Institutional Identifiers) Working Group, which aims to establish a standard for naming and identifying organisations.
2. Automated tools. Software that utilises data normalisation and frequency analysis techniques can be used to make inferences about organisations based on the ‘free text’ names which have been entered. It is also possible to connect individuals to their organisations based on their email domain, for example linking people with ‘ox.ac.uk’ emails to Oxford University. Note however that these approaches have their limitations, and must be combined with a manual/editorial review process to check and correct the resulting output. Machines alone cannot solve all of the problems!
3. Do it yourself. If you have the staff resources and the time, you may be able to address some of the challenges of organisation data in-house, using a combination of automatic analysis and manual checking. This is a labour-intensive approach (at least initially) and is probably best suited to cases where a relatively small number of organisations are involved. It also makes sense to take a top-down approach, tackling your largest and most important customers first.
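To make the automated techniques mentioned above more concrete, here is a minimal Python sketch of name normalisation, frequency analysis and email-domain matching. It is an illustration under stated assumptions, not a description of any particular product: the function names, the tiny abbreviation table and the domain lookup are all hypothetical, and a real system would draw them from reference data and pass its suggestions to an editorial review.

```python
import re
from collections import Counter, defaultdict

# Hypothetical lookup tables, for illustration only.
ABBREVIATIONS = {"univ": "university", "dept": "department", "inst": "institute"}
STOPWORDS = {"of", "the", "and", "for"}
DOMAIN_TO_ORG = {"ox.ac.uk": "University of Oxford"}


def normalise_name(name):
    """Reduce a free-text organisation name to an order-independent key."""
    tokens = re.findall(r"[a-z0-9]+", name.lower())
    tokens = [ABBREVIATIONS.get(t, t) for t in tokens]
    return " ".join(sorted(t for t in tokens if t not in STOPWORDS))


def group_variants(names):
    """Count how often each spelling occurs under each normalised key."""
    groups = defaultdict(Counter)
    for name in names:
        groups[normalise_name(name)][name] += 1
    return groups


def suggest_canonical(groups):
    """Propose the most frequent spelling per key, as a candidate for review."""
    return {key: counts.most_common(1)[0][0] for key, counts in groups.items()}


def org_from_email(email):
    """Match an email's domain, or any parent domain, against the lookup."""
    parts = email.rsplit("@", 1)[-1].lower().split(".")
    for i in range(len(parts)):
        candidate = ".".join(parts[i:])
        if candidate in DOMAIN_TO_ORG:
            return DOMAIN_TO_ORG[candidate]
    return None


names = ["University of Oxford", "Oxford Univ.", "University of Oxford",
         "Bodleian Library, Oxford University"]
print(suggest_canonical(group_variants(names)))
print(org_from_email("jane@chem.ox.ac.uk"))
```

Here “University of Oxford” and “Oxford Univ.” collapse to the same key, but “Bodleian Library, Oxford University” does not. Deciding that the library is a ‘child’ of the university is exactly the kind of judgement that needs reference data or manual review.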