Resilience for beginners

For any business-critical system, resilience is important: the ability to maintain an acceptable level of service in the face of faults. When you hear about aircraft flying on after the failure of one engine, that’s resilience. For your business systems, if you wish to minimise down-time and disruption, then simply taking regular backups is not enough…

1. Backups. So backups alone aren’t enough, but backups are certainly essential: hard disks can and do fail without warning. Your data should be copied regularly, and stored somewhere away from your main system so they can’t be destroyed together. Even more importantly, you should test restoring your data – find out whether your backups really work correctly before you actually need them for real.

2. Hardware. Hardware is another potential risk: computer memory can develop faults, processors can overheat, and motherboards can fail. A good way to ensure a quick recovery is to buy spares of all the essential components in advance and hold them in reserve, ready for a quick fix. Even better, maintaining two or more complete systems means that you can quickly ‘fail over’ to a standby machine when necessary.

3. Power. Any single point of failure could leave the best laid resilience plans in tatters, and power cuts can and do happen. Many professional data centres will guarantee uninterrupted power through a combination of battery systems and diesel generators. For smaller systems, it’s worth bearing in mind that laptop computers have the advantage of built-in batteries, and are therefore more resilient than desktop computers when it comes to handling power cuts.

4. People. If all of your resilience plans have been masterminded by a single member of staff, will you be at risk every time he or she is away from the office or on holiday? Ensure that at least two people are fully up-to-speed with all of your resilience and backup plans for each system, and then co-ordinate their schedules and holiday plans so that at least one of them is readily available at any given time. You’ll be very glad to have a local expert on hand if things start to go wrong!
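As a rough illustration of the restore testing recommended under ‘Backups’ above, the sketch below compares a folder of original files with a test restore of the same folder, using SHA-256 checksums. The function names and folder layout are our own assumptions for the example, not part of any particular backup product.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(original_dir: Path, restored_dir: Path) -> list[str]:
    """List files that are missing from, or differ in, the restored copy."""
    problems = []
    for source in sorted(original_dir.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(original_dir)
        restored = restored_dir / relative
        if not restored.is_file():
            problems.append(f"missing: {relative.as_posix()}")
        elif file_digest(source) != file_digest(restored):
            problems.append(f"differs: {relative.as_posix()}")
    return problems
```

An empty list suggests the restore matches the originals. Note that this only works for files that have not changed since the backup was taken, so it is best run against a quiet or archived folder.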

Five internet predictions for 2010

In this month’s feature, we list our pick of the topics tipped to dominate online technology news in 2010. All of these predicted trends point to an interesting and innovative year ahead for web based services.

1. Cloud computing. With the growing trend for remote working, either from home, on the move, or when travelling abroad, the ability to access your documents, data, and applications wherever you are is becoming vital. ‘Cloud computing’ refers to the move towards internet based services which enable users to do everything online, in contrast to traditional desktop applications which may restrict usage to your office PC. This will be further reflected by the launch later this year of Google’s ‘Chrome’ operating system, an alternative to Windows on which the only installed software will be a web browser.

2. Tablets vs smart phones vs netbooks. It is rumoured that Apple will shortly announce a new touch screen ‘tablet’ device, likely to be named the ‘iTablet’ or ‘iSlate’, with Microsoft and others also planning similar products. Having larger screens than smart phones, but smaller and lighter than netbooks, tablets will aim to offer a new way of portably browsing the internet, and look set to cause a significant shake-up in both markets. Apple’s device will also reportedly offer eBook support, making it a potential competitor to existing eBook readers.

3. Location-aware apps. When using the internet on the move, it is possible for your whereabouts to be determined automatically via GPS (Global Positioning System), mobile phone networks, or wireless access points. Location-aware applications can make use of this information and respond accordingly, for example by showing your position on a map, or updating your status on a social network. It is predicted that the coming year will see some creative new uses for this technology, offering innovative services for users as well as opportunities for providers of targeted content.

4. Real time web. In the past, users have needed to check their favourite websites for updates, or wait for search engines to index content before it appears on result pages. The ‘real time web’ refers to a move towards immediate retrieval of information as soon as it is published, reflecting today’s demand for instant web content shown by the popularity of Twitter, Facebook and RSS feeds. Recently, Google has begun to integrate real time ‘tweets’, breaking news and other content into its search results, which may now give rise to the advent of real time SEO (Search Engine Optimisation).

5. Web 3.0. References to ‘Web 3.0’ as shorthand for the next phase of the web’s evolution are now beginning to appear more frequently in the media. There is currently no widely accepted definition, and Wikipedia has deleted its page for this reason, although it is often mentioned alongside ‘semantic publishing’ to describe the use of metadata to provide additional context and integration between web documents. This lack of consensus over its meaning is unlikely to prevent its increasing use over the coming year, which could yet see a clearer definition emerge.

What does CRM really mean?

CRM (‘Customer Relationship Management’) is a broad term which can mean different things depending on who you ask and the context in which it is used. Systems described as ‘CRM solutions’ typically reflect this by addressing one or more of the areas listed below rather than all of them. When evaluating this type of software, it is therefore important to understand what you mean by ‘CRM’, and exactly what you want to get out of it.

1. Data cleansing, merging and deduplication. The quality of the data you hold about your contacts is key to managing a successful relationship with them. Many CRM systems offer functionality to validate data (e.g. checking of email addresses and postcodes), clean it up (e.g. normalising countries to a consistent form) and discard true duplicates in order to enable it to be used more effectively. This can be tricky to get right, and is a common stumbling block for many big projects driven by choosing a ‘new database’ as the starting point, while underestimating the challenges of migrating large volumes of existing and often messy data.

2. Single customer view. Often seen as the ‘holy grail’ of CRM, a single customer view allows you to see everything known about each customer in one place, and therefore gain a complete picture of your contacts. The best systems that offer this will integrate data originating from many different systems (e.g. subscribers database, alerts signups) and allow complex joining rules, including ‘multiple keys’ (such as email addresses, ID numbers etc) and fuzzy matching techniques. Since existing systems will each be serving a specific purpose, it’s usually most appropriate to add a single view on top, rather than trying to replace all of the systems you already have.

3. Analysis and data mining. Many CRM systems offer data analysis and reporting tools, enabling segmentation of customers for receiving targeted promotions. For example, contacts who have recently shown interest in your business but have not yet completed a financial transaction (e.g. newsletter subscribers) could be identified as ‘new prospects’ and offered special incentives to buy. Cross-selling (‘if you like product X, you might like Y’), up-selling (e.g. converting a one-off buyer to a subscriber) and re-activation (targeting lapsed buyers) may also all be possible. This segmented approach will typically achieve better conversion rates than ‘big blast’ campaigns that fail to take different customer groups into account.

4. Sales force automation. This type of CRM system enables tracking of all stages in the sales process, along with a history of all meetings, discussions, and proposals relating to a customer. When implemented across a large company, it can often help to manage and co-ordinate the activities of sales teams, which may be spread across many different regions or countries. The main focus of sales force automation is often the central logging of notes about customers and prospects by sales reps. Therefore, such a system is often just one of the components which will feed into a more complete single customer view managed elsewhere.

5. Campaign management. CRM software that offers campaign management is likely to include the specialist function of bulk sending emails. This needs to be implemented carefully to avoid being classed as ‘spamming’, as well as responding correctly to ‘bounces’ and unsubscribe requests. In addition, the software will typically include functionality to plan campaigns, record message opens and track how many links are clicked, and provide statistics to enable you to compare the success of different campaigns, using metrics such as ROI (‘Return on Investment’).
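To make the data cleansing steps from point 1 a little more concrete, here is a minimal sketch of validating emails, normalising countries and discarding exact duplicates. The field names, the loose email pattern and the short country table are all assumptions for illustration; a real system would use far more thorough rules.

```python
import re

# Deliberately loose pattern: catches obvious typos only, not full validation.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

# Hypothetical lookup table; a real one would be far longer.
COUNTRY_ALIASES = {
    "uk": "United Kingdom",
    "u.k.": "United Kingdom",
    "great britain": "United Kingdom",
    "united kingdom": "United Kingdom",
    "usa": "United States",
    "u.s.a.": "United States",
    "united states": "United States",
}


def clean_contact(contact: dict) -> dict:
    """Tidy one contact: collapse spaces, check the email, normalise the country."""
    email = contact.get("email", "").strip().lower()
    country = contact.get("country", "").strip()
    return {
        "name": " ".join(contact.get("name", "").split()),
        "email": email if EMAIL_PATTERN.match(email) else None,
        "country": COUNTRY_ALIASES.get(country.lower(), country),
    }


def cleanse(contacts: list[dict]) -> list[dict]:
    """Clean every contact, then drop exact duplicates, keeping first-seen order."""
    seen = set()
    result = []
    for contact in contacts:
        cleaned = clean_contact(contact)
        fingerprint = (cleaned["name"].lower(), cleaned["email"], cleaned["country"])
        if fingerprint not in seen:
            seen.add(fingerprint)
            result.append(cleaned)
    return result
```

Only records that are identical after cleaning are discarded here; near-duplicates with different spellings need the fuzzier matching described under ‘Single customer view’.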

How secure are your web services?

Online security has again been in the news recently with reports that thousands of logins for webmail services such as Hotmail, Yahoo, and Gmail have been compromised and details posted online. Here we list some key areas that both web service providers and internet users should always keep in mind to help protect themselves on the web:

1. Security updates. It is essential to keep up-to-date with the latest security updates in order to remain protected against new online threats. Users should accept and install automatic updates when offered (e.g. Windows Updates) and ensure that they are running the latest version of their web browser, including any plug-ins it may use (e.g. Java, Flash). Similarly, web services should ensure that software installed on their servers is fully patched and updated, as many hacking attempts will seek to exploit known vulnerabilities in older software libraries.

2. Password security. Since passwords commonly provide access to sensitive data and otherwise restricted functionality, they should be handled with utmost care by both applications and their users. A secure web service should encourage users to choose a ‘strong’ password (e.g. containing a mixture of uppercase/lowercase letters and numbers), and can limit automated guessing attempts by temporarily locking accounts which have too many recent failed logins. In addition, applications should never store passwords in readable form: they should be stored as salted one-way hashes, so that unauthorised access to them would not in itself compromise user accounts. Users themselves must also play their part: even the most secure application cannot guard against usernames and passwords being written on post-it notes stuck to monitors. Similarly, the same credentials should never be used to access different websites, as this reduces the security of each login to that of the least secure service.

3. Insecure channels. Logins to a secure web service should occur over HTTPS (Hypertext Transfer Protocol Secure), which is indicated to users by the presence of a padlock icon in their browser. Where HTTP is used (i.e. no padlock visible), usernames and passwords are sent over the internet in plain text, meaning they are visible to anyone monitoring (or ‘sniffing’) network traffic. Email is an equally insecure channel, and should never be used to share confidential data or send usernames and passwords together. As a result, forgotten password functionality that prompts an email message containing reset details always needs to be implemented carefully. For any web application, it is key to note that information not readily visible to users is not necessarily secure: both hidden form fields and cookies may easily be accessed by a canny user, making them an inadvisable place to store passwords or other sensitive information.

4. Security holes. Web services should seek to protect themselves against a range of common security holes that may be used to gain unauthorised access to data or user accounts. A skilled user can often find ways of compromising security by tampering with the URL in the browser’s address bar, or by entering special values into forms, causing the application to output unauthorised information or deliver malicious content to other users. A secure web service will guard against this by correctly handling and sanitising all user-submitted values.

5. Regular checking. With the next new threat always on the horizon, online security should be seen as an ongoing area of focus for both users and applications, rather than an issue that can be reviewed once and then forgotten about. For users, we’d recommend weekly system scans to ensure that desktops and laptops remain free of viruses etc, always ensuring first that the relevant anti-virus software is fully up-to-date. For applications that are constantly evolving and incorporating new technologies, regular checking and reviews can help to prevent vulnerabilities from being introduced, and ensure that their users can continue to use them with confidence.
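The password handling points above (strength rules, temporary lockouts and unreadable storage) can be sketched roughly as follows. This is only an illustration under our own assumptions: the minimum length, iteration count, lockout threshold and class name are invented for the example, and the storage approach shown is salted one-way hashing with PBKDF2 from Python’s standard library, which is one common way of keeping stored passwords unreadable.

```python
import hashlib
import hmac
import os
import time
from typing import Optional

ITERATIONS = 200_000       # assumed work factor; tune for your hardware
MAX_FAILURES = 5           # assumed lockout threshold
LOCKOUT_SECONDS = 15 * 60  # assumed window for counting recent failures


def is_strong(password: str) -> bool:
    """Mixed-case-plus-digit rule, with an assumed minimum length of 8."""
    return (
        len(password) >= 8
        and any(c.islower() for c in password)
        and any(c.isupper() for c in password)
        and any(c.isdigit() for c in password)
    )


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return a fresh random salt and the salted hash to store alongside it."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def check_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-hash the attempt with the stored salt and compare in constant time."""
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(digest, expected)


class LoginThrottle:
    """Track recent failed logins per account, in memory only."""

    def __init__(self) -> None:
        self._failures: dict[str, list[float]] = {}

    def record_failure(self, username: str, now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        self._failures.setdefault(username, []).append(now)

    def is_locked(self, username: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        recent = [t for t in self._failures.get(username, []) if now - t < LOCKOUT_SECONDS]
        self._failures[username] = recent  # forget failures outside the window
        return len(recent) >= MAX_FAILURES
```

In this sketch an account unlocks by itself once its failures fall outside the window, which limits automated guessing without permanently shutting out a genuine user.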

Using your usage data wisely

There’s plenty of valuable insight to be gleaned from the usage stats for your online content. How people are accessing (or sometimes failing to access) your online materials can provide a lot of important intelligence about your customers:

1. The big picture. Usage data really starts to come to life when it can be presented alongside other key details and metrics for a given customer, such as a summary of their current subscriptions, revenue totals, subject interests, related contacts and so on. Immediately, this enables you to see where usage is high or low in relation to other key customer variables, and so to create targeted messages for selling more subscriptions and increasing usage.

2. Upward and downward trends. If historical usage as well as current data can be included, you can also derive comparative metrics: for example, how does current usage compare with the same time last year? Also, a month-on-month comparison might reveal consistent usage growth or decline, both of significant interest to sales and marketing teams. With that sort of trending data easily to hand, it becomes simple to identify those ‘at risk’ customers in need of attention.

3. Cost per download. Deriving a ‘cost per download’ figure can be as simple as dividing cost by usage for a given period of time. This can be extremely valuable when seeking to emphasise the value of your content, for example as part of a renewal campaign. Again, the benefit here lies in proactively protecting your existing revenue. If usage data can be linked to payment information, then it becomes simple to calculate this metric and include it in your renewal messages.

4. Turnaways. The term ‘turnaways’ refers to customers who are refused access to your online content (e.g. due to the lack of a valid subscription). You might not currently be making use of this data, but if it can be tied to specific people – through login details or via IP address matching – then it becomes an extremely powerful sales tool. DataSalon is already working with IP data from Ringgold to turn previously anonymous turnaways into well-qualified lists of ‘hot sales prospects’.
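The metrics described above are simple to compute once the data is in one place. The sketch below shows cost per download, a period-on-period percentage change, and a basic form of IP address matching for turnaways. The organisation names and IP ranges are invented (they use reserved documentation ranges), and the functions are our own illustration rather than a description of any particular product.

```python
import ipaddress
from typing import Optional


def cost_per_download(cost: float, downloads: int) -> Optional[float]:
    """Cost divided by usage for the same period; None when there was no usage."""
    return cost / downloads if downloads else None


def percent_change(current: int, previous: int) -> Optional[float]:
    """Percentage change between two periods; None when there is no baseline."""
    if previous == 0:
        return None
    return (current - previous) * 100 / previous


# Hypothetical IP ranges registered to known organisations.
ORGANISATION_RANGES = {
    "Example University": ["192.0.2.0/24"],
    "Example Hospital": ["198.51.100.0/25"],
}


def organisation_for_ip(address: str) -> Optional[str]:
    """Return the organisation whose registered range contains the address."""
    ip = ipaddress.ip_address(address)
    for organisation, ranges in ORGANISATION_RANGES.items():
        if any(ip in ipaddress.ip_network(block) for block in ranges):
            return organisation
    return None
```

For example, a subscription costing 5,000 with 2,000 downloads in the period works out at 2.50 per download, and usage falling from 100 to 80 downloads is a change of -20%, which might be enough to flag a customer as ‘at risk’.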

Five reasons why hierarchies are hard

Understanding hierarchical relationships within data, such as those between individuals and organisations, can be key to gaining a complete picture of your contacts, but enabling users to view and explore these relationships in a user-friendly way is not easy. Here we list some of the main challenges in creating and representing customer hierarchies:

1. Categorisation. It may seem obvious, but when displaying a customer hierarchy it is vital to be able to show what each member of the hierarchy ‘is’: i.e. a person, an organisation, or a consortium (group of organisations). In practice, this may not be straightforward if a single, reliable ‘Customer Type’ field is not available within the underlying data. If a record within the hierarchy includes both a name (e.g. “John Smith”), and an organisation (e.g. “University of Oxford”), should this be seen to represent the entire university (where John Smith is the key contact) or only John Smith personally (who happens to be affiliated to that university)? In such cases a rule based approach may be needed in order to make a ‘best guess’.

2. Size and complexity. Hierarchies can often be large, with universities including separate faculties, colleges and libraries, and large healthcare authorities containing several subsidiary hospitals with their own internal departments. In addition, relationships within them may be complex, meaning that each person or organisation may not fit neatly into a single place within the hierarchy. For example, an individual may be affiliated to more than one organisation (having completed a degree at Oxford University and a PhD at Cambridge), organisations may belong to more than one consortium, and even the make-up of any given consortium may vary over time.

3. Terminology. When referring to hierarchical data, programmers commonly describe members of a hierarchy as ‘nodes’ and use family tree terminology to define the relationships between them, such as ‘parents’, ‘children’, ‘siblings’, ‘ancestors’, and ‘descendants’. This convention may not always be familiar to non-technical users, and is often used inconsistently – for example, does ‘children’ refer just to the level below the current point in the hierarchy, or to all of the levels below that too? It may therefore be preferable to avoid displaying these terms to users wherever possible in order to prevent confusion.

4. Inheritance. Extending the family tree concept, the idea of ‘inheritance’ can be applied to define which properties should be passed downwards from higher to lower levels in the hierarchy. For example, it is likely that an individual will have access rights to content if the university they belong to has purchased a relevant subscription, and this should be reflected when representing their data. To create an accurate picture of each contact, it is therefore essential to understand which properties within the data can be inherited downwards and in which contexts e.g. from consortia to organisations, from organisations to subsidiary organisations, and onward to individuals.

5. Visualisation. Large and complex hierarchies are unlikely to fit into a single screen, and a successful visualisation should accommodate this. In contrast to a simple chart or graph for representing two dimensional data, a hierarchy is likely to require interactive elements such as scrolling, expanding and collapsing, and/or zooming in order to be viewed in its entirety. The most common representation is a Tree View which can either be vertical, as seen in Windows when browsing files and folders, or horizontal, such as in a family tree diagram or organisational chart. Other visualisation techniques include Tree Maps which show a hierarchy as a series of nested rectangles, and Hyperbolic Trees which display the hierarchy through a ‘fish eye’ view, although these are likely to be less familiar to users, and therefore harder to understand and use.
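To illustrate the terminology and inheritance points above, here is a small sketch of a hierarchy in which a member can have more than one parent, ‘children’ means one level down only, ‘descendants’ means every level below, and subscriptions are inherited downwards. The class and field names are assumptions made for the example, and it assumes the hierarchy contains no loops.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare nodes by identity, not by field values
class Node:
    name: str
    kind: str  # "person", "organisation" or "consortium"
    subscriptions: set[str] = field(default_factory=set)
    parents: list["Node"] = field(default_factory=list)
    children: list["Node"] = field(default_factory=list)

    def add_child(self, child: "Node") -> None:
        """Link a child below this node; a child may have several parents."""
        self.children.append(child)
        child.parents.append(self)

    def descendants(self) -> list["Node"]:
        """Every node below this one, at any depth (children are one level only)."""
        found, stack = [], list(self.children)
        while stack:
            node = stack.pop()
            if node not in found:
                found.append(node)
                stack.extend(node.children)
        return found

    def effective_subscriptions(self) -> set[str]:
        """Own subscriptions plus everything inherited from all ancestors."""
        inherited = set(self.subscriptions)
        for parent in self.parents:
            inherited |= parent.effective_subscriptions()
        return inherited
```

With a consortium above a university, a library below the university and a person below the library, the person’s effective subscriptions would include anything purchased by the university or the consortium, even though the person’s own record holds none.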

Tips for effective segmentation

Segmentation is an important marketing technique which involves targeting different groups of customers with different messages and offers. This is in contrast to simple ‘big blast’ campaigns, where the same message is sent out to every available contact. Make the most of segmentation with these simple tips:

1. Define your objectives. It’s important to start with a list of clear objectives, so begin by asking yourself what different types of customer behaviour you’d like to create more of: new signups, renewals, buying more products, buying different products, etc.? Ideally you’ll end up with a list of around 4-6 main objectives, each with a clear ‘action’ you’d like to persuade the customer to take.

2. Identify your variables. There’s little point deciding that you’d love to target customers with a high income, an interest in science, and with blue eyes, if you just don’t have that kind of information available. So, make a list of what’s known about your contacts, for example location (country, postal code), purchases (products, prices), recency (dates of signups, purchases), and so on. This will establish the key ‘variables’ you have to play with when defining customer segments.

3. Create your segments. Using these variables, you can now try to define an appropriate group of customers as targets for each of your objectives. For example, if you have an objective to sell more science materials to existing customers in the UK, you might create a segment of customers in the UK who are interested in science and who have previously bought from you. It’s important to create only a small number of segments (aim for fewer than 10), and to ensure they are not too small (not worth targeting), and also distinct enough to warrant a specific set of campaigns and messages.

4. Start communicating. Having identified your segments and objectives, you’re now in a position to begin sending appropriate, targeted messages to each group. (Note that this is more work than a single ‘big blast’ campaign, as there are now multiple offers and messages to manage.) It’s important to see each segment as an ongoing relationship, and establish a consistent style and tone of voice for each group over a series of messages, perhaps even assigning a different marketing manager to each segment.

5. Track the results. Recognise that you’re unlikely to get everything right first time, and try to see the entire process as a ‘work in progress’. Ensure proper tracking is in place, and monitor which segments and campaigns produce good results (and which don’t). Over time, you can then ensure that lessons are learnt, bringing improvements to the objectives, segment definitions, and communication styles you use.
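As a rough sketch of how segments like these might be defined in practice, each segment below is a simple rule over the variables held for each customer. The field names, segment names and minimum size are all hypothetical; the first rule mirrors the UK science example from tip 3.

```python
from typing import Callable

Customer = dict
Rule = Callable[[Customer], bool]

# Hypothetical segment definitions built from the variables actually held.
SEGMENTS: dict[str, Rule] = {
    "uk_science_buyers": lambda c: (
        c["country"] == "UK" and "science" in c["interests"] and c["purchases"] > 0
    ),
    "new_prospects": lambda c: c["purchases"] == 0,
}

MIN_SEGMENT_SIZE = 2  # assumed threshold below which a segment is not worth targeting


def build_segments(customers: list[Customer]) -> dict[str, list[Customer]]:
    """Assign each customer to the first matching segment, so segments stay distinct."""
    segments: dict[str, list[Customer]] = {name: [] for name in SEGMENTS}
    for customer in customers:
        for name, rule in SEGMENTS.items():
            if rule(customer):
                segments[name].append(customer)
                break
    return segments


def too_small(segments: dict[str, list[Customer]]) -> list[str]:
    """Names of segments with fewer members than MIN_SEGMENT_SIZE."""
    return [name for name, members in segments.items() if len(members) < MIN_SEGMENT_SIZE]
```

Customers who match no rule are simply left out of every segment, which is a useful prompt to check whether an objective or variable is missing.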

Organising the organisations in your customer data

If your customer base includes organisations as well as individuals – for example, academic institutions, hospitals, companies, public or government bodies – there are some special challenges when it comes to integrating data to achieve a single ‘master’ view. Here we list the key issues and describe some of the possible solutions.

1. Identification is difficult. Unlike individuals, where there is fairly reliable information such as email address and personal name to use in identifying contacts and linking separate data sources together, organisation names often vary between different systems. For example, in one database we might have “University of Oxford” as the customer’s name, but in another it is abbreviated to “Oxford Univ.”, and in another we have “Bodleian Library, Oxford University”. A related problem is that other useful ‘key’ values – such as customer or subscriber IDs – may sometimes be different for what is really the same organisation.

2. Organisations can be related to one another. Many of the organisations within your customer list will have affiliations. For example, university departments and faculties ‘belong’ to an academic institution, hospitals can be associated with universities, large companies often have a global HQ plus branch offices in countries around the world, governments have departments etc. For just one organisation, a complex hierarchy of ‘parent’ and ‘child’ relationships can exist, and it may of course be very important to understand exactly who you are talking to (or selling to) in this scenario.

3. Individual contacts may have organisational affiliations. Part of achieving a single ‘master’ view includes knowing if you have any individual contacts who are affiliated to larger organisations. But how do you reliably infer these connections if individuals may have provided inconsistent (or entirely missing) information about their organisation?

Some approaches that can help in addressing these issues:

1. Reference data. Having a central reference point for the identification and naming of organisations, and for defining the relationships between them, is clearly an important step. The Identify database from Ringgold is the largest and most well-known reference data source of this kind (and we recently announced a strategic partnership with Ringgold for this reason). Other related initiatives are the WorldCat Registry web-based directory for libraries, and NISO’s I2 (Institutional Identifiers) Working Group, which aims to establish a standard for naming and identifying organisations.

2. Automated tools. Software that utilises data normalisation and frequency analysis techniques can be used to make inferences about organisations based on the ‘free text’ names which have been entered. It is also possible to connect individuals to their organisations based on their email domain, for example linking people with ‘ox.ac.uk’ emails to Oxford University. Note however that these approaches have their limitations, and must be combined with a manual/editorial review process to check and correct the resulting output. Machines alone cannot solve all of the problems!

3. Do it yourself. If you have the staff resources and the time, you may be able to address some of the challenges of organisation data in-house, using a combination of automatic analysis and manual checking. This is a labour-intensive approach (at least initially) and is probably best suited to cases where a relatively small number of organisations are involved. It also makes sense to take a top-down approach, tackling your largest and most important customers first.
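The automated approaches mentioned above can be sketched in a few lines. The example below normalises free text organisation names so that simple variations match, and links individuals to an organisation through their email domain. The abbreviation table, stop words and domain lookup are tiny invented samples, not a real reference data source.

```python
import re
from typing import Optional

# Hypothetical abbreviation table and domain lookup, for illustration only.
ABBREVIATIONS = {"univ": "university", "dept": "department", "inst": "institute"}
DOMAIN_TO_ORGANISATION = {"ox.ac.uk": "University of Oxford"}
STOP_WORDS = {"of", "the"}


def normalise_name(name: str) -> str:
    """Reduce a free text organisation name to a sorted set of key words."""
    words = re.findall(r"[a-z]+", name.lower())
    expanded = [ABBREVIATIONS.get(word, word) for word in words]
    return " ".join(sorted(set(expanded) - STOP_WORDS))


def organisation_from_email(email: str) -> Optional[str]:
    """Match the email's domain, or any parent domain, against the lookup."""
    domain = email.rsplit("@", 1)[-1].lower()
    labels = domain.split(".")
    for start in range(len(labels) - 1):  # stop before the bare top-level domain
        candidate = ".".join(labels[start:])
        if candidate in DOMAIN_TO_ORGANISATION:
            return DOMAIN_TO_ORGANISATION[candidate]
    return None
```

Here “University of Oxford” and “Oxford Univ.” both normalise to the same value, but “Bodleian Library, Oxford University” does not, which is exactly the kind of case that still needs a manual review.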

Five common browser challenges

Hosted web-based services can provide many benefits, usually requiring no hardware or software installation, and enabling users to log in from anywhere whether they are using PCs or Macs. However, developing a browser-based application is not without its challenges, and here we list some of the most common:

1. Display quirks. The best web based solutions will work seamlessly across different web browsers and platforms. In practice, this means that pages should be developed in line with modern ‘web standards’ and tested thoroughly in all of the major browsers, including Internet Explorer, Firefox, Safari, Chrome and Opera. The challenge is that each is likely to have its own set of quirks that can cause the same layout to display differently, ranging from subtle problems (a pixel or two out) to completely broken pages (including overlapping columns and unreadable text). In some cases it may therefore be necessary to implement browser-specific rules in order to maintain compatibility.

2. Security. All services that can be accessed via the web should take security seriously, and protect themselves against common vulnerabilities. This includes careful handling of values entered into online forms, which may contain malicious code aimed at gaining unauthorised access to data. Effective access control is vital, and may provide the option to lock down the service to specific IP addresses so that only staff within your organisation can use it. Users themselves must also play their part by ensuring that they are running the latest browser versions, which will incorporate recent security updates from the browser manufacturers.

3. Page load times. Users expect web services to be fast, with Google searching and returning billions of results in seconds, and research showing that internet users will quickly give up and leave a site if they are kept waiting for too long. In addition to other benefits, making pages valid and accessible can also ensure they are lightweight, significantly improving load times over old fashioned ‘table-based’ layouts. Where a process requires time to run, it is helpful to provide a visual cue such as a progress bar, rather than displaying a blank or incomplete page until everything is finished.

4. The Back button. The Back button is an invaluable navigation aid for users, giving them confidence to explore and try out new features in the knowledge they can always retrace their steps. In dynamic data based sites this often presents a development challenge: although Back returns users to the previous page, it does not automatically ‘undo’ their previous action. It is therefore common for sites to try to disable the Back button, by opening a new window that is missing browser navigation or by using tricks to make users stick at their current location. A better, but more difficult, approach is to allow the use of Back but maintain state within the application, which can be achieved by using special hidden ‘tokens’ within links and forms.

5. User settings. Users may have a variety of different settings which can present design and development challenges. Since screen size may vary, a ‘liquid’ layout which expands and contracts to fit the available space is often more user-friendly, although this is invariably more complex to implement than a fixed layout where the width of the columns is always known. An effective layout will also allow resizing of text without breaking the overall design, and should not try to fix the text to a specific size, as this may cause accessibility problems for visually impaired and/or elderly users. Other users may have JavaScript disabled in their browsers, and key functionality should continue to work without it. As a general rule, individual preferences should always be respected, and a well designed web service will try to accommodate different user settings as widely as possible.
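As an illustration of the careful handling of form values mentioned under ‘Security’ above, the sketch below shows two widely used precautions: passing user input to the database as a query parameter rather than building it into the SQL text, and escaping user-supplied text before displaying it in a page. The table and column names are hypothetical, and SQLite is used only because it ships with Python.

```python
import html
import sqlite3


def find_customers(connection: sqlite3.Connection, surname: str) -> list[tuple]:
    """Look up customers with a parameterised query, never string concatenation."""
    cursor = connection.execute(
        "SELECT id, surname FROM customers WHERE surname = ?", (surname,)
    )
    return cursor.fetchall()


def render_greeting(display_name: str) -> str:
    """Escape user-supplied text before placing it in an HTML page."""
    return f"<p>Welcome back, {html.escape(display_name)}</p>"
```

With this approach, a value such as `' OR '1'='1` is treated as an (unlikely) surname rather than as part of the query, and any markup typed into a name field is shown as plain text instead of being run by the browser.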

The art of deduplication

Deduplicating data effectively is a key part of building a single customer view, allowing a complete picture to be gained of everything known about your contacts. This can be more difficult than it sounds, often requiring a complex mix of both ‘science’ and ‘art’ to achieve the best results. Here we list our top 5 tips for successful ‘merging and purging’:

1. Use email address – with caution. Since email addresses are normally unique to each individual, they can be used as a ‘key’ to deduplicate and integrate contacts between separate unrelated systems. However, many organisations use shared email addresses for internal purposes – for example, a staff member at an agency may register hundreds of different customers under the same ‘admin’ email account. It is therefore important also to allow an exception list of emails that should not be used for joining.

2. Generate keys. Where email address is unavailable, and there are no other reliable ID fields to match on, contacts can often be deduplicated based on keys generated from a combination of other fields, such as first name, surname and postcode. The challenge is to find the right combination of fields for the data set in question in order to guarantee uniqueness. Where free text fields are used for joining, fuzzy matching techniques can be helpful to ensure that minor variations in spelling (e.g. ‘St’/‘Street’ in address data) do not prevent a successful match.

3. Use multiple keys. Working with multiple keys at the same time is technically complex, but is essential for the best results. For example, a single customer can have more than one email address (work/home), name (before/after marriage) or address (previous/current) and may have provided you with various combinations of these over time whilst completing different online forms. Effective deduplication of this contact’s data should be able to pull all of this information together.

4. Exception reporting. The ability to create effective reports quickly is important, as this enables special cases to be spotted, reviewed, and added to exception lists in order to resolve them. A setup which allows fast review of results, flexible changes to keying rules, and rapid re-testing is usually much more successful than one which requires you to try to get everything right first time.

5. Expert knowledge. Managing a deduplication project requires a mix of both technical and ‘editorial’ skills, plus the experience to know which approaches are likely to be the most successful for a given set of data. Deduplication is often underestimated and undertaken as an in-house project or ‘minor’ task prior to loading data into a new system. However, in practice this can be highly complex, and you may achieve better (and quicker) results by outsourcing the task.
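To show how the first three tips can fit together, here is a minimal sketch that builds several keys per record (email, unless it is on an exception list, plus a generated name and postcode key) and then groups records that share any key, directly or through a chain of other records. The field names and exception list are assumptions for the example, and the grouping uses a standard ‘union-find’ technique rather than any specific product’s method.

```python
from collections import defaultdict

# Hypothetical exception list of shared addresses that must not be used for joining.
EMAIL_EXCEPTIONS = {"admin@agency.example"}


def keys_for(record: dict) -> list[str]:
    """Build every usable matching key for one contact record."""
    keys = []
    email = (record.get("email") or "").strip().lower()
    if email and email not in EMAIL_EXCEPTIONS:
        keys.append(f"email:{email}")
    name_parts = [record.get("first_name"), record.get("surname"), record.get("postcode")]
    if all(name_parts):
        # Generated key: lower-cased, with spaces removed from each part.
        joined = "|".join("".join(part.lower().split()) for part in name_parts)
        keys.append(f"name:{joined}")
    return keys


def deduplicate(records: list[dict]) -> list[list[int]]:
    """Group record indexes that share any key, directly or through a chain."""
    parent = list(range(len(records)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # shorten the path as we go
            i = parent[i]
        return i

    first_seen: dict[str, int] = {}
    for index, record in enumerate(records):
        for key in keys_for(record):
            if key in first_seen:
                parent[find(index)] = find(first_seen[key])
            else:
                first_seen[key] = index

    groups = defaultdict(list)
    for index in range(len(records)):
        groups[find(index)].append(index)
    return sorted(groups.values())
```

In this sketch a record with a work email, a second record with the same name and postcode but a home email, and a third record holding only that home email would all be pulled into one group, while two different customers registered under the same shared ‘admin’ address would stay separate.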