Holidays without the headaches

It’s the summer holiday season – a time to get away, relax and forget about all those recurring data tasks you have lined up. If you have those jobs automated, you can happily switch off and forget all about them. If not, perhaps it’s time for a rethink before jetting off!

To ensure everything runs like clockwork while key members of staff are away, we always recommend that our clients set up automated and repeatable processes for delivering regular data updates for MasterVision. There are some very good reasons for this:

No lost luggage

When staff take vacations or are off sick, data still arrives at its destination on time, and in good shape. There’s no need for long and complicated handover notes, and colleagues won’t need to trawl through emails and post-it notes to work out exactly how a key data set needs to be formatted, generated and sent.

Switch to autopilot

Manipulating data ‘by hand’ is a very time-consuming process – the business of manually extracting the data, then zipping, encrypting and uploading numerous files can easily consume a large portion of the working day. Worse, it can also introduce errors into the files – we’re only human, after all.

Having automated processes means you can be sure that outputs are tried, tested and trusted. Most destination systems require a consistent format to guarantee processing – any inconsistencies can result in updates being rejected – or worse, being incorrectly integrated.

Travel off peak

Another benefit of automating processes is that they can be set to run regularly at a time when systems are less busy, such as overnight or at weekends. This means that intensive tasks such as data exports can be run without interfering with the performance of systems when staff are likely to be using them the most.

Home stretch

These recommendations are grounded in practice: here at DataSalon, the processes we use to check file updates and run scheduled site rebuilds are fully automated, and designed to ensure that data files are thoroughly checked before being incorporated into each site.

We strongly encourage our clients to do the same – after all, data analysis is only as good as the data itself, and with good processes in place you won’t be dealing with a mountain of data queries or customer complaints when you return from your relaxing break.

Sizing the academic journals market

The term ‘market’ is often used in relation to academic journals (“what’s our market share?” and so on), but it’s not always entirely clear what that means, or how big ‘the market’ actually is.

It’s an important point: every publisher is trying to understand their current position and estimate new sales potential, and that requires a clear understanding of what we mean by ‘market’.

Obviously there isn’t going to be a single right answer, but at DataSalon we work with over 20 different publishers, so we’re able to put forward some ideas for discussion.

Defining “the market”

Let’s start by trying to define what we mean by ‘the market’ worldwide. There are two broad approaches you can take:

(1) every organization actually purchasing journal content in some form at present;

(2) every organization which might potentially purchase academic content in future.

Effectively (1) focuses on current reality, and (2) adds future potential into the mix. We’re going to take the easy route and focus on (1) – it’s more concrete and measurable.

What is a customer?

There’s a second important clarification to make, distinguishing between: (a) organizations which buy; (b) organizations which have access.

Here, (b) is a wider group which might include departments and libraries which don’t buy independently, but which do still get access to journal content via a parent organization.

Let’s use (a) as our main focus – if you sell to the buyers, you’ll get the access points too (and if you can’t sell to the access points alone, they’re not potential customers in their own right).

So how big IS the market?

Having carefully refined our definition of the worldwide market for academic journals to “organizations actually buying journal content in some form at present” – how big is that market?

The answer is… not as big (or daunting) as you might think. Our information puts the actual size of that core market at less than 20,000. That’s a figure derived from real subscriber data from several large multi-disciplinary journal publishers: they all have under 20,000 active paying organizations (after de-duplication).

Clearly not every publisher is selling to exactly the same group of organizations, and there will always be a long tail of niche customers. Still, in terms of the core market common to most journal publishers, the figure of 20,000 seems to be about right.
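The de-duplication step behind that estimate can be sketched roughly as follows – the customer lists and the normalisation rule below are invented purely to show the idea of counting the union of paying organizations across publishers (a real matcher would do far more than this):

```python
def normalize(name: str) -> str:
    """Crude normalisation so variants like 'Univ. of Oxford' and
    'University of Oxford' count as a single organization."""
    name = name.lower().strip()
    return name.replace("univ.", "university").replace("univ ", "university ")

def market_size(*publisher_customer_lists: list) -> int:
    """Size of the combined market: unique organizations across all lists."""
    return len({normalize(n) for lst in publisher_customer_lists for n in lst})
```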

So what?

As a publisher, that’s intriguing: your market penetration might be bigger than you thought it was, and the challenge of mapping the market and identifying potential new customers might not be as daunting as it seemed.

There are, of course, many different ways of approaching this topic, and different reasons for doing so. We’d love to hear from others who have different definitions and/or methods of estimating the size of the academic journals market.

Pursuing the KISS principle

If you saw our blog article ‘Who ate all the pies?’ a couple of months ago, you’ll know that we’re always on the lookout for new ways to make using MasterVision as clear and intuitive as possible for our clients.

Our latest quest, following the KISS principle (‘Keep it simple, stupid’), has been to simplify the search forms. This doesn’t mean removing information – far from it – but it does mean that what you do see is as clear and concise as possible. The changes we’ve made bring other benefits, too:

  • Some search fields aren’t relevant until a specific choice has been made; these can now be revealed only when needed.
  • Similarly, fields that shouldn’t be combined with those currently selected can be hidden when a particular selection is made.
  • Multiple alternative search fields, where only one is required, can be grouped into a single dropdown.
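The reveal/hide behaviour described above boils down to a set of declarative rules evaluated against the user’s current selections. A rough sketch of the idea (the field names are invented for illustration, not MasterVision’s actual configuration):

```python
# Each rule: a dependent field appears only when its trigger
# field currently holds one of the listed values.
REVEAL_RULES = {
    "uk_postcode_area": ("country", {"United Kingdom"}),
}

def visible_fields(base_fields, selections):
    """Return the base fields plus any dependent fields whose
    trigger condition is met by the current selections."""
    visible = list(base_fields)
    for field, (trigger, values) in REVEAL_RULES.items():
        if selections.get(trigger) in values:
            visible.append(field)
    return visible
```

The same rule table can be extended with hide-on-conflict rules, so the form logic lives in data rather than scattered conditionals.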

Less is more

So what does this mean in practice? Essentially, there’s less to see, unless you choose to see more. By making selections, further relevant search options are revealed.

A simple illustration of this is shown below – when ‘United Kingdom’ is selected from the ‘Country’ field, the search field for ‘UK Postcode Area’ appears:

[Screenshot: the ‘UK Postcode Area’ field revealed after selecting ‘United Kingdom’]

To ensure two potentially conflicting search fields aren’t inadvertently populated, they can be grouped so that only one may be completed in a search. One example might be where there’s an option to search by USD or GBP subscription values (not both!):

[Screenshot: grouped dropdown offering a choice of USD or GBP subscription value]

A combination of these enhancements can be applied to a number of related fields, ensuring not only that the form is less cluttered with unnecessary information, but also that conflicting choices are avoided, giving you valid searches, and ultimately more meaningful results.

To quote Leonardo da Vinci: “Simplicity is the ultimate sophistication”. We hope you agree.

A pain in the SaaS

Software as a service (SaaS) is everywhere these days, with high-profile success stories like Salesforce leading the charge of hosted business tools. When we launched MasterVision back in 2006, the term “SaaS” wasn’t in common use, although that’s exactly what MasterVision has always been since day one.

Service with your software, madam?

The term ‘service’, though, has two distinct meanings – a technical one (on-demand, in the cloud) and a more traditional one (“how may I help you?”) – and in the latter sense the SaaS model can often be lacking. In fact, the off-the-shelf and one-size-fits-all nature of hosted software can sometimes lead to especially BAD customer service.

Like it or lump it

If you’re a SaaS provider, then uniformity is the key to success, and ideally you’d like every customer to use exactly the same features. Customizations are just expensive headaches: they complicate your code base, and distract your staff from selling and managing the standard service. That’s all good business, but it doesn’t make for a great experience if you’re a customer on the receiving end of that model.

We use several hosted services ourselves (such as Highrise and Onehub), and while they’re great at what they do, there’s little point in requesting changes. If you don’t like the way the software works, effectively you can lump it. Some larger vendors even go one step further in keeping customers at arm’s length, adding an extra layer of ‘implementation partners’ into the mix too (hello, Salesforce).

The sweet spot

At DataSalon, we try to hit the sweet spot for software as a service, with a large helping of old fashioned ‘service’ alongside all that lovely ‘software’. Partly it’s a benefit of being a smaller company, but we always ensure that all of the publishers we work with have direct access to our senior staff, significant input into our long-term product plans, and lots of scope for customizations whenever they need them.

Of course, not everybody needs that level of hands-on involvement, but next time you’re looking at SaaS options, remember that the second ‘S’ stands for ‘service’.

Who ate all the pies?

Here at DataSalon we’re constantly looking at how best to present data in clear and intuitive ways. Top of our ‘hit list’ was to come up with a better alternative to using the humble pie chart.

Whilst undeniably pretty, pie charts can often be unclear – and sometimes downright baffling – when you try to interpret the data behind them. For example, can you tell from the chart below which journal gets the most submissions?

[Pie chart: journal submissions split across many similar-sized segments]

The problem here is that it can be difficult to judge the difference between segments unless there are only a few of them, and their sizes are easily distinguishable. Pie charts can be effective when displaying a simple “part-to-whole” relationship (e.g. 40% male / 60% female) but don’t work so well for anything more complicated.

So, what have we done? Well, since pie charts only really become useful once you label them, they are essentially just an alternative way of displaying a series of labels and values. So why not use a ranked data table instead, with a few visual enhancements that make it easy to compare relative values? The same data presented in this new way is below:

[Ranked table: the same submission data with visual bars for comparison]

… and it now becomes clear that ‘Employee Relations’ is the journal receiving the most submissions!
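The ranked-table-with-bars idea can be sketched in a few lines – the submission counts below are made up, with ‘Employee Relations’ on top as in the example above:

```python
def ranked_table(counts):
    """Render values as a ranked list with simple text bars, so relative
    sizes can be compared at a glance (unlike unlabelled pie segments)."""
    top = max(counts.values())
    rows = []
    for name, n in sorted(counts.items(), key=lambda kv: -kv[1]):
        bar = "#" * round(20 * n / top)  # bar scaled against the largest value
        rows.append(f"{name:<25} {n:>5} {bar}")
    return rows

submissions = {  # hypothetical figures
    "Employee Relations": 412,
    "Career Development Intl": 388,
    "Personnel Review": 371,
}
```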

For those who can’t resist a good pie, we’re still keeping those as an alternative option, at least for now.

If you’re interested in discovering more about why pie charts are not great for visualizing data, we would recommend this excellent article by Stephen Few.

Auditing data quality the easy way

One of the first steps on the path to data quality enlightenment is to audit the quality of your data.  It’s useful to know the current state of play, to work out which data sources need your attention the most. There are a few different approaches you can take to auditing:

Manual audit

You will have staff who work closely with your data, and they will have a pretty good idea of where poor data quality might impact on their job performance. This might be customer service staff who, in looking up customer records, have a good feel for the level of account duplication. It could be marketing staff, who know that their response rates take a dive if they include email contacts from a certain source in their campaigns.

The point is that your staff already have a wealth of knowledge about how poor data quality impacts on their jobs. By working backwards from there, you can start to uncover some of the underlying data quality problems.

However, no single person or group of people can possibly have an in-depth understanding of all of your data sources and the quality of each one. This is where an automated auditing process can reap rewards.

Automated audit

An automated data quality audit has a number of advantages that will help you to understand the broader picture:

  • An automated audit can cover a lot of data sources at once, highlighting quality issues across multiple data sets and potentially millions of records. Many of our clients are already seeing the benefits of this type of large-scale audit.
  • Automation can also apply consistent checks to every source, resulting in a set of metrics or KPIs that you can use to get a good understanding of your overall data quality score. We use traffic light indicators and a system that takes account of priority fields within each source to make that score as clear and meaningful as possible.

[Screenshot: MasterVision DQ overview with traffic light indicators for each source]

  • If you want the detail as well as the overview, automated auditing and reporting can allow you to drill right down to see problem values in individual fields. Invalid emails and junk name entries are just two of the many types of data error we report on.
  • Automated auditing can also be repeated, so that you can track your data quality profile over time. We repeat the audit each month and provide twelve months of past data as standard in MasterVision DQ.
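As a rough sketch of how a priority-weighted, traffic-light score might work – the weights and thresholds below are invented for illustration, not MasterVision DQ’s actual scoring:

```python
def quality_score(field_scores, priorities):
    """Weighted average of per-field quality (0-1), where priority
    fields count more towards a source's overall score."""
    total_weight = sum(priorities.get(f, 1) for f in field_scores)
    return sum(s * priorities.get(f, 1) for f, s in field_scores.items()) / total_weight

def traffic_light(score):
    """Map a 0-1 score to a traffic-light band (thresholds illustrative)."""
    if score >= 0.9:
        return "green"
    if score >= 0.7:
        return "amber"
    return "red"
```

Applying the same weighting and thresholds to every source each month is what makes the resulting KPIs comparable over time.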

It’s important not to discount the manual approach to uncovering data quality issues, but to get a truly comprehensive picture of the quality of your data, an automated audit like the one offered by MasterVision DQ is the way to go. You can find out more about MasterVision DQ by taking the tour.

The data marathon

I’m currently training for a marathon, and have been putting off sorting out the tons of running data gathered by my GPS watch. I haven’t been practicing what I preach when it comes to data management, and my continued procrastination was holding back my performance.

I have a number of training routes, all measured out in my head. However, I’ve always wondered why my 10k races were never as fast as my 10k training runs. After finally biting the bullet and working through my logged data, I found that my training runs were actually 9.1k! My training strategy was based on flawed assumptions, and not informed by hard data. A sure-fire way to get disappointing results!

No pain no gain

I put a lot of effort into my running, but – just like in any area of life or business – neglecting the detail can put that end performance in jeopardy. It can be a pain to get the preparation right, and to trawl through the detail to find the best strategies, but it’s well worth it.

Tracking

So, how am I performing? Well, I could do better! Taking time out to look at my data reveals a very inconsistent picture, and highlights a need to track my progress more closely. Is my training working effectively? Are things improving over time? We all assume we know everything about ourselves (and our customers), but checking and tracking that data might just throw up some surprises.

It’s a marathon, not a sprint!

We all know that digging into this level of detail can be one of those jobs that gets put off in favour of other ‘sexier’ tasks, and it can take a lot of time – but giving data the attention it deserves will give you a better understanding of your performance, and help to inform future strategies to achieve more successful results.

Introducing our new tagline…

Recently we’ve been working on a new tagline – one which sums up everything we strive to achieve at DataSalon, on behalf of the many publishers we work with. We’re now pleased to unveil the result:

Better data. Better insight. Better business. 

We think this neatly captures what we believe in, and soon it will begin to appear on our website and other materials. Here’s a little ‘behind the scenes’ summary of our thought process in choosing this:

Better data

Every publisher is awash with data about authors, subscriptions, usage, and a whole lot more. But in order to make good use of it, all that data needs to be clean, correct, and trusted. This part of the tagline references the tools and expertise we provide to help solve those difficult challenges of data quality, data cleansing, and de-duplication.

Better insight

With MasterVision we help publishers turn information into insight by connecting up customer data from many different source systems into a single view. This is particularly important for management and marketing teams, who need quick access to a complete 360° view for every individual and institution, with tools which make it easy to search, segment, and visualise all of that information.

Better business

Everything we do revolves around supporting the bottom line for our clients. ‘Better business’ refers to our track record of helping publishers to mine their customer data to drive revenue: by securing renewals, identifying strong new sales opportunities, and supporting strategic planning with accurate information about broader trends in author, customer, and usage activity.

So there you have it. We’re really passionate about this stuff, and hopefully our new tagline will help all of our clients (both present and future) to share this broader vision of what DataSalon is all about.

Our first webinar

We dipped our toes into the world of webinars recently, hosting a free session on the topic of data quality. This was the first webinar we’ve hosted and as such was a bit of an experiment and learning curve for us. So what did we learn?

On the plus side it was great to be able to address a global audience from the convenience of our office. The web truly does make the world a smaller place in some ways. Attending a webinar doesn’t represent the same commitment as attending a conference, seminar or even travelling to a meeting. Because of this we were able to attract attendees who might otherwise not have had the time to spare for a talk. We also think that the audience as a whole was more focused on the topic we wanted to talk about than might be the case at a conference where there is a variety of talks and speakers.

There were some challenges as well. As a speaker it was difficult doing a presentation over the web without the audio and visual feedback you would normally get from a ‘live’ audience. That certainly took a bit of getting used to; I couldn’t tell if anyone was laughing at my gags! We also weren’t sure how to field questions either during or at the end of the webinar, and so opted not to try – instead asking for questions to be sent through after the event. In hindsight I would like to have had the opportunity for more direct feedback and so we will consider how we might facilitate that for any future events.

Overall I enjoyed the experience and feedback from those who attended has been positive. We’ll be thinking about other topics that could make for an interesting webinar in the future, so watch this space.

Big plans for 2014

With the start of a new year inevitably comes some thinking about strategy for the next twelve months. We’re no different here at DataSalon, and in December we all sat down and had a good old chinwag about what we should focus on for the coming year. Here’s what we came up with as our ‘themes’ for 2014.

Data quality

This will come as no surprise to anyone who has been following this blog, but data quality will be a big theme for us in 2014. It’s high on our list as we’re passionate about doing our bit to raise the issue of data quality within publishing, because of the potential for it to improve the overall level of service and communication within the industry. We’re also rolling out our own data quality service MasterVision DQ to more and more of our customers, so further developing the scope and functionality of that module is something we will focus on early in 2014.

Customer identity

Customer identity is another topic that is close to our hearts. We’ve mentioned the developments underway with personal and institutional identifiers a few times on this blog, and we’re looking closely at this area in 2014. In particular we look forward to our clients making greater use of ORCIDs, and therefore feeding that data through to their MasterVision sites. We’re also looking closely at the ISNI identifier as an open and ‘bridge’ identifier for institutions. The ability of ISNIs to work in conjunction with other identifiers – and therefore potentially linking up individuals and institutions as well as connecting different metadata sets – is an exciting one. Watch this space, as there are sure to be some interesting developments in this area in 2014.

Single customer view and analysis

Lastly, we don’t want to lose track of what lies at the centre of our business – providing publishers with customer insight and intelligence via a single customer view. We’re not resting on our laurels here, and have some ideas about how to make the customer data integration already done within MasterVision even more useful for our clients. We don’t want to give away too much just yet, but we’re currently hard at work on some new visual reporting within MasterVision which will provide an even better understanding of customers, segments, and trends for our publishing clients.

Whilst we can’t predict the future, we can predict that 2014 will be a busy year for us, and we’re already hard at work on these new developments.
