  • About the blog

    This is the personal blog of Simon Kendrick and covers my interests in media, technology and popular culture. All opinions expressed are my own and may not be representative of past or present employers

ABCe and the difficulties of auditing online metrics


As the recent influx of links has shown, I have struggled to keep my blog updated in recent weeks. This post has been saved in my drafts for close to a month now. While it may no longer be current news, the principles underlying the issues are still, and will continue to be, pertinent.

So, please cast your minds back to May 22nd, when it was announced that the Telegraph had overtaken the Guardian in terms of monthly unique users and, with it, taken the crown of the UK’s most popular newspaper website.

The figures were according to the ABCe, which is as close as the UK gets to officially audited web statistics. However, close is a relative term. The ABCes are still far from universally accepted and, as can be inferred from the FAQs on their website, there are still many challenges to overcome. It will be some time before we can even approach the accuracy of the audience figures for other above-the-line media (outdoor excepted).

To my eyes, the main issues surrounding effective online measurement can be boiled down to three broad categories.

Promoting the best metric(s)

The biggest and most intractable obstacle. Which measure should be given most credence?

TV, the area I am most familiar with, also has a variety of measures. But two tend to be used most often: average audience (across a programme, series or particular timeslot) and coverage (the total number of people exposed to a programme, series or timeslot for a given time, usually 3 minutes).

Unfortunately, neither of these is fully appropriate for the web. So what are the alternatives? The main three are:

  • How many (unique users) – but how unique is a unique user? Each visitor is tracked by a cookie, but each time a user clears his or her browser's cookies, that cookie is deleted. On the next visit, a new cookie is assigned. If I clear my cookies once a week, I am effectively counted as four unique users a month. Plus there is my office computer, my BlackBerry, my mobile and my games console. I could easily be counted ten times across a month if I use a variety of touchpoints.
  • How busy (page impressions) – but how important is my impression? I may have accidentally clicked through a link, or I may continually refresh a page to update it. And what of automatically refreshing pages, such as Gmail or Myspace? Is each page refresh counted as a new impression? Furthermore, if page impressions are used to calculate advertising rates, what happens to the impressions made with an adblocker in place?
  • How often (frequency – page impressions divided by unique users) – as this relies on the two metrics above, it is heavily compromised.
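
To put rough numbers on the cookie problem, here is a minimal Python sketch. The log format, the cookie IDs and the reader are all invented for illustration; in particular, a real audit would never know the person behind each cookie, which is exactly the difficulty.

```python
from collections import namedtuple

# Hypothetical log record: one row per page impression.
# "person" is the ground truth that an auditor cannot actually see.
Hit = namedtuple("Hit", ["person", "cookie_id"])

def summarise(hits):
    """Return cookie-based unique users, real people, and both frequencies."""
    impressions = len(hits)
    cookie_uniques = len({h.cookie_id for h in hits})
    real_people = len({h.person for h in hits})
    return {
        "page_impressions": impressions,
        "unique_users_by_cookie": cookie_uniques,
        "real_people": real_people,
        "frequency_by_cookie": impressions / cookie_uniques,
        "frequency_by_person": impressions / real_people,
    }

# One reader, 40 impressions in a month: cookies cleared weekly on a home PC
# (4 different cookies), plus an office PC and a phone (1 cookie each).
hits = (
    [Hit("simon", f"home-week{w}") for w in range(1, 5) for _ in range(5)]
    + [Hit("simon", "office")] * 15
    + [Hit("simon", "phone")] * 5
)

stats = summarise(hits)
```

In this made-up month a single person is reported as six unique users, and the reported frequency falls from 40 impressions per person to fewer than seven per "user".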

What about other measures?

  • Average time spent can be massively skewed by people leaving their browsers open while they aren’t at their PC.
  • Average number of visits would give a decent measure of engagement, but the cookie issue would mean it would be understated.
  • Measuring subscriptions would be interesting, but these may be inactive, sites offer multiple feeds, and take-up is far from universal. As people become more adept with web browsing, RSS may gain more traction, but websites such as Alltop are showing viable alternatives to the feed-reading system.

And beyond these concerns, there is still one crucial question that remains unanswered. Who are these people?

TV, radio and press use a representative panel of people to estimate the total population. For TV, the BARB panel consists of around 11,000 people who represent the 60m or so individuals in the UK. But we are seeing that as the number of channels increases, a panel of this size isn’t able to accurately capture data for the smaller channels.
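
A back-of-the-envelope calculation shows why a panel of this size struggles with small channels. The sketch below assumes a simple random sample and the usual binomial standard error. That is a simplification, since real panels are stratified and weighted, and the reach percentages are invented for illustration.

```python
import math

PANEL_SIZE = 11_000       # approximate BARB panel size quoted above
POPULATION = 60_000_000   # approximate UK population quoted above

def panel_estimate(true_reach, n=PANEL_SIZE):
    """Expected panellists and relative standard error for a given reach.

    Assumes simple random sampling, which is a simplification.
    """
    expected_panellists = true_reach * n
    std_error = math.sqrt(true_reach * (1 - true_reach) / n)
    return expected_panellists, std_error / true_reach

for reach in (0.20, 0.01, 0.001):  # hypothetical channel reaches
    panellists, rel_error = panel_estimate(reach)
    print(f"reach {reach:.1%}: ~{panellists:.0f} panellists, "
          f"each standing for ~{POPULATION / PANEL_SIZE:.0f} people, "
          f"relative standard error ~{rel_error:.0%}")
```

Under these assumptions, a channel reaching 0.1% of the population is represented by only about 11 panellists, and the estimate carries a relative standard error of roughly 30%.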

So what hope is there for the web, with the multitude of sites and sub-sites with tiny audiences? Not to mention the fact that these audiences are global.

Of course, online panels do already exist. But these only sample the top x number of websites and, as it stands, the differing figures each of them produces are treated with caution and, on occasion, suspicion. Witness the argument between Radiohead and Comscore, to give one example.

So I’m no closer to figuring out how we measure. How about what we measure?

Determining the content to be measured

If we are looking to determine advertising rates, then the easy answer is to measure any page that carries inventory. But should the quality or relevancy of the content be considered?

Sticking with UK newspaper sites, questions over what material should be audited include:

  • If we are looking at UK sites, should we only look at content orientated towards a UK audience? Should this content or audience be considered “higher quality”?
  • If we are considering the site as a newspaper, should we look at current content only? For instance, the Times has opened up its archive for perusal. Should all of this content be counted equally?
  • How relevant to the contents of the news do the stats have to be? Newspapers have employed tricks from crosswords to bingo to free DVDs in order to boost their readership, but should newspaper websites be allowed to host games, social networking spaces or rotating trivia (to give one example) as a hook for the floating link-clickers or casual browsers?
  • How does one treat viral content that can be picked up and promoted independently of the proprietor? See the story of the Sudanese man marrying the goat, which remained a popular story on BBC News for years, or the story about Hotmail introducing charges, which is brought up to trick a new batch of gullible people every year or so.
  • What if the internal search is particularly useless, and it takes several attempts to get to the intended destination?
  • And a tricky question to end on – can we and should we consider the intentions of the browser? For instance, my most popular post on this blog is my review of a Thinkbox event. Is it because it is particularly well written or interesting? No, it is because my blog appears when people search for a picture of the brain. Few of the visitors will even clock what the post is about; they will simply grab the picture and move on.

All of this makes me wonder how much of a false typology “UK Newspaper site” is in this environment. What proportion of visitors could actually be identified as being there for the news, and not because they clicked a link about the original Indiana Jones or a funny review of the new Incredible Hulk movie?

Could those articles have been approved purely for link-bait? As they also appear in the print editions, I think not. But I’m sure it does happen.

Accounting for “performance enhancers”

In the same way that certain supplements are permitted in athletics while others are banned, should some actions that can be used to artificially boost stats be regulated?

  • Should automated pages be omitted?
  • If the New Yorker splits out a lengthy article across 12 pages, can it really be said that it is 12 times more valuable than having it appear on one page?
  • Many sites now have “see also” or “related” sidebars. Should sites that refer externally be penalised for offering choice, against those that only refer within the site itself?
  • Search engine optimisation is a dark art, but there can ultimately only be one winner. While there are premium positions in-store and on the electronic programming guide, search engines have much more of a “winner take all” system in place where the first link will get the majority of the click-throughs. Should referrals be weighted to account for this?
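
One way to see how pagination inflates the numbers is to count the same reading two ways. The sketch below is illustrative only: the log tuples, article slugs and visitor IDs are made up, and collapsing pages into "article reads" is just one possible normalisation, not a method the ABCe is known to apply.

```python
# Hypothetical impression log: (visitor_id, article_slug, page_number).
log = [("v1", "long-essay", page) for page in range(1, 13)]  # 12-page article
log += [("v2", "short-piece", 1)]                            # single-page article

def page_impressions(log):
    """Every page load counts once."""
    return len(log)

def article_reads(log):
    """Each visitor/article pair counts once, however many pages it spans."""
    return len({(visitor, slug) for visitor, slug, _ in log})

raw = page_impressions(log)   # 13 page impressions
reads = article_reads(log)    # 2 article reads
```

Two readers each finishing one article produce 13 page impressions between them, 12 of which come from the paginated piece alone.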

There are a lot of questions above, and no real answers. No measurements are perfect, but we look to be a long way off approaching acceptability in the online sphere.

This is by no means my area of expertise, and I would love to hear from anyone with their own thoughts, suggestions or experiences on the topic. I will happily be corrected on any erroneous details in this post.


Photo credits:
Measurement: http://www.flickr.com/photos/spacesuitcatalyst/
Metric hairclip: http://www.flickr.com/photos/ecraftic/
Greenshield Stamps: http://www.flickr.com/photos/practicalowl/
The Incredible Spongebob-Hulk: http://www.flickr.com/photos/chris_gin/