  • About the blog

    This is the personal blog of Simon Kendrick and covers my interests in media, technology and popular culture. All opinions expressed are my own and may not be representative of past or present employers

Data should be used as evidence and not illustration

I read the Guardian article on journalists’ struggles with “data literacy” with interest. The piece concentrates on inaccurate reporting through a lack of understanding of numbers, and the context around them. “Honest mistakes”, of a sort.

Taken more cynically, it is an example of a fallacy that I see regularly in many different disciplines (I’m loath to call it a trend as, for all I know, this could be a long-standing problem) – fitting data around a pre-constructed narrative, rather than deducing the main story from the available information.

This is dangerous. It reduces data to be nothing more than anecdotal support for our subjective viewpoints. While Steve Jobs may have had a skill for telling people what they really wanted, he is an exception rather than the rule. We as human beings are flawed, biased and incapable of objectivity.

Given the complexity of our surroundings, we will (probably) never fully understand how everything fits together – this article from Jonah Lehrer on the problems with the reductionist scientific method is fascinating. However, many of us can certainly act with more critical acumen than we currently do.

This is as incumbent on the audience as it is the communicator – as MG Siegler recently wrote in relation to his field of technology journalism, “most of what is written… is bullshit”, and readers should exercise more caution when taking news as given.

Whether it is due to time pressures, lack of skills, laziness, pressure to deliver a specific outcome or otherwise, we need to avoid this trap and – to the best of our abilities – let our conclusions or recommendations emerge from the available data, rather than simply use it to illustrate our subjective biases.

While I am a (now no more than an occasional) blogger, I am not a journalist and so I’ll limit my potential criticisms of that field. However, I am a researcher who has at various points worked closely with many other disciplines (some data-orientated, some editorial, some creative), and I see this fundamental problem recurring in a variety of contexts.

When collating evidence, the best means to ensure its veracity is to collect it yourself – in my situation, that would be to conduct primary research and to meet the various quality standards that would ensure a reliable methodology, and coherent conclusions.

Primary research isn’t realistic in many cases, due to limited levels of time, money and skills. As such, we rely on collating existing data sources. This interpretation of secondary research is where I believe the problem of illustration above evidence is most likely to occur.

There are two stages that can help overcome this – critical evaluation of sources, and counterfactual hypotheses.

To critically evaluate data sources, I’ve created a CRAP sheet mnemonic that can help filter the unusable data from the trustworthy:

  • Communication – does the interpretation support the actual data upon scrutiny? For instance, people have been quick to cite Pinterest’s UK skew to male users as a real difference in culture between the UK and US, rather than entertain the notion that UK use is still constrained to the early adopting tech community, whereas US use is – marginally – more mature and has diffused outwards
  • Recency – when was the data created (and not when was it communicated)? For instance, I’d try to avoid quoting 2010 research into iPads since tablets are a nascent and fast-moving industry. Data into underlying human motivations is likely to have a longer shelf-life. This is why, despite the accolades and endorsements, I’m loath to cite this online word of mouth article because it is from 2004 – before both Twitter and Facebook
  • Audience – who is the data among? Would data among US C-suite executives be analogous to UK business owners? Also, some companies specialising in PR research have been notoriously bad at claiming a representative adult audience, when in reality they are usually a self-selecting sub-sample
  • Provenance – where did the data originally come from? In the same way as students are discouraged from citing Wikipedia, we should go to the original source of the data to discover where the data came from, and for what purpose. For instance, data from a lobby group re-affirming their position is unlikely to be the most reliable. It also helps us escape from the echo chamber, where myth can quickly become fact.
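Purely as an illustration, the CRAP filter can be sketched as a crude screening function. Everything here – the field names, the pass/fail scoring, the two-year default shelf-life – is my own invention for the sketch, not an industry standard:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class DataSource:
    """A secondary data source to be screened before citing it."""
    claim_supported: bool    # Communication: does the interpretation hold up to scrutiny?
    created: date            # Recency: when the data was created, not when it was communicated
    audience_matches: bool   # Audience: is the sample analogous to the one we care about?
    provenance_known: bool   # Provenance: can we trace the original source and its purpose?

def passes_crap_filter(source: DataSource, max_age_years: int = 2) -> bool:
    """Return True only if the source clears all four checks.

    max_age_years is a judgement call: fast-moving topics (tablets in
    2010, say) need a shorter window than underlying human motivations.
    """
    recent_enough = (date.today() - source.created).days <= max_age_years * 365
    return all([source.claim_supported,
                recent_enough,
                source.audience_matches,
                source.provenance_known])
```

A real evaluation is a matter of judgement rather than booleans, of course – the point is only that each check is made explicit, so a failed one is visible rather than quietly waved through.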

Counterfactual hypotheses are the equivalent of control experiments – could arguments or conclusions still be true in the absence of key variables? We should look for conflicting conclusions within our evidence, to see if they can be justified with the same level of certainty. This method is fairly limited, since we are ultimately constrained by our own viewpoints. Nevertheless, it offers at least some challenge to our pre-existing notions of what is and what isn’t correct.
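To make the idea concrete, here is a hypothetical sketch (the function and field names are mine, and real analysis is rarely this mechanical): re-run a conclusion with a supposedly key variable blanked out, and see whether it survives.

```python
from typing import Callable, Sequence

def conclusion_survives(records: Sequence[dict],
                        conclude: Callable[[Sequence[dict]], bool],
                        key_variable: str) -> bool:
    """Re-test a conclusion after removing a supposedly crucial variable.

    If the conclusion still holds without it, that variable was not doing
    the explanatory work we credited it with.
    """
    ablated = [{k: v for k, v in record.items() if k != key_variable}
               for record in records]
    return conclude(ablated)

# Hypothetical example: "average spend is over 100" might be attributed
# to ad exposure, yet the claim holds even with the exposure field removed.
records = [{"spend": 120, "saw_ad": True}, {"spend": 90, "saw_ad": False}]
average_over_100 = lambda rs: sum(r.get("spend", 0) for r in rs) / len(rs) > 100
still_true = conclusion_survives(records, average_over_100, "saw_ad")
# still_true is True: the conclusion never depended on "saw_ad" at all
```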

Data literacy is an important skill to have – not least because, as Neil Perkin has previously written about, it is only the first step on the DIKW hierarchy towards wisdom. While Sturgeon’s Law might apply to existing data, we need to be more robust in our methods, and critical in our judgements. (I appreciate the irony of citing an anecdotal phenomenon.)

It is a planner trope that presentations should contain selective quotes to inspire or frame an argument, and I’ve written in the past about how easily these can contradict one another. A framing device is one thing; a tenet of an argument is another. As such, it is imperative that we use data as evidence and not as illustration.

sk

Image credit: http://www.flickr.com/photos/etringita/854298772/

Mediatel Media Playground 2011

My previous blog post covered my notes on Broadcast in a Multi-Platform World, which I felt was the best session of the day. Below are my notes from the other three sessions (I didn’t take any notes during the bonus Olympics session).

The data debate

Chaired by Torin Douglas, Media Correspondent for the BBC

Speakers:
Andrew Bradford, VP, Client Consulting, Media at Nielsen
Sam Mikkelsen, Business Development Manager at Adalyser

Panellists:
David Brennan, Research & Strategy Director at Thinkbox
Kurt Edwards, Digital Commercial Director at Future
Nick Suckley, Managing Director at Agenda21
Bjarne Thelin, Chief Executive at BARB

Some of the issues touched upon in this debate were interesting, but I felt they were dealt with too superficially (though as a researcher, I guess it is inevitable I’d say that).

David Brennan thinks we need to take more control over data and how we apply it. There is a dumb acceptance that anything created by a machine must be true, and we’ve lost the ability to interrogate the data.

Nick Suckley thinks the main issue is the huge productivity problem with manual manipulation of data from different sources (Google has been joined by Facebook, Twitter and the mobile platforms), but this also represents a huge opportunity. He thinks the fight is not about who owns the data, but who puts it together.

Torin Douglas posited whether our history of currencies meant that we weren’t so concerned with data accuracy, since everyone had access to the same information. Bjarne Thelin unsurprisingly disagreed with this, pointing out the large investment in BARB shows the need for a credible source.

David Brennan said his 3 Es of data are exposure (buying), engagement (planning) and effectiveness (accountability).

Nick Suckley thinks people would be willing to give up information for clear benefits, but most don’t realise what is already being collected on them.

Kurt Edwards thinks social media is a game-changer from a planning point of view as it sends the power back to the client. There is real-time visibility, but the challenge is to not react to a few negative comments

David Brennan concurred and worried about the possibility of social media data conclusions not being supported by other channels. You need to go out of your way to augment social media data with other sources to get the fuller picture

Bjarne Thelin gave the example of the BBC’s +7 viewing figures to show that not all companies are focusing purely on real-time. He also underlined the fact that inputs determine outputs, and so you need to know what goes in.

David Brennan concluded by saying that in the old days you knew what you were getting. Now it is overblown, with journalists confused as to what is newsworthy or significant.

Social media and gaming

Chaired by Andrew Walmsley, ex i-Level

Speakers:
Adele Gritten, Head of Media Consulting at YouGov
Mark Lenel, Director and senior analyst at Gamesvison

Panellists:
Henry Arkell, Business Development Manager at Techlightenment
Pilar Barrio, Head of Social at MPG
Toby Beresford, Chair, DMA Social Media Council at DMA
Sam Stokes, Social Media Director at Punktilio

The two speakers gave a lot of statistics on gaming and social gaming, whereas the panel focused upon social media. This was a shame, as the panel could have used more variety. All panel members were extolling the benefits of social media, and so there was little to no debate.

There was discussion about the difficulty in determining the value of a fan, the privacy implications, Facebook’s domination across the web and the different ways in which social media can assist an organisation in marketing and other business functions.

Mobile advertising

Chaired by Simon Andrews, Founder of addictive!

Speaker:
Ross Williams, Associate Director at Ipsos MediaCT

Panellists:
Gary Cole, Commercial Director at O2
Tamsin Hussey, Group Account Director at Joule
Shaun Jordan, Sales Director at Blyk
Will King, Head of Product Development at Unanimis
Will Smyth, Head of Digital at OMD

Ross Williams gave an interesting case study on Ipsos’ mobi app, which tracked viewer opinion during the Oscars.

Simon Andrews’ approach to chairing the debate was in marked contrast to the previous sessions. He was less a bystander and more a provocateur – he clearly stated his opinions and asked the panel to follow up. He was less tolerant of bland sales-speak than the previous chairs, but was also more partial in how he approached the panel, with the majority of panel time filled by Simon speaking to Will Smyth.

Will King thinks m-commerce will boost mobile like e-commerce did with digital. Near-field communication will move mobile into the real world.

Gary Cole pointed out that mobile advertising is only a quarter of a percent of ad spend, but said that clients should think less about display advertising, and less of mobile as a distinct channel. Instead, mobile can amplify other platforms in a variety of ways.

Tamsin Hussey said that as there isn’t much money in mobile, there is no finance to develop a system for measuring clicks and effectiveness of all channels. Currently, it has to be done manually.

Will Smyth said the app store is the first meaningful internet experience on the mobile. The mobile is still young and there is a fundamental lack of expertise at the middle management level across the industry. Social is currently getting all the attention (“Chairman’s wife syndrome”) but mobile has plenty to offer.

sk

Treating respondents as commodities

Treating respondents as commodities – don’t do it, kids.

Yet it happens, particularly with online surveys.

I recently had a sales call with a provider who said that their panel was no better or worse than any competitor; they sought to differentiate themselves via client management and survey aesthetics.

This experience is backed up by a pithy comment from Tom H.C. Anderson in his LinkedIn group (NGMR – I won’t link to it, since it is only visible to members), who said “There is really only one panel that is used by everybody. Counting panelists is like counting fish in the sea and or clouds in the sky. One day they’re being used by company X, the next by company Z & Y”.

Online panels aim to be as representative as possible; thus there is little difference in their make-up and so companies compete on other grounds. Primarily, this seems to be price. This means providers are continually trying to squeeze more out of their respondents for less.

This has contributed to the commoditisation of sample (it is by no means the only reason – it is perhaps an inevitability given the need to maintain respondent anonymity and confidentiality) and the research process. The research experience is at best variable (at worst, terrible) for respondents.

Surveys are now analogous to Farmville – drones click on different parts of the screen to complete monotonous tasks for a tiny reward.

This has to change. Perhaps it will – two recent articles on Research Live have broached the topic.

As a user of online panels, I know I am part of the problem. But it is the panel providers’ responsibility to protect its users. This would require coordinated action across the industry. Given that market research is regulated, this shouldn’t be an issue.

And the providers could start by treating their panel members as humans, and not commodities. Notwithstanding the inefficiencies of asking questions rather than capturing data (I’ve written about this previously in “If data is the new oil, we need a bigger drill”), some simple user experience testing could provide opportunities for easy, impactful changes.

For instance, why do surveys always need to ask demographic information? I’ve been stonewalled on this by several different companies, who say that it is “standard” (which sounds like commodity-speak) or that they need to ensure information is up-to-date. It is conceivable that a panel member may have changed their gender in the interim period between surveys, but I wouldn’t expect their ethnicity or birthday to change. Cutting out extraneous questions can easily reduce survey length, and the burden on respondents.
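As a sketch of the alternative – assuming a hypothetical panel database that stores each member’s date of birth once at recruitment – age can simply be derived at survey time rather than asked again:

```python
from datetime import date

def age_on(date_of_birth: date, survey_date: date) -> int:
    """Derive a respondent's age at survey time from a stored date of birth."""
    years = survey_date.year - date_of_birth.year
    # Knock a year off if this year's birthday hasn't happened yet
    if (survey_date.month, survey_date.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years
```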

This is a discussion the industry needs to have, and one I’m happy to be a part of.

sk

NB: I’m not concerned about whether they are called respondents or participants. Actions are more important than semantics.

Image credit: http://www.flickr.com/photos/baconandeggs/1490449135/

Spreading birthday cheer

Yesterday was my birthday. Among the birthday messages I received was an email from Stick Sports.

This is an online game that I hadn’t thought about for a while, let alone played. Yet they used the information I provided at sign-up to send me a message. This in turn has reminded me of the site (I haven’t gone back to play Stick Cricket or Stick Baseball yet, but I’m writing about it).

Some people might consider this an invasion of privacy since I didn’t give explicit permission for them to contact me. But it is an innocuous yet relevant message to me, that is extremely simple to administer. As such, I’m amazed more companies don’t do it.

For instance, the majority of emails in my inbox yesterday were Facebook notifications, informing me of friends writing messages on my wall. Although personal information is becoming more private, many people do have their birthdays visible. There is a great opportunity for brands or celebrities to send birthday messages to their fans, either to show they are there and listening, or to inform them of a special birthday-only offer. A simple, but effective means of communicating with supporters.

This is a ploy that can also be used for research panel respondents. For instance, why not give them additional tokens for prize draws on their birthday? It doesn’t cost anything and has the potential to improve their engagement with the panel.
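A minimal sketch of how that might work, assuming a hypothetical panel system that stores each member’s birthday and a running token balance (the names and the five-token bonus are invented for illustration):

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Panellist:
    name: str
    date_of_birth: date
    tokens: int = 0

def grant_birthday_tokens(panel, today: date, bonus: int = 5):
    """Run once a day: top up the prize-draw tokens of anyone whose
    birthday it is, and return their names."""
    lucky = []
    for member in panel:
        if (member.date_of_birth.month, member.date_of_birth.day) == (today.month, today.day):
            member.tokens += bonus
            lucky.append(member.name)
    return lucky
```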

sk

Image credit: http://www.flickr.com/photos/gizzypooh/539662773/

Avoiding insights

I really don’t like using the word “insight”.

As I wrote here, the word is hideously overused. Rather than being reserved for hidden or complex knowledge, it is used to describe any observation, analysis or piece of intelligence.

And so I’ve avoided using it as much as possible. In an earlier tweet, I referred to the Mobile Insights Conference that I’ve booked to attend as the MRS Mobile thing. And I even apologised for my colleague (well, technically, employer) littering our Brandheld mobile internet presentation with the word.

But this is irrational. I shouldn’t avoid it, if it is the correct word to use. After all, substituting it for words like understanding, knowledge or evidence might be correct in some instances, but not all.

Does it really matter? After all, isn’t a word just a word? As someone once said, “What’s in a name? That which we call a rose by any other name would smell as sweet“.

But he’s talking complete rubbish. Because words do matter. They cloud our perceptions. It is why brands, and brand names, are so important. And why blind taste tests give different results to those that are open.

In fact, this emotional bond we have with words has undoubtedly contributed to my disdain. And this should stop. So I vow to start reusing the word insight, when it is appropriate.

But when is it appropriate? I’ve already said that an insight is hidden and complex, but then so is Thomas Pynchon and he is not an insight.

In the book Creating Market Insight by Drs Brian Smith and Paul Raspin, an insight is described as a form of knowledge. Knowledge itself is distinct from information and data:

  • Data is something that has no meaning
  • Information is data with meaning and description, and gives data its context
  • Knowledge is organised and structured, and draws upon multiple pieces of information

In some respects it is similar to the DIKW model that Neil Perkin recently talked about, with insight replacing wisdom.

However, in this model – which was created in reference to marketing strategy – an insight is a form of knowledge that conforms to the VRIO framework.

  • Valuable – it informs or enables actions that are valued. It is in relation to change rather than maintenance
  • Rare – it is not shared, or cannot be used, by competitors
  • Inimitable – it cannot be copied profitably within one planning cycle
  • Organisationally aligned – it can be acted upon within a reasonable amount of change

This form of knowledge operates across three dimensions. It can be

  • Narrow or broad
  • Continuous or discontinuous
  • Transient or lasting

How often do these factors apply to supposed insights? Are these amazing discoveries really rare and inimitable, and can they really create value with minimal need for change? Perhaps, but often not.

And Insight departments are either amazingly talented at uncovering these unique pieces of wisdom, or they are overselling their function somewhat.

When I’m analysing a piece of privately commissioned work, a finding could be considered rare and possibly inimitable (though it could be easily discovered independently, since we don’t use black box “magic formula” methodologies). But while it is hopefully interesting, it won’t always be valuable and actionable.

But if it is, I shall call it an insight.

sk

Image credit: http://www.flickr.com/photos/sea-turtle/2556613938/
