Data journalism, computer-assisted reporting and computational journalism: what’s the difference?

Is data journalism more networked and open than computer-assisted reporting (CAR) and computational journalism? The differences are examined in a journal article in Digital Journalism by Mark Coddington of the School of Journalism of the University of Texas at Austin. He has developed four dimensions in his typology, based on his analysis of about 90 texts (academic and professional) about these forms of ‘quantitative journalism’. The four dimensions, each of which he presents as a range between two opposing poles, are:

  1. professional expertise vs networked information — how far is it the limited domain of ‘professionals’ (linked also to the norms and practices of traditional ‘professional’ journalism) vs a more open, networked approach involving ‘non-professionals’;
  2. transparency vs opacity — how far does it disclose the processes, practice and/or product;
  3. targeted sampling vs big data — does it gather and analyse a sample (probably then relying on inference or causality to draw conclusions) or a more comprehensive data set or collection (probably emphasising exploratory analysis and correlation); and
  4. seeing the public as active vs passive — the first linked to a more participative, interactive vision of the public, and the second to a more traditional, passive conception.

Mark Coddington’s diagram provides a useful summary of this, and how he situates CAR, data journalism and computational journalism along these four dimensions:

Typology of data-driven journalism

How Mark Coddington characterises data journalism, CAR and computational journalism. From his paper: http://www.tandfonline.com/doi/abs/10.1080/21670811.2014.976400

In some ways, the main dividing line is between CAR and the other two. This is perhaps not surprising, given that CAR has been around much longer and so — almost inherently — is tied more closely to ‘traditional’ ideas of journalism. Data journalism and computational journalism, on this analysis, have more in common, but perhaps differ most clearly in two ways. Data journalism is characterised as more ‘open’ (transparent) than computational journalism, and as less ‘professional’ in its orientation — ie more networked and accessible to those who are not ‘professional journalists’. (Data journalism as the new punk, anyone?)

Most data journalists (plus CA reporters and computational journalists etc) are unlikely to be bothered by how their work is classified, as Mark Coddington notes — mentioning Adrian Holovaty’s “Is data journalism? — Who cares?” post. But it does matter to researchers. Why? Because, he explains, “these definitional questions are fundamental to analyzing these practices as sites of professional and cultural meaning, without which it is difficult for a coherent body of scholarship to be built”.

He adds that this is an initial attempt at classifying CAR, data journalism and computational journalism, in what is still an emerging and developing field. Also, his study relies heavily on research in the USA and Scandinavia. While much of his typology rings true to what I know of data journalism in the UK (and CAR and computational journalism, to a lesser extent), I wonder how far it might differ here, and indeed elsewhere.

My interest (apart from running an MA programme that includes data journalism) stems partly from having written about the development of data journalism in the UK in a chapter in Data Journalism: mapping the future. That is when I came to realise how far the emergence of data journalism in the UK drew on US journalism’s experience of CAR, trainers from the States etc — helped along by the arrival here of the Freedom of Information Act and the open data movement. I’ve also touched on this topic in discussion with a US journalist who said he saw no difference between CAR and data journalism.

Tweeting headlines for breaking news

Getting breaking news out quickly but also accurately has long been a key challenge for news journalism. Given the volume of news items that some news organisations publish on their Twitter feeds, and the time pressures involved — particularly for breaking news — it is perhaps surprising that more mistakes don’t occur.

This is one error that highlights what can go wrong — and also raises an issue about auto-tweeting published headlines. It came at the end of the trial (for fraud) of the former personal assistants of Nigella Lawson and Charles Saatchi. The Grillo sisters were found not guilty on 20 December 2013 — as the main Associated Press (AP) Twitter account accurately noted in its initial ‘BREAKING’ tweet.

Following up around 14 minutes later with a further tweet that linked to an AP story, however, it inadvertently cast Lawson and Saatchi as those cleared of fraud — in a case in which they had appeared only as witnesses (and they had not faced any charges):

The situation was particularly confusing because the story to which that inaccurate tweet linked was correct — as the Twitter card preview showed (below). So although the wording from that headline would have made for a less effective tweet than the first ‘BREAKING’ tweet, it would have been accurate.

The mistake was corrected about 20 minutes later:

In general, news tweets work best when written specifically for the medium rather than simply replicating headlines written for a website story, say. But this is a counter-example in which tweeting the headline from the web version (as some accounts are set up to do automatically) would have ensured it was at least accurate.

Are data journalism and online engagement coming of age?

It’s more complicated than a one-word answer, of course, but data and online community work (developing communities and engaging users) seem to be moving from niche ‘extras’ to core essentials in much of journalism.

The word ‘data’ has been creeping into advertisements for reporters. “Experience of data journalism” in a vacancy on Health Service Journal and Nursing Times, for example. A reporting role at Times Higher Education asked for “skills to handle large data sets to identify trends and spot stories, and the ability to use the data to create news graphics”.

Data journalism and social media are not only for specialists

My point is that these are not specialist “data journalist” roles: breaking news stories lies at the core of both jobs. My colleague Paul Bradshaw offers two reasons why every journalist should know about web-scraping, a key part of data journalism.

Similarly, using social media in reporting — to find stories and sources, for example — is now an accepted part of the skill-set for most journalists, I hope. At least for those now entering journalism.

It’s no surprise that The Huffington Post UK, online-only of course, expects that applicants “will already be utilising and fully understand the power of social media to promote content” for a blogs assistant editor role. But — as with data — social media and engaging users online seem increasingly to be an explicit element.

Channel 4 News advertised for a political correspondent who would “use social media to maximise the impact of your stories and engage with our audience”, for example. A junior writer on The Sun’s Fabulous Magazine online will be “helping to manage our strong community of Facebook and Twitter followers”. A reporter on Farmers Weekly will be “using social media, such as Twitter, Facebook and forums, to engage with readers”.

Again, these are not specialist social media or community roles – but jobs that require skills and experience in these areas.

Specialist jobs growing alongside ‘integrated’ roles

Fortunately for those coming into journalism, specialised roles appear to be thriving alongside those in which online community, social media (and/or data journalism) are ‘integrated’ into reporting or other roles. Engaging communities and building networks lie at the heart of a new Thomson Reuters project — with *nine* new jobs — for example. Metro has been recruiting for a social media executive as well as a head of insight and social.

This picture of specialised plus ‘integrated’ roles is reinforced by two other sources. First, discussions at the news:rewired event last month, where data journalism and online communities were key themes. Many people were there to learn how to do things better, and/or to benchmark their (or their publication’s) own activities.

Jobs in interactive journalism and online

Second, it’s an impression consistent with the jobs gained by students from the first year of our MA Interactive Journalism (at City University London). One is working as a data journalist at The Guardian, for example – while two others there are in content coordinator roles in which community and social media are part of a broader brief that includes writing, editing and commissioning. Others again have gone on to more specialised web analytics and social media work – as well as more ‘traditional’ journalism jobs, reporting on a regional paper and sub-editing for a national newspaper.

PS: Anyone unconvinced by the importance of mastering online/digital skills should look at some current job advertisements. A business reporter at The Telegraph will be managing the flow and placement of web content. An assistant features editor at The Sun will be “keen to adapt to digital platforms”. “An interest in digital publishing/social media would be an advantage” for a senior editor at The Economist group. And so on. [NB The job ads on Gorkana will be taken down at some point.]

It is also worth noting that data, multimedia and technology topped the list of skills in a survey about journalism training, undertaken by the Poynter Institute.

PPS: I have resisted expanding this post to take in another key area, mobile platforms (also a focus at news:rewired), where news organisations are expanding their activities. Nor have I mentioned the demand beyond journalism for people with a good grasp of data, social media engagement and online/digital skills more generally…

Missing bookmarks and links from your delicious network? Recover them using RSS

Delicious.com has killed its network — the social in social bookmarking — since its relaunch by AVOS. Well, put it in cold storage, at least. But you can revive it yourself — to some extent — thanks to the power of RSS.

The network still seems to be operating, and you can see the links that people in your network are tagging (a key feature, for me, of the ‘old’ delicious) by subscribing to the RSS feed for what used to be a page.

Use this format, replacing ‘username’ with your own delicious username:

http://feeds.delicious.com/v2/rss/network/username

That should pull in the last 20 links from your network. Subscribe to the RSS feed in Google Reader or another RSS feed reader, and it should keep you updated.
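
For anyone who would rather script this than rely on a feed reader, here is a minimal Python sketch of the same idea. The function names (`network_feed_url`, `parse_links`) and the sample XML are my own inventions for illustration, and the sketch assumes the feed is ordinary RSS 2.0 with `<item>` entries that each hold a `<title>` and a `<link>`; I have not verified exactly which fields delicious includes.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

# URL format taken from the post; 'username' is the placeholder to replace.
FEED_TEMPLATE = "http://feeds.delicious.com/v2/rss/network/{username}"


def network_feed_url(username):
    """Build the network feed address for a delicious username."""
    return FEED_TEMPLATE.format(username=quote(username, safe=""))


def parse_links(rss_text):
    """Return (title, link) pairs from RSS 2.0 text (assumed structure)."""
    root = ET.fromstring(rss_text)
    return [
        (item.findtext("title", default=""), item.findtext("link", default=""))
        for item in root.iter("item")
    ]


# Made-up sample feed, for illustration only (not real delicious output).
SAMPLE_RSS = """<rss version="2.0"><channel>
<title>Example network</title>
<item><title>First bookmark</title><link>http://example.com/a</link></item>
<item><title>Second bookmark</title><link>http://example.com/b</link></item>
</channel></rss>"""

if __name__ == "__main__":
    print(network_feed_url("username"))
    for title, link in parse_links(SAMPLE_RSS):
        print(title, link)
```

To try it against the live feed, you could fetch the address with `urllib.request.urlopen` and pass the decoded response to `parse_links`, assuming the feed is still being served.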

But AVOS/delicious — lots of people would still like the network functions back on the site SOON!

Refining Twitter: how to filter out (or search for) tweets by specific keywords — using Tweetdeck

Using Tweetdeck, you can hide tweets if they contain words you specify — and, conversely, set up filters like a search, to show only tweets containing specific keywords. There are two main ways of doing this and, on the day the iPad 2 goes on sale in the UK, I’m using ‘iPad’ as the keyword to filter out or (Apple fans, please note) search for.

Filter out anything you don’t want to see from Twitter

One way is to set a filter to affect everything in Tweetdeck; this applies to all columns and accounts. In the settings, look for the Global Filter menu — and type in the relevant word(s). You can also filter out tweets by people and source. Farewell those unwanted updates from Foursquare or Paper.li, perhaps.

To filter out tweets from all columns/accounts, use the Global Filter

The other, more selective way is to apply a filter to a chosen column — which you can also use as a ‘positive’ filter to show only tweets as specified.

Filter columns for specific words in Twitter

Look for the row of icons at the foot of the column you wish to filter or search, and click on the filter icon (an arrow curving down to a line). Using the default settings that then appear, you can type in a word or other text to exclude. To remove a filter, click the ‘x’ to the right.

Use the column filter to hide tweets

Use the filter to hide tweets containing specific words

Use column filters to find relevant tweets

Finally, the small drop-down menus in a column filter also allow you to search for tweets containing specific words or other text — simply change the minus sign to a plus. This ‘positive filter’ can be a useful shortcut, eg to hunt down a tweet you glimpsed and need to find again, or quickly to show particular tweets or only those with links (filter for ‘http’).

Use a column filter to show only specific tweets

You can also filter by name, source or time of tweets instead of text. The column filter provides additional flexibility when used with a search column, eg to remove (old-style) retweets from a search on a particular hashtag (filter out ‘RT @’).
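
The logic behind these column filters is simple enough to sketch in a few lines of Python. This is only an illustration of the idea, not how Tweetdeck itself is built: the `filter_tweets` function, its `mode` argument and the sample tweets are all invented for the example, and it assumes a case-insensitive match on the tweet text.

```python
def filter_tweets(tweets, text, mode="-"):
    """Mimic a column filter on a list of tweet strings.

    mode "-" hides tweets containing `text`; mode "+" shows only those.
    Matching is case-insensitive (an assumption made for this sketch).
    """
    if mode not in ("+", "-"):
        raise ValueError("mode must be '+' or '-'")
    needle = text.lower()
    keep_matches = mode == "+"
    return [t for t in tweets if (needle in t.lower()) == keep_matches]


# Invented sample tweets, for illustration only.
SAMPLE_TWEETS = [
    "Queues outside Apple stores as iPad 2 goes on sale",
    "RT @example: Budget reaction live blog",
    "New post on data journalism http://example.com/post",
]

if __name__ == "__main__":
    print(filter_tweets(SAMPLE_TWEETS, "iPad"))            # hide iPad tweets
    print(filter_tweets(SAMPLE_TWEETS, "http", mode="+"))  # only tweets with links
    print(filter_tweets(SAMPLE_TWEETS, "RT @"))            # drop old-style retweets
```

The three calls correspond to the examples in this post: hiding a keyword, showing only tweets with links, and removing old-style retweets from a column.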

Linking gets more specific at the New York Times: link to an individual paragraph or sentence

Users can now link to and highlight individual sentences and paragraphs in stories on the New York Times site, notes TNW Media:

“While it could be a tad complicated for an average reader, it’s a great tool for writers and bloggers who frequently link to NYTimes stories.
[…]
To simplify things, if you hit your shift key twice on a Times story, small icons appear next to every paragraph. Click on one of them and it’ll place the paragraph linked URL up in the address bar of your browser.

Using the Times’ new hyperlinking system might mean a little more work for the linker, but I like how it adds a new layer of specificity and clarity to a linked post. And it is definitely cool to see that the hyperlink is still evolving.”

iPad apps are our flagship newspaper products, says News Corp’s James Murdoch

James Murdoch highlights the revenue potential but also the risks of iPad apps, in an interview at the Monaco Media Forum: “Our flagship newspaper products are now the iPad apps,” Murdoch said, and they pose a greater risk. “The problem with the apps is they’re much more directly cannibalistic of the core print product than the web site.” He added, “People interact more. They don’t dip in and out. The key is to get the advertising yields” to be the same. Combine that with the lower production costs, and the business model for apps could be highly attractive.

Expensive, long-form journalism can be a hit online

Simplistic predictions about journalism and the internet are futile, and there’s evidence that good quality (more expensive), long-form writing attracts more hits online, says John Naughton in The Observer:

‘“Ah, yes,” say the sceptics, “but where’s the business model to support such expensive writing?” And here’s an interesting development. The online magazine Slate decided to allocate resources to encourage some journalists to produce long, long pieces – for example Tim Noah’s analysis of why there hasn’t been another 9/11-type attack. These pieces have attracted astonishing levels of reader attention, with page views in the 3-4 million range. And the editor of the New York Times magazine has made the same discovery. “Contrary to conventional wisdom,” he says, “it’s our longest pieces that attract the most online traffic.”’

Article: Good journalism will thrive, whatever the format | Technology | The Observer

Providing the information you didn’t know you wanted — Google CEO Eric Schmidt on newspapers, monetisation and the semantic web

Snippets from a Wall Street Journal interview with Schmidt:

Says Mr. Schmidt, a generation of powerful handheld devices is just around the corner that will be adept at surprising you with information that you didn’t know you wanted to know. “The thing that makes newspapers so fundamentally fascinating—that serendipity—can be calculated now. We can actually produce it electronically,” Mr. Schmidt says.[…]

On one thing, however, Google is willing to bet: “The only way the problem [of insufficient revenue for news gathering] is going to be solved is by increasing monetization, and the only way I know of to increase monetization is through targeted ads. That’s our business.”[…]

“As you go from the search box [to the next phase of Google], you really want to go from syntax to semantics, from what you typed to what you meant. And that’s basically the role of [Artificial Intelligence]. I think we will be the world leader in that for a long time.”