On posing a question, and finding the data

Almost everything generates a trail of numbers. Success rates, call out frequencies – if nothing else there’s always a money trail to follow. Sometimes figures need to be ferreted out, (I’ll come back to tailoring FOI requests later), but as I flexed my fingers and started hunting, it became clear that there are reams of data that have been made publicly available, from government and researchers.

A good starting point for Brits is the UK National Statistics release calendar.

I’d recommend at this point subscribing to a free RSS feed reader service, like Google Reader. Many pages will have a little orange RSS feed icon in the corner. (Click below for expanded image)

An RSS feed icon, earlier

For pages of data releases that don’t offer such feeds, you can often create a manual RSS feed using tools like page2RSS.
My current google reader bundle contains a couple of ready-made RSS feeds from the Oxford University Migration Observatory and the Metropolitan Police’s crime figures pages, and one which I created from the Home Office’s migration statistics page.

But whilst this method of subscribing to raw data feeds will certainly spark off ideas and keep you abreast of the sort of information that’s out there, it does leave you at the mercy of whatever information happens to trickle into your lap.

So, here’s the first key point I’ve picked up about data journalism thanks to the indomitable Paul Bradshaw: data is useless unless you use it to tell a story. Data is a tool, journalism is the point. Beautiful animations and infographics, whether hard-hitting or otherwise, are wonderful journalistic tools precisely because they can hit you with a story in the ten seconds it takes you to glance at the image. To tell a story with the same impact in ten seconds in print, you need a tabloid headline. Data can have the same rapid, punchy impact, without compromising itself.

This does mean that a more proactive data journalism process will begin a little like this:
1. Begin with a hypothesis, a question, or a possible story idea which you can test. Give it a time frame. E.g: How many independent bookshops have closed in London this year compared to last year and the year before? Which borough had the highest number of armed police call-outs this year? Or how much funding has been cut for people with disabilities under the coalition government?

2. Find the data to answer this question. This might mean sending an FOI request or three.  It might also mean amending your idea if you find the data tells a different story to the one expected.Much data is publicly available; some is generated by research bodies. Combine journalistic research methods – ring around and ask questions.

Let’s take my last suggested idea about funding cuts for people with disabilities. I could send out my own FoI requests but a quick trip to What Do They Know, a website logging FOI requests and their responses for public use, reveals that someone else has already done some of the legwork for me. Hurrah. Mr Paul Taylforth has sent a letter to 123 councils asking what their response has been to the government’s consultation on the closure of the Independent Living Fund.

In view of the data I’ve managed to find, my hypothesis or question may have to be tailored down. Since my question was about changes to funding for people with disabilities, I could ask how many councils don’t support the government’s plans to phase out the Independent Living Fund. One of my next posts will cover how I get on in attempting my first piece of data journalism, using this information and question.

For now, here are a few tips on tailoring Google searches and FOI requests to get the information you’re looking for.

A few tips for advanced google searches:

Google searches can be made to more efficiently search for data using filetype:xls  or filetype:pdf. Try adding

Try adding: site:gov.uk or police.uk or .uk or mod.uk or nhs.uk or ac.uk or org/org.uk (for charities and nonprofits), or sch.uk (most schools).

~ before a word will look for synonyms.

Clinical research groups tend to have “ccg” in the wider url. Search like so – inurl:ccg

Asterisks in a search can be replaced with anything. “revenue from * 2011” will bring up “revenue from mobile to 2011” or “revenue from gold in 2011

A few tips for Freedom of Information requests:

A helpful lecture from Heather Brooke, the Freedom of Information campaigner who was on the tail of MPs expenses long before the Telegraph broke the story, gave me a few pointers about getting the most from your FoI requests.

(Click to enlarge) An example of a good FOI request from Heather Brooke

1. Ask the organisation to send you the information in the format of an excel spreadsheet. It will make it much, much easier to deal with, and you’re within your rights to ask for this.

2. If they put you off with the excuse that it would cost too much to compile or photocopy and send you the information, ask to inspect the record. You’re within your rights to do this for most public bodies, from local government to LOCOG or the Metropolitan Police Authority. Here’s a good example of how to word a request asking for the information electronically first, but letting them know you’re happy to come in to inspect the record otherwise.

3. The cost limit for each FOI request is £600 for government department and £450 for all other public authorities – about 2.5 days work. This means that larger, more complex requests can easily take you above the threshold. One answer is to break your questions down – don’t ask more than a few in each request. Be specific in your questions and ask for the information with all the details – whilst leaving as little room as possible for the public body to say the information doesn’t exist in the form you’ve requested.

I sent an FOI request recently to the Health and Safety Inspectorate, asking for accident rates by year, for the last three years. Cue the standard response that the information didn’t exist in the format I’d requested and arranging it into that format would take them above the financial limit. I sent a record request making it clear that I was happy to receive the information in whatever format they held it, as long as it covered the period I’d outlined. I found that asking for the specific accident record for each incident uncovered a wealth of anecdotal evidence in addition.

4. Mention the 20-day statutory period within which you expect a response.


2 thoughts on “On posing a question, and finding the data

  1. Pingback: Winter break – using data in the newsroom | datacrumble

  2. Pingback: NHS waiting times story – Part 2: Pivot Tables | Learning how to play with Data

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s