Opening Personal Data at Open Data Camp


This weekend I had the pleasure of attending Open Data Camp 9 in Manchester. Sadly I was only able to make Sunday, and by the sounds of it missed some great discussions on Saturday. I should open by thanking everyone I had lovely conversations with, and who made me feel so welcome at my very first unconference.

In the last session of the day, Terence Eden led a talk on publishing personal data safely, the types of personal data and the extent to which we publish them. This naturally started with posting personal news and information to social media, but grew to encompass topics like health data, fitness trackers and the privacy implications of publishing household or environmental data such as solar panel output or air quality readings. Terence has published details of his dental CT scans on his blog. It’s not entirely clear what bad thing someone could do with this (although Terence had a bite at it!), but the idea of publishing intimate health data was one that instinctively felt a bit odd.

Very good points were made about the distinction between publishing for friends and family and publishing for the general public - and the risk of something in a “private” group being leaked more widely or scraped by a crawler. Timing can matter too - for instance, a data feed showing low electricity usage could be an indicator that your house is empty. By releasing data in arrears, the risk of it being used by burglars is reduced - the same principle as sharing holiday photos when you get home, not during the holiday.
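
A minimal sketch of the “in arrears” idea, assuming readings are held as (timestamp, value) pairs. The function name, the seven-day embargo and the sample figures are all made up for illustration; the right delay depends on the data and the risk.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical embargo period: nothing newer than this is released.
EMBARGO = timedelta(days=7)


def publishable(readings, now, embargo=EMBARGO):
    """Return only the readings that are at least `embargo` old."""
    cutoff = now - embargo
    return [(ts, value) for ts, value in readings if ts <= cutoff]


# Illustrative daily electricity usage in kWh (invented numbers).
now = datetime(2024, 7, 15, 12, 0, tzinfo=timezone.utc)
readings = [
    (datetime(2024, 7, 1, 12, 0, tzinfo=timezone.utc), 9.4),
    (datetime(2024, 7, 7, 12, 0, tzinfo=timezone.utc), 8.7),
    (datetime(2024, 7, 14, 12, 0, tzinfo=timezone.utc), 0.6),  # house empty?
]

# Only the two older readings pass the embargo; yesterday's is held back.
released = publishable(readings, now)
```

The recent low reading, the one that might hint at an empty house, stays unpublished until it is a week old.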

At some point someone mentioned Samuel Pepys’ diary, and how it provides uncommonly detailed insight into day-to-day life in the 1600s (albeit through the lens of a relatively wealthy individual). I made the observation that with local news on its knees, maybe publishing personal information is something of long-term value to society, or at least to anthropologists - if it is preserved in some fashion for posterity (and of course, publishing doesn’t have to be online either).

When researching the history of Rugeley Rifle Club, I found that at times the local paper reported the club in almost excruciating detail - not just results or club events, but even individual scores in postal leagues - which even I as a shooter don’t care about! Terence observed that this might be a case of “of immediate short term interest, then nobody cares for a century, then historians find it really useful”.

Local newspaper column listing scores from a rifle club's weekly league match.
Rugeley Times 1982, reporting scores but not whether they won the round!

(As an aside, I was pleasantly surprised to learn from Sweyn Hunter that The Orcadian is still going strong, and read by a majority of the population on Orkney. Shout out to Sweyn and Pauline, who were the first people I spoke to after I arrived rather early on Sunday, and who could not have been more welcoming.)

Anyway, on the drive home I was mulling this over and concluded that maybe the “Local News Test” is a sensible standard for “what should I post on social media or publish to the internet?”.

Put simply - how would I feel if the post (or dataset/project) were featured in the local paper?

For personal data, that seems like a very sensible line to draw, and seems to work for most situations I can think of.

Shooting competition results? Yes, sure.

Births, deaths and marriages? Those are a staple - and publicly recorded in civil registers.

But would I want baby photos to be featured on a weekly basis? Intimate family photos? Probably not. That would be oversharing.

For quasi-personal data - e.g. environmental or air quality data - it also seems to work, albeit possibly in arrears where privacy concerns (like a house being empty) could come into play. I do contribute light pollution readings to projects such as Globe at Night, and via the Dark Sky Meter app, in something close to real time (though they don’t appear in the public datasets immediately).

This isn’t a particularly novel idea. It sits in much the same vein as the business rule “Would you want that email read out in court?”. But it is a new framing of the issue that I hadn’t considered before.

Thanks again to Terence. I look forward to ODC10 next year, and encourage anyone contemplating dipping their toe in the water to take the leap.