Filtering fake news

YouTube identifies music and video based on an internal system called ‘ContentID’. Google, Apple and many others have systems for recognising related images (you can use one of them directly within Google image search, by uploading an image to search against, or you can ask your iPhone to show you pictures of trees). I don’t wish to suggest that ‘finding things like an arbitrary image or video’ is a solved problem, but it’s clearly at least partially addressed.

Meanwhile, Snopes does an excellent job of checking and verifying (or debunking) stories which are doing the rounds of social media. PolitiFact won a Pulitzer. A round-up of fact-checking sites by The Daily Dot adds FactCheck.org, Media Matters, and others.

So… suppose you’re Facebook, looking at the wasteland over which you preside. Wouldn’t you want to do something like:

  1. Parse the message a user is about to post, looking for links or embedded media and extracting some sort of ID metric for that object.
  2. Check that content key against a modest number of sources, querying for a coarse trust score.
  3. Reflect that score back to the user prior to publication, with a link to the source article. For example: “You’re about to republish this image. Snopes thinks it’s likely a fake. Read more here [link]”.
  4. Allow the user to publish anyway, should they so choose.
  5. Perhaps also (and optionally) badge likely-fake items which appear in the user’s feed.
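
To make that concrete, here’s a rough sketch of steps 1 to 4 in Python. Everything in it is an assumption made for illustration: the `Verdict` record, the 0-to-1 trust score, the 0.3 threshold, the ‘ExampleChecker’ source and the example.com URLs are all invented, and hashing the URL is only a stand-in for the kind of media fingerprinting a real system would need. A fact-checker’s dataset is modelled as a plain dictionary rather than a real API.

```python
import hashlib
import re
from dataclasses import dataclass
from typing import Dict, List

# Rough pattern for spotting links in a post (illustrative, not exhaustive).
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


@dataclass
class Verdict:
    source: str   # hypothetical fact-checker name
    score: float  # assumed scale: 0.0 = almost certainly fake, 1.0 = checks out
    article: str  # link to the fact-checker's write-up


def extract_links(message: str) -> List[str]:
    """Step 1a: find links, trimming punctuation that trails a URL in prose."""
    return [url.rstrip(".,;:!?)") for url in URL_PATTERN.findall(message)]


def content_key(url: str) -> str:
    """Step 1b: reduce a link to a stable key.

    Assumption: a SHA-256 of the URL stands in for a proper media
    fingerprint, which would also catch re-uploads and crops.
    """
    return hashlib.sha256(url.encode("utf-8")).hexdigest()


def lookup(key: str, sources: List[Dict[str, Verdict]]) -> List[Verdict]:
    """Step 2: check the key against a modest number of sources."""
    return [source[key] for source in sources if key in source]


def pre_publish_warnings(
    message: str,
    sources: List[Dict[str, Verdict]],
    threshold: float = 0.3,  # assumed cut-off for "likely a fake"
) -> List[str]:
    """Step 3: build the warnings to show before publication.

    Step 4 is implicit: this only returns text and never blocks the post.
    """
    warnings = []
    for url in extract_links(message):
        for verdict in lookup(content_key(url), sources):
            if verdict.score < threshold:
                warnings.append(
                    f"You're about to republish {url}. "
                    f"{verdict.source} thinks it's likely a fake. "
                    f"Read more here: {verdict.article}"
                )
    return warnings


# A made-up dataset from a made-up fact-checker.
EXAMPLE_SOURCE = {
    content_key("https://example.com/shark-on-motorway.jpg"): Verdict(
        source="ExampleChecker",
        score=0.05,
        article="https://factcheck.example/shark-on-motorway",
    ),
}

if __name__ == "__main__":
    post = "Look at this! https://example.com/shark-on-motorway.jpg."
    for warning in pre_publish_warnings(post, [EXAMPLE_SOURCE]):
        print(warning)
```

Step 5 would be the same lookup run over items in the feed instead of the outgoing post. Passing `sources` in as a list is deliberate: which datasets get consulted stays a choice rather than something baked in.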

Would this open up a writhing pit of snakes about authority, editorial judgement and censorship? Sure. But Facebook and Twitter are already writhing snake pits. It’s surely not beyond the wit of company execs to present this sort of approach as providing tools for users, and anyway, they already do most of what I’m suggesting: post a commercial audio recording, and YouTube or Facebook will flag it as such and (in the former’s case, at least) divert advertising revenue to the copyright holder.

That is: similar systems are already in place to protect copyright holders. What I’m asking here is for some of the same sorts of tools to be surfaced in the interests of asserting and maintaining moral rights. Such as my moral right not to be subjected to an endless stream of recycled crap, or our collective moral right not to accidentally render ourselves extinct as a population by doing something profoundly stupid just because somebody worked out how to make (transitory, as it turned out) money out of the process.

Put it this way: I think most of the people I follow would check their posts for validity, if only it was easy for them. So let’s do the easy bit.

The hard part, as best I can tell, is funding Snopes et al. to maintain the necessary APIs. It’s in music publishers’ interests to maintain databases of the songs over which they claim rights, because there’s a revenue stream to be had from the playing of those tracks. But… oh wait! Facebook is raking in advertising revenue. Ding!

In the end, the question boils down to: how much money is Facebook willing to spend on cleaning up their system? Their current dead tree media buy is meaningless unless they’re actually building tools which help drain the swamp they’ve created. The objective here shouldn’t be rebuilding our trust in Facebook; it should be providing the tools which help us trust the media we’re seeing on a continuous basis.

I don’t think one can do that by asserting what’s ‘trustworthy’: there are too many value judgements involved. But one could provide access to datasets of what’s clearly bobbins – even for conflicting values of bobbins – and tools to apply those to our media streams.

I’ll trust Facebook when they give me tools to recognise and deal with the problem of fake news, not when they stick a poster on my bus stop asserting how much they care about the issue.