Fake News, Conspiracy Theories, and Facebook
It is likely that fake news sites played a role in the recent US election. Web companies, particularly Facebook, have come under tremendous pressure to curb fake news. It seems disingenuous for Facebook to claim that it is not a media company and hence cannot take any responsibility for content posted on its site.
Yet the solutions many have proposed are draconian: we do not want Facebook or other private-sector titans deciding for us what news we are allowed to see. Furthermore, many proposed solutions are impractical: there is no chance we will build a perfect fake news filter in the foreseeable future. To see this intuitively, consider the much easier problem of filtering email spam: such a filter has to separate spam (fake email messages) from real email intended for you. Such filters are in wide use today and do a pretty good job, but even after years of tuning they are far from perfect. Some spam makes its way into my inbox, and some real emails get diverted to the spam folder. Similar technology can be used to identify fake news, but the problem is harder, with fewer clues: there isn't an explicit recipient, and there usually isn't a desired action (e.g., clicking a phishing link or purchasing a male enhancement product). Indeed, people have already built such classifiers. I haven't had a chance to test any yet, but they cannot possibly approach the quality of today's commercial spam filters.
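To make the spam-filter analogy concrete, here is a minimal sketch of what such a classifier might look like, using the same basic recipe (TF-IDF features plus naive Bayes) that underpins many spam filters. The training headlines and labels below are invented placeholders; a real system would need large, carefully labeled corpora, and as argued above would still fall short of spam-filter accuracy.

```python
# A minimal sketch of a fake-news classifier, in the spirit of a spam filter.
# The training headlines and labels are hypothetical placeholders; a real
# system would need large, carefully labeled corpora.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

train_texts = [
    "Celebrity endorses miracle cure, doctors furious",   # fake
    "Senate passes budget bill after lengthy debate",     # real
    "Secret memo proves moon landing was staged",         # fake
    "Local council approves new public transit funding",  # real
]
train_labels = ["fake", "real", "fake", "real"]

# TF-IDF features + naive Bayes: the same basic recipe as email spam filtering.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), MultinomialNB())
model.fit(train_texts, train_labels)

print(model.predict(["Shocking proof that vaccines contain microchips"]))
```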
One promising approach is not to classify news items directly, but rather to classify sources. You may have no way of judging on its own merits whether a statement I make is true. But if you have known me for a long time, and you know from experience that I am usually truthful, then you will believe my statement because I said it. We often have much more information about a source than about an individual story, and can therefore classify it more accurately. New sources will have to build up credibility before they are believed, and we will have to create mechanisms for them to do so. If we make building credibility too onerous, new sources can never gain credibility and become important; but if we make it too easy, fraudsters will just repeatedly create new sources, do their damage, and move on. Devising well-balanced mechanisms is a great challenge for the tech giants to take on, and they have experience doing this in other spheres, such as user-submitted product and travel reviews.
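To illustrate, here is a toy sketch of one way source-level credibility might be scored. The smoothing scheme, the prior_strength parameter, and the idea of a feed of verified/debunked stories are my own assumptions for illustration, not any platform's actual mechanism.

```python
# A toy sketch of source-level credibility scoring. The smoothing constants
# and the verification counts are assumptions, not any platform's real scheme.
from dataclasses import dataclass

@dataclass
class Source:
    name: str
    verified_true: int = 0   # stories later confirmed accurate
    verified_false: int = 0  # stories later debunked

    def credibility(self, prior_strength: int = 10) -> float:
        # Laplace-style smoothing: a new source starts near 0.5 and must
        # accumulate a track record before its score moves far in either
        # direction. Raising prior_strength makes credibility slower to earn,
        # deterring throwaway fraud sources but also slowing honest
        # newcomers -- exactly the balance described in the text.
        return (self.verified_true + prior_strength / 2) / (
            self.verified_true + self.verified_false + prior_strength
        )

veteran = Source("established-paper", verified_true=900, verified_false=30)
newcomer = Source("week-old-site", verified_true=3, verified_false=0)
print(f"{veteran.credibility():.2f}")   # ~0.96
print(f"{newcomer.credibility():.2f}")  # ~0.62, despite a perfect record
```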
Unfortunately, even the most honest among us will fall prey to falsehoods from time to time. Even if you correctly believe me to be trustworthy, I could be genuinely mistaken about some fact. For example, some years ago the entire country was taken in by fake news about non-existent weapons of mass destruction in Iraq. You could read about it in the most reliable of sources, such as the New York Times. It is unlikely that a smart algorithm would have fixed this problem.
Even worse, we have errors of omission, biased reporting, unjustified extrapolation from an incident reported truthfully, and so on. Each of these errors can be just as troubling as outright fake news. Reputable news sources usually try hard to avoid such errors; if they fail, they have a reputation to lose. Less reputable sources have less to lose. It may be possible to algorithmically detect lack of balance in reporting or unsupported conclusions, but that takes algorithmic labeling far beyond identifying outright falsehoods. It will be even more difficult to do right, and the consequences of getting it wrong are potentially greater.
The bottom line is that conspiracy theories have been around forever. The web did not invent fake news or rumors; their outsized impact today comes from the web accelerating their spread. One man in Texas, with only 40 followers on Twitter, was able to start a rumor that protesters were being bused to Trump rallies by the Clinton/Democratic organization. (He had actually seen buses for an unrelated event organized by Tableau Software, which happened to be not far from the Trump rally.) Enough people wanted to believe this story that it was seen by hundreds of thousands (perhaps even millions) before it was debunked. Without technological acceleration, this one misinformed person could not have done nearly as much damage.
What this suggests is that technology can also be used to damp the accelerated spread of such stories. If we can slow their spread, we provide enough time for verification and response. There are many technical options for doing this, and plenty of room for innovation. What we need to combat fake news is speed bumps, not censorship.
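As one illustration of the idea, here is a toy sketch of a speed bump that delays the propagation of shares of an unverified story once it spreads past a threshold. The thresholds, the delay schedule, and the notion of a "verified" flag are all invented for illustration.

```python
# A toy illustration of a "speed bump": hold back shares of an unverified
# story once it starts spreading fast, buying time for fact-checking.
# All thresholds and the delay schedule here are invented.

def propagation_delay_seconds(share_count: int, verified: bool,
                              free_shares: int = 100,
                              max_delay: float = 3600.0) -> float:
    """Return how long to hold a new share before it appears in feeds."""
    if verified or share_count <= free_shares:
        return 0.0  # verified stories and slow-moving ones flow freely
    # Delay grows with how far past the threshold the story has spread,
    # capped at max_delay; the exact schedule is a design knob.
    excess = share_count - free_shares
    return min(max_delay, float(excess))

# A story with 100 shares spreads instantly; an unverified story at
# 5,000 shares is held back, buying time for verification or debunking.
print(propagation_delay_seconds(100, verified=False))   # 0.0
print(propagation_delay_seconds(5000, verified=False))  # 3600.0
```

The key design choice is that nothing is ever blocked outright: every share eventually goes through, so this is a brake rather than a censor, which is precisely the distinction drawn above.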