Can AI solve fake news?

Caspar Pagel
3 min read · Oct 10, 2022

It’s no secret that fake news and disinformation are threatening our society. How is any discourse possible when we can’t agree on a baseline of truth?

With the scale of today's (social) media, it’s impossible to keep track of everything manually. Clearly, the only way to solve fake news is through automation.

In the world of AI, fake news detection has been a hot topic for a few years now. While the technologies are advancing, there are still some critical challenges to overcome.

The context problem

Imagine looking at a tweet completely isolated, without any knowledge of the world.
Can you tell if it’s real? No. It’s just a bunch of phrases, words, letters.

Any single piece of information means nothing without context. That’s why we can’t simply train a machine learning model on a bunch of news labelled as real or fake.
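
Just to illustrate, the naive setup would look something like this minimal sketch with scikit-learn. The headlines and labels are made up; any real dataset has the same blind spot:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy, hypothetical data: the labels are the only "knowledge" the model gets.
headlines = [
    "NASA confirms water ice on the Moon",
    "Celebrity secretly replaced by body double",
]
labels = [1, 0]  # 1 = real, 0 = fake

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(headlines, labels)

# The model only ever sees word statistics, never the world the words
# describe, so it has no way to check a claim against reality.
print(model.predict(["Water ice found on the Moon"]))
```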

One way to incorporate context is through stance detection. Here, we compare a claim, like a tweet, with another text. The model should predict the stance of the text towards the claim: Does it agree or disagree?
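
As a rough sketch, stance detection can be approximated with a pretrained natural language inference (NLI) model, which predicts whether one text entails, contradicts, or is neutral towards another. Assuming the Hugging Face transformers library and the publicly available roberta-large-mnli model, it might look like this:

```python
from transformers import pipeline

# An NLI model is a close stand-in for stance detection: given a
# premise/hypothesis pair, it predicts entailment, contradiction, or neutral.
nli = pipeline("text-classification", model="roberta-large-mnli")

claim = "The city banned cars from its center this week."
article = "Officials confirmed the new car ban in the city center on Monday."

# The text-classification pipeline accepts the pair as a dict.
result = nli({"text": article, "text_pair": claim})
print(result)  # e.g. {'label': 'ENTAILMENT', 'score': 0.98}
```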

If the text, the context, agrees with our claim, we’re a step closer to the truth.
But how is the model supposed to know what makes texts agree or disagree with each other?

From statistics to understanding

There are different approaches to comparing bodies of text.

First of all, there are simple statistics: we count all the words and calculate which ones are most important.
When both texts seem to have similar important words, we say that they agree with each other.

This can lead to false assumptions. Comparing the words in “Tacos taste very good” and “Tacos don’t taste very good”, there are a lot of similarities.

Depending on our specific model, it might say that these claims agree, because it doesn’t really consider how language works.
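
Here's a quick sketch of that failure mode, using TF-IDF vectors and cosine similarity from scikit-learn:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

texts = ["Tacos taste very good", "Tacos don't taste very good"]
vectors = TfidfVectorizer().fit_transform(texts)

# The two claims contradict each other, yet their word statistics
# overlap almost completely, so the similarity score comes out high.
score = cosine_similarity(vectors[0], vectors[1])[0, 0]
print(f"cosine similarity: {score:.2f}")
```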

Meaning: To understand context across texts, we first have to understand context within a single text.

In reality, we still don’t know where the border between statistics and text understanding lies.
Looking at recent models like GPT-3, I’m optimistic that, at some point, it won’t matter whether a model truly understands text or just uses highly advanced statistics.

And there already are some pretty good results, even with relatively simple methods. It’s just much harder to apply these in the chaotic, unsterile environment that is the internet.

Multimedia irony

Now I’ve talked a lot about text, when in reality information is shared through all sorts of media. In fact, the trend goes towards simple graphics and short videos.
These are much harder to evaluate.

Additionally, there often are links to other sites. Adding sources to your claims is generally a good thing to do, but it adds another layer of complexity to the task of fake news detection.
Ironic, isn’t it?

Final thoughts

Detecting fake news with AI isn’t easy, especially because we often can’t tell real from fake ourselves.
There are many problems, like the ones I’ve mentioned above. Luckily, they appear solvable. For the most part, it just takes a ton of data and computing power.
It’s important that social media platforms and computing companies work together to provide these resources to researchers.

Thanks for reading and stay curious!

