How to Fact-Check using Natural Language Processing Techniques? A Literature Review

In this article, we summarize our research in the field of fact-checking. We group the work into two categories: published closed-source applications, and research projects in this field.

Closed Source


Their methodology depends on human annotators, who fact-check a news item and present a detailed report on the inaccuracies in the article.

Reporters’ Lab

Their methodology depends on human annotators as well, and dataset can be found in and


Their methodology builds a fully automated fact checker, but no details are provided about the model and the dataset.

Research Projects

Automatic Identification and Verification of Political Claims


The model combines convolutional neural networks and support vector machines. To gather information that supports or refutes a claim, they retrieved a number of snippets by querying Google. They did not select keywords but queried the search engine with the full text of the claim. The text of the claim and of the most similar retrieved supporting texts were then fed into their model.

Another model mentioned in the paper is a random forest. In this case, both the Google and the Bing search engines were used, each retrieving five snippets for a query consisting of the full claim. For each of the ten retrieved snippets, three features were computed:

1- the similarity between the claim and the snippet, calculated using word2vec embeddings,

2- the similarity between the claim and the snippet, calculated over the tokens, and

3- the Alexa rank of the website.

These features were also combined, considering their mean and standard deviation.
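The three per-snippet features and their aggregation can be sketched roughly as follows. This is only an illustration under several assumptions: the tiny embedding table stands in for real word2vec vectors, Jaccard overlap is our guess at a token-level similarity (the summary does not name the measure), and the snippets and Alexa ranks are invented.

```python
import math
import statistics

# Toy 3-dimensional vectors standing in for word2vec embeddings (invented values).
TOY_EMBEDDINGS = {
    "taxes": [0.9, 0.1, 0.0],
    "rose": [0.1, 0.8, 0.1],
    "fell": [0.1, -0.7, 0.2],
    "last": [0.0, 0.1, 0.9],
    "year": [0.0, 0.2, 0.8],
}

def tokenize(text):
    return text.lower().split()

def average_vector(tokens):
    """Average the embeddings of known tokens (zero vector if none are known)."""
    vectors = [TOY_EMBEDDINGS[t] for t in tokens if t in TOY_EMBEDDINGS]
    if not vectors:
        return [0.0, 0.0, 0.0]
    return [sum(dim) / len(vectors) for dim in zip(*vectors)]

def cosine(u, v):
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)

def jaccard(a, b):
    """Token-set overlap; an assumed stand-in for the token-level similarity."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def snippet_features(claim, snippet, alexa_rank):
    c, s = tokenize(claim), tokenize(snippet)
    return [cosine(average_vector(c), average_vector(s)), jaccard(c, s), float(alexa_rank)]

def aggregate(per_snippet):
    """Mean and (population) standard deviation of each feature across snippets."""
    combined = []
    for column in zip(*per_snippet):
        combined += [statistics.mean(column), statistics.pstdev(column)]
    return combined

claim = "taxes rose last year"
# Hypothetical snippets with made-up Alexa ranks.
snippets = [("taxes rose last year", 120), ("taxes fell last year", 4500)]
rows = [snippet_features(claim, text, rank) for text, rank in snippets]
features = aggregate(rows)
```

The resulting `features` vector (mean and standard deviation for each of the three features) is the kind of input a random forest classifier could then be trained on.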

The third method also retrieved supporting documents from the Web; in this case, they went further in trying to find the relevant fragments within the retrieved documents. Rather than using all the contents, they first compute the similarity between the claim and each sentence in the document and then they select those that pass a given threshold. The features for the supervised model are aggregations of the ones computed for each claim–sentence pair and include the stance of the sentence with respect to the claim and the degree of contradiction between the claim and the sentence, calculated at the term level.
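The sentence-selection step can be illustrated with a small sketch. The similarity function here (the share of claim tokens found in the sentence), the threshold of 0.5, and the regex-based sentence splitter are all assumptions made for the example; the summary does not say which measure or threshold the authors used.

```python
import re

def tokens(text):
    return set(re.findall(r"\w+", text.lower()))

def overlap(claim, sentence):
    """Fraction of claim tokens that also appear in the sentence (assumed measure)."""
    claim_tokens = tokens(claim)
    if not claim_tokens:
        return 0.0
    return len(claim_tokens & tokens(sentence)) / len(claim_tokens)

def select_sentences(claim, document, threshold=0.5):
    """Keep only the sentences whose similarity to the claim reaches the threshold."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    scored = [(overlap(claim, s), s) for s in sentences]
    return [(score, s) for score, s in scored if score >= threshold]

claim = "Unemployment fell to five percent"
document = (
    "The weather was mild this spring. "
    "Official figures show unemployment fell to four percent. "
    "Analysts expect little change."
)
selected = select_sentences(claim, document)
```

Only the second sentence survives the filter here. In the approach described above, features such as stance and term-level contradiction would then be computed for each surviving claim–sentence pair and aggregated.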

The final method opted for an attention-based bidirectional long short-term memory network. Different from the previous approaches, in this case, no external information (e.g., no supporting documents) was used at all. Only the embedding representations of the claim itself were considered.

The best accuracy reported in this paper was achieved by the first method.


They produced the corpus CT-FCC-18 that includes claims from the 2016 US Presidential campaign, political speeches and a number of isolated claims. In order to derive the annotation, they used the publicly-available analysis carried out by

ClaimRank: Detecting Check-Worthy Claims in Arabic and English


To rank the English claims, they used a neural network with two hidden layers. The input to the network is a set of features that describe not only the claim but also its context. The input layer is followed by the first hidden layer, which is composed of two hundred ReLU neurons. The second hidden layer contains fifty neurons with the same ReLU activation function. Finally, there is a sigmoid unit, which classifies the sentence as check-worthy or not. Apart from the class prediction, they also need to rank the claims by how likely they are to be check-worthy. For this, they use the probability that the model assigns to the claim belonging to the positive class. They train the model for 100 iterations using stochastic gradient descent.
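A forward pass through a network of this shape, followed by ranking, might look like the sketch below. The layer sizes (200 and 50 ReLU units, one sigmoid output) come from the description above; everything else is assumed for illustration: the weights are random rather than trained, the number of input features (`N_FEATURES`) is hypothetical, and the feature vectors are random placeholders.

```python
import math
import random

def make_layer(n_in, n_out, rng):
    """Random weights and zero biases; a trained model would learn these with SGD."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def dense(x, layer):
    weights, biases = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]

def relu(values):
    return [max(0.0, v) for v in values]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def check_worthiness(x, layers):
    h1 = relu(dense(x, layers[0]))           # first hidden layer: 200 ReLU units
    h2 = relu(dense(h1, layers[1]))          # second hidden layer: 50 ReLU units
    return sigmoid(dense(h2, layers[2])[0])  # single sigmoid output unit

rng = random.Random(0)
N_FEATURES = 10  # hypothetical feature count, not taken from the paper
layers = [
    make_layer(N_FEATURES, 200, rng),
    make_layer(200, 50, rng),
    make_layer(50, 1, rng),
]

sentences = ["sentence A", "sentence B", "sentence C"]
# Placeholder feature vectors; real ones would encode the claim and its context.
feature_vectors = [[rng.random() for _ in range(N_FEATURES)] for _ in sentences]
scores = [check_worthiness(x, layers) for x in feature_vectors]

# Rank sentences by the probability of the positive (check-worthy) class.
ranking = sorted(zip(sentences, scores), key=lambda pair: pair[1], reverse=True)
```

Using the sigmoid output directly as the ranking score is what lets a single model serve both purposes: thresholding it gives the class prediction, and sorting by it gives the ranked list.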

To support Arabic claims, they first had to add a language detector in order to use the appropriate sentence tokenizer for each language. For English, NLTK’s sent_tokenize handles splitting the text into sentences. For Arabic, however, it can only split text based on the presence of the period (.) character. Next comes tokenization. For English, they used NLTK’s tokenizer (Bird et al., 2009), while for Arabic they used Farasa’s segmenter. They also needed a part-of-speech (POS) tagger: they used Farasa for Arabic and NLTK’s POS tagger for English.
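The idea of detecting the language first and then dispatching to a language-specific sentence splitter can be sketched as below. This is not the ClaimRank code: the script-ratio detector is an assumed stand-in for whatever language detector they used, `split_english` is a naive regex replacement for NLTK's sent_tokenize, and `split_arabic` simply mimics the period-only splitting described above. Farasa is not used here.

```python
import re

def is_arabic(text, min_ratio=0.5):
    """Guess the language from the share of letters in the basic Arabic Unicode block."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return False
    arabic = sum(1 for ch in letters if "\u0600" <= ch <= "\u06FF")
    return arabic / len(letters) >= min_ratio

def split_arabic(text):
    # Period-only splitting, mirroring the limitation described in the text.
    return [s.strip() for s in text.split(".") if s.strip()]

def split_english(text):
    # Naive stand-in for NLTK's sent_tokenize: split after ., ! or ? plus whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def split_sentences(text):
    """Pick the sentence splitter that matches the detected language."""
    return split_arabic(text) if is_arabic(text) else split_english(text)

english_sentences = split_sentences("We cut taxes. Did they? Yes!")
arabic_sentences = split_sentences("هذا ادعاء. هذا ادعاء آخر.")
```

The same dispatch pattern would extend to the later steps, choosing a tokenizer and POS tagger per language once the language is known.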


The run-time model is trained on seven English political debates and on the Arabic translations of two of the English debates. For evaluation purposes, they needed to reserve some data for testing, and thus the model is trained on five English debates and tested on the other two (either the original English or their Arabic translations).



1- Snopes fact checker

2- Reporters lab fact checker

3- Full Fact fact checker

4- Lab on Automatic Identification and Verification of Political Claims

5- Detecting Check-Worthy Claims in Arabic and English
