TollisK/Fake-news-detection-Data-Mining

Fake news detection

Implements a fake news detection program using classifiers, built for the Data Mining course at UoA.

Description

The project classifies the text of news articles, specifically to detect fake news. The dataset consists of two files in CSV format (Fake.csv, True.csv).

Data Preprocessing

After dropping every row that contained null values, punctuation was removed and all letters were converted to a uniform case.
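The preprocessing step can be sketched with the Python standard library alone. This is a minimal illustration, not the project's actual code: the sample rows, the column names `title` and `text`, and the choice of lowercase as the uniform case are all assumptions.

```python
import string

# Hypothetical rows standing in for records read from Fake.csv / True.csv.
# The column names "title" and "text" are assumptions.
rows = [
    {"title": "Breaking: Senate VOTES!", "text": "The bill passed, 52-48."},
    {"title": "Empty row", "text": None},
]

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def clean(text):
    """Strip ASCII punctuation and lowercase the text (lowercase is assumed)."""
    return text.translate(_PUNCT_TABLE).lower()


def preprocess(rows):
    """Drop rows with any null field, then clean every remaining field."""
    complete = [r for r in rows if all(v is not None for v in r.values())]
    return [{k: clean(v) for k, v in r.items()} for r in complete]


cleaned = preprocess(rows)
```

Nulls are dropped first so that `clean` only ever sees strings, which matches the order described above.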

Feature Extraction

To analyse the preprocessed data, it has to be represented in a numeric format using:

  • Bag of Words - one of the simplest text representation approaches, based on word counts.
  • TF-IDF - a bag-of-words variant that reweights each word count by how rare the word is across documents.
  • Word vectors from a Word2Vec model, combined to create a vector representation for a sentence.
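The three representations can be illustrated on a toy corpus. The sketch below is an assumption-laden illustration rather than the project's code: the two documents are made up, the TF-IDF variant is the plain `tf * log(N / df)` form, the sentence vector is formed by averaging word vectors (the source does not say how vectors are combined), and `toy_vectors` is a hypothetical stand-in for a trained Word2Vec model.

```python
import math
from collections import Counter

# Made-up, already preprocessed documents.
docs = [
    "the senate passed the bill",
    "aliens passed the senate",
]


def bag_of_words(doc, vocab):
    """Count how often each vocabulary word occurs in the document."""
    counts = Counter(doc.split())
    return [counts[w] for w in vocab]


def tfidf(docs, vocab):
    """Plain TF-IDF: term frequency times log(N / document frequency)."""
    n = len(docs)
    tokenized = [d.split() for d in docs]
    df = {w: sum(w in toks for toks in tokenized) for w in vocab}
    idf = {w: math.log(n / df[w]) for w in vocab}
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        rows.append([counts[w] / len(toks) * idf[w] for w in vocab])
    return rows


def sentence_vector(doc, word_vectors):
    """Average the vectors of known words (averaging is an assumption)."""
    vecs = [word_vectors[w] for w in doc.split() if w in word_vectors]
    if not vecs:
        return []
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


# Vocabulary built from the corpus itself, so every word has df >= 1.
vocab = sorted({w for d in docs for w in d.split()})
bow = [bag_of_words(d, vocab) for d in docs]
weights = tfidf(docs, vocab)

# Hypothetical 2-dimensional word vectors standing in for Word2Vec output.
toy_vectors = {"senate": [1.0, 0.0], "bill": [0.0, 1.0]}
sent_vec = sentence_vector(docs[0], toy_vectors)
```

Words that appear in every document, such as "the", get a TF-IDF weight of zero here, which is the reweighting effect described in the list. Library implementations such as scikit-learn's `TfidfVectorizer` apply smoothing and normalization by default, so their numbers would differ from this plain form.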

Classifiers

For each of the following classifiers there is a detailed analysis in the pytorch file:

  • Logistic Regression
  • Naive Bayes
  • Support Vector Machine
  • Random Forests
  • Voting Classifier
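The last entry combines the predictions of several classifiers. A minimal sketch of hard (majority) voting follows. Whether the project uses hard or soft voting is not stated, so hard voting is an assumption, as are the label encoding and the hypothetical prediction lists.

```python
from collections import Counter


def majority_vote(predictions):
    """Hard voting: pick the most common label per sample.

    `predictions` holds one list of labels per classifier, all of the
    same length and in the same sample order.
    """
    combined = []
    for sample_votes in zip(*predictions):
        combined.append(Counter(sample_votes).most_common(1)[0][0])
    return combined


# Hypothetical outputs of three already-trained classifiers on four articles
# (1 = fake, 0 = true; this label encoding is an assumption).
logreg_pred = [1, 0, 1, 0]
nb_pred = [1, 1, 0, 0]
svm_pred = [0, 0, 1, 0]

voted = majority_vote([logreg_pred, nb_pred, svm_pred])
```

With an odd number of voters and two classes there are no ties, which is why three classifiers are used in the example.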

Metrics

We evaluate the performance of each method on the test data using the following evaluation metrics:

  • Accuracy score
  • F1 score, which is the harmonic mean of precision and recall and is therefore especially useful for problems with an uneven class distribution.
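Both metrics can be computed by hand, which makes their difference on uneven classes concrete. The sketch below assumes binary labels with 1 as the positive class. The label lists are made up for illustration and are not results from the project. F1 is computed as 2PR / (P + R), where P is precision and R is recall.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1(y_true, y_pred, positive=1):
    """F1 for the positive class: 2 * P * R / (P + R)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up labels: 3 positive and 5 negative samples (uneven classes).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]

acc = accuracy(y_true, y_pred)
score = f1(y_true, y_pred)
```

Here the accuracy is 0.75 while F1 is about 0.67, because F1 ignores the many correctly classified negatives and focuses on the positive class.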

Contributors

Apostolos Karvelas

Ioannis Papadimitriou
