This project focuses on detecting fake news articles using Natural Language Processing (NLP) and Machine Learning.
It leverages the Bag-of-Words model, TF-IDF vectorization, and a Naive Bayes classifier to distinguish between True and Fake news.
- Preprocessed over 45,000 news articles (True + Fake dataset).
- Implemented text cleaning (tokenization, stopword removal, lowercasing).
- Converted text to numerical features using CountVectorizer (Bag-of-Words counts) followed by TF-IDF weighting.
- Trained a Multinomial Naive Bayes classifier with ~97% accuracy.
- Built a pipeline for easy training and evaluation.
- Accepts custom news text as input and predicts whether it is Fake or True.
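
The steps listed above could be wired together roughly as follows. This is a minimal sketch, not the project's actual code: the toy corpus, `build_pipeline`, and `predict_news` are hypothetical names introduced for illustration, and scikit-learn's built-in English stop-word list stands in for the NLTK-based cleaning.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Hypothetical toy corpus; the real project trains on ~45,000 articles.
texts = [
    "Government announces new budget for public schools",
    "Central bank holds interest rates steady this quarter",
    "City council approves funding for road repairs",
    "Parliament passes trade agreement after long debate",
    "Shocking miracle cure doctors do not want you to know",
    "Aliens secretly control the world banking system",
    "Celebrity reveals secret trick to instant wealth",
    "Scientists admit the moon landing was staged hoax",
]
labels = ["True", "True", "True", "True", "Fake", "Fake", "Fake", "Fake"]


def build_pipeline():
    """Bag-of-Words counts -> TF-IDF weighting -> Multinomial Naive Bayes."""
    return Pipeline([
        # Assumption: CountVectorizer handles tokenization, lowercasing,
        # and stop-word removal here instead of NLTK.
        ("bow", CountVectorizer(lowercase=True, stop_words="english")),
        ("tfidf", TfidfTransformer()),
        ("nb", MultinomialNB()),
    ])


# Hold out a stratified test split so both classes appear in each part.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, stratify=labels, random_state=42
)

model = build_pipeline()
model.fit(X_train, y_train)
print("Held-out accuracy on the toy data:", model.score(X_test, y_test))


def predict_news(text):
    """Return the predicted label ("Fake" or "True") for one article."""
    return model.predict([text])[0]


print(predict_news("Secret cure discovered that banks want to hide"))
```

Because the vectorizer, the TF-IDF step, and the classifier live in one `Pipeline`, raw text can be passed straight to `fit` and `predict`, and the same preprocessing is applied at training and prediction time.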
- Python 3
- Pandas, NumPy: Data processing
- NLTK: NLP preprocessing
- Scikit-learn: Vectorization (CountVectorizer, TF-IDF), modeling (Multinomial Naive Bayes), and data splitting (train_test_split)
- Jupyter Notebook / VS Code
- Accuracy: ~97%
- Precision/Recall/F1: Well balanced across both classes (Fake and True)
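
Metrics like those above can be computed per class with scikit-learn. The sketch below uses made-up labels purely to show the calls; `y_true` and `y_pred` are hypothetical and are not the project's results.

```python
from sklearn.metrics import (
    accuracy_score,
    classification_report,
    precision_recall_fscore_support,
)

# Hypothetical labels for illustration only (not the project's predictions).
y_true = ["Fake", "Fake", "Fake", "True", "True", "True"]
y_pred = ["Fake", "Fake", "True", "True", "True", "True"]

class_names = ["Fake", "True"]

# Overall fraction of correct predictions.
acc = accuracy_score(y_true, y_pred)

# Per-class arrays, ordered as in class_names.
precision, recall, f1, support = precision_recall_fscore_support(
    y_true, y_pred, labels=class_names
)

print(f"Accuracy: {acc:.3f}")
print(classification_report(y_true, y_pred, labels=class_names))
```

Reporting precision, recall, and F1 for each class separately shows whether the classifier favors one label, which a single accuracy figure can hide.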