This repository contains Python code for a Fuzzy Genetic Algorithm Spam Classifier. The classifier combines fuzzy logic with a genetic algorithm to separate spam from legitimate messages. Messages are converted to numerical features with scikit-learn's TfidfVectorizer, and PCA is used for dimensionality reduction.
Read the complete documentation HERE. You can also read THIS code review for more details about the code.
- Clone the repository to your local machine:
git clone https://github.com/Ali-Pourgheysari/CI-phase-3-fuzzy-inference-system.git
- Install the required packages:
pip install numpy pandas scikit-learn matplotlib
- Make sure the dataset file "SMSSpamCollection" is in the same directory as the code.
- Run the main script to preprocess the data, apply Fuzzy Genetic Algorithm classification, and evaluate the accuracy:
python main.py
- The script will output the accuracy of the classifier and plot the fitness score over generations.
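To make "fitness score over generations" concrete, here is a minimal sketch of a generational loop that records the best fitness at each generation. It is not the code in main.py: the function name `evolve`, the list-of-floats chromosome, the tournament selection, one-point crossover, elitism, and every default value are assumptions chosen for illustration.

```python
import random


def evolve(fitness, n_genes, pop_size=20, generations=30, mutation_rate=0.1, seed=0):
    """Return (best_chromosome, best_fitness_per_generation).

    Hypothetical helper; requires n_genes >= 2 for one-point crossover.
    """
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(n_genes)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        # Elitism: carry the best individual over unchanged.
        next_gen = [ranked[0][:]]
        while len(next_gen) < pop_size:
            # Tournament selection: the fitter of two random individuals is a parent.
            mother = max(rng.sample(population, 2), key=fitness)
            father = max(rng.sample(population, 2), key=fitness)
            # One-point crossover.
            cut = rng.randrange(1, n_genes)
            child = mother[:cut] + father[cut:]
            # Mutation: occasionally replace a gene with a fresh random value.
            child = [rng.random() if rng.random() < mutation_rate else g for g in child]
            next_gen.append(child)
        population = next_gen
    return max(population, key=fitness), history


# Toy fitness (an assumption): closeness of every gene to 0.5.
def toy_fitness(chromosome):
    return -sum((g - 0.5) ** 2 for g in chromosome)


best, history = evolve(toy_fitness, n_genes=4, generations=15)
```

Because the best individual is always kept, `history` never decreases. In a spam classifier the fitness would more plausibly be the accuracy of a rule set on training data, and `history` is the kind of series that can be passed to matplotlib's `plt.plot` to draw the curve.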
- Fuzzy Functions: Contains the membership functions used for fuzzy logic, such as sigmoid, Gaussian, triangular, and trapezoidal.
- Rule Class: Represents a single rule in the Fuzzy Genetic Algorithm classifier, consisting of if_terms and a class_label.
- Fuzzy_functions Class: Implements fuzzy logic operations and tests rules against input data.
- data_preprocessing Class: Handles data preprocessing, including tokenization and dimensionality reduction using PCA.
- genetic_algorithm Class: Implements the genetic algorithm to generate and evolve fuzzy rules for classification.
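The membership functions and rules listed above can be sketched as follows. This is a simplified illustration, not the repository's implementation: the parameter names, the representation of a rule as an `(if_terms, class_label)` tuple, the use of `min` as the fuzzy AND, and the sum-of-strengths vote are all assumptions.

```python
import math


def triangular(x, a, b, c):
    """Rises from a to a peak of 1 at b, then falls to c; 0 outside (a, c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def trapezoidal(x, a, b, c, d):
    """Rises from a to b, stays at 1 until c, then falls to d."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


def gaussian(x, mean, sigma):
    """Bell curve with a peak of 1 at mean."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2)


def sigmoid(x, center, slope):
    """S-shaped curve that crosses 0.5 at center."""
    return 1.0 / (1.0 + math.exp(-slope * (x - center)))


def firing_strength(if_terms, x):
    """AND the terms with min; if_terms is a list of (feature_index, membership) pairs."""
    return min(mu(x[i]) for i, mu in if_terms)


def classify(rules, x):
    """Pick the class label whose rules fire most strongly in total."""
    totals = {}
    for if_terms, class_label in rules:
        totals[class_label] = totals.get(class_label, 0.0) + firing_strength(if_terms, x)
    return max(totals, key=totals.get)


# Two hypothetical one-term rules over a single feature.
rules = [
    ([(0, lambda v: triangular(v, 0.0, 1.0, 2.0))], "spam"),
    ([(0, lambda v: triangular(v, -2.0, -1.0, 0.0))], "ham"),
]
label = classify(rules, [1.0])
```

Each membership function maps a feature value to a degree between 0 and 1, so a rule's firing strength is also in that range. Other AND operators (for example, the product) would fit the same structure.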
- TfidfVectorizer: Converts text data into a numerical feature matrix using TF-IDF vectorization.
- PCA: Performs dimensionality reduction using Principal Component Analysis.
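As a rough sketch of how these two scikit-learn pieces fit together, the function below turns a list of messages into TF-IDF features and then reduces them with PCA. The function name `vectorize`, the default `n_components`, and the sample messages are assumptions, not values taken from the repository.

```python
from sklearn.decomposition import PCA
from sklearn.feature_extraction.text import TfidfVectorizer


def vectorize(messages, n_components=5):
    """Return an array of shape (n_messages, n_components)."""
    tfidf = TfidfVectorizer()
    sparse_features = tfidf.fit_transform(messages)
    # Convert to a dense array before PCA; this is fine for small corpora.
    dense_features = sparse_features.toarray()
    # n_components must not exceed min(n_messages, n_terms).
    pca = PCA(n_components=n_components)
    return pca.fit_transform(dense_features)


# Made-up sample messages for illustration only.
sample = [
    "win a free prize now",
    "call me when you are home",
    "free entry to win cash",
    "see you at lunch",
]
reduced = vectorize(sample, n_components=2)
```

Each row of `reduced` is one message expressed in a small number of continuous features, which is the kind of input a fuzzy membership function can be evaluated on.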
Contributions to improve the classifier or add new features are welcome! Feel free to open a pull request with your changes.
Good Luck!