Notebooks and functions, to detect fraudsters using a Revolut dataset from Kaggle

macrodrigues/revolut_fraud_detection

Revolut Fraud Detection

This project aims to build a model that detects fraudsters in a Revolut dataset.

The dataset was downloaded from Kaggle, and it contains three different CSV files.

The first, transactions.csv, has information about each transaction: user_id, timestamp, etc. The second, users.csv, has information about each user: country, age, creation date, etc. The third, fraudsters.csv, contains only the user_id of each fraudster.
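
As a rough illustration of how the three files relate, the sketch below joins transactions to users on user_id and labels each row by looking the user up in the fraudster list. It is not the notebook code: only user_id comes from the description above, while every other column name, the sample values, and the helper names `read_rows` and `build_labelled_table` are assumptions made for the example. It uses only the standard library so it runs anywhere; in a notebook the same steps would typically be a pandas `merge` followed by an `isin` check.

```python
import csv
import io

# Tiny inline stand-ins for the three files. Only user_id is taken from the
# README; the other columns and all values are invented for illustration.
TRANSACTIONS_CSV = """user_id,amount,timestamp
u1,10.0,2018-01-01T10:00:00
u2,250.0,2018-01-02T11:30:00
u1,5.5,2018-01-03T09:15:00
"""
USERS_CSV = """user_id,country,age
u1,GB,34
u2,PT,27
"""
FRAUDSTERS_CSV = """user_id
u2
"""


def read_rows(text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def build_labelled_table(transactions, users, fraudsters):
    """Left-join transactions with users and add a 0/1 is_fraudster label."""
    users_by_id = {u["user_id"]: u for u in users}
    fraud_ids = {f["user_id"] for f in fraudsters}
    table = []
    for tx in transactions:
        row = dict(tx)
        # Transactions without a matching user keep only their own columns.
        row.update(users_by_id.get(tx["user_id"], {}))
        row["is_fraudster"] = int(tx["user_id"] in fraud_ids)
        table.append(row)
    return table


table = build_labelled_table(
    read_rows(TRANSACTIONS_CSV),
    read_rows(USERS_CSV),
    read_rows(FRAUDSTERS_CSV),
)
for row in table:
    print(row)
```

Because fraudsters.csv lists users rather than transactions, every transaction of a flagged user receives the same label in this sketch.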

The project comprises the following phases:

  1. Merging and cleaning the CSV files;
  2. Checking whether the data is balanced. It was not, so I undersampled the majority class;
  3. Encoding, using Target Encoding and One Hot Encoding;
  4. Feature selection, using Pearson correlation;
  5. Trying different regression models. In the end, I chose the Random Forest Regressor;
  6. Model evaluation.
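
Phases 2 to 4 above can be sketched in a few small functions. This is a minimal, dependency-free illustration and not the code from the notebooks: the function names, the column names (`country`, `is_fraudster`), the sample rows, and the 0.5 correlation threshold are all assumptions. Model fitting and evaluation are outside the scope of this sketch.

```python
import random
from collections import defaultdict


def undersample(rows, label_key, seed=0):
    """Randomly drop majority-class rows until both classes are the same size."""
    rng = random.Random(seed)
    positives = [r for r in rows if r[label_key] == 1]
    negatives = [r for r in rows if r[label_key] == 0]
    minority, majority = sorted([positives, negatives], key=len)
    return minority + rng.sample(majority, len(minority))


def target_encode(rows, column, label_key):
    """Replace each category with the mean label observed for that category.

    Note: computing the means on the same rows you later evaluate on leaks
    the label; in practice the means are usually fit on training data only.
    """
    totals = defaultdict(lambda: [0, 0])  # category -> [label sum, row count]
    for r in rows:
        totals[r[column]][0] += r[label_key]
        totals[r[column]][1] += 1
    return [totals[r[column]][0] / totals[r[column]][1] for r in rows]


def one_hot_encode(rows, column):
    """Return one 0/1 indicator column per distinct category."""
    categories = sorted({r[column] for r in rows})
    return {f"{column}_{c}": [int(r[column] == c) for r in rows] for c in categories}


def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 when either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def select_features(features, labels, threshold):
    """Keep features whose absolute correlation with the label meets the threshold."""
    return [name for name, values in features.items()
            if abs(pearson(values, labels)) >= threshold]


# Invented sample: six legitimate rows and two fraudulent ones.
rows = (
    [{"country": "GB", "is_fraudster": 0} for _ in range(4)]
    + [{"country": "PT", "is_fraudster": 0} for _ in range(2)]
    + [{"country": "PT", "is_fraudster": 1} for _ in range(2)]
)
balanced = undersample(rows, "is_fraudster")
labels = [r["is_fraudster"] for r in balanced]

features = one_hot_encode(balanced, "country")
features["country_target"] = target_encode(balanced, "country", "is_fraudster")
print(select_features(features, labels, threshold=0.5))
```

Undersampling keeps every minority-class row and discards majority rows at random, so the balanced set is small but evenly split. Which features survive the threshold depends on the rows that the random sample happens to keep.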

The full description of the project can be found in this Medium post:
