This project predicts the probability that a credit card customer will default, based on the customer's payment history. The default rate has a direct impact on the financials of a credit card company, so it is important to predict defaults and to put processes in place that reduce this rate.
1.ID: ID of each client
2.LIMIT_BAL: Amount of given credit in NT dollars (includes individual and family/supplementary credit)
3.SEX: Gender (1=male, 2=female)
4.EDUCATION: (1=graduate school, 2=university, 3=high school, 4=others, 5=unknown, 6=unknown)
5.MARRIAGE: Marital status (1=married, 2=single, 3=others)
6.AGE: Age in years
7.PAY_0 to PAY_6: Repayment status from April to September, 2005 (-1=pay duly, 1=payment delay for one month, 2=payment delay for two months, ..., 8=payment delay for eight months, 9=payment delay for nine months and above)
8.BILL_AMT1 to BILL_AMT6: Amount of bill statement from April to September, 2005 (NT dollar)
9.PAY_AMT1 to PAY_AMT6: Amount of previous payment from April to September, 2005 (NT dollar)
10.Default.payment.next.month: Default payment (1=yes, 0=no)
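
The column layout above can be prepared for modelling along these lines. This is only a minimal sketch: the rows are made-up toy values, not real data, and the choices to merge the two "unknown" education codes and to drop `ID` are assumptions rather than steps confirmed by this project.

```python
import pandas as pd

# Toy rows that only mimic the column layout described above; values are made up.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "LIMIT_BAL": [20000, 120000, 90000, 50000],
    "SEX": [2, 2, 1, 1],
    "EDUCATION": [2, 5, 6, 1],
    "MARRIAGE": [1, 2, 2, 3],
    "AGE": [24, 26, 34, 37],
    "PAY_0": [2, -1, -1, 1],
    "Default.payment.next.month": [1, 0, 0, 1],
})

# Codes 5 and 6 both mean "unknown", so fold them into a single code (assumption).
df["EDUCATION"] = df["EDUCATION"].replace({6: 5})

# Target name as written in the list above; adjust it to match the actual file header.
TARGET = "Default.payment.next.month"

# ID is only a row identifier, so it is left out of the feature matrix (assumption).
X = df.drop(columns=["ID", TARGET])
y = df[TARGET]

print(X.columns.tolist())
print(y.tolist())
```

The same steps would apply to the full table once it is loaded, for example with `pd.read_csv` or from the sqlite database.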
Tech used:
- Python
- scikit-learn
- sqlite
- pandas
- numpy
- logger
- kneed (Python library for finding the best k value)
Algorithms used:
- KMeans (for clustering)
- Random Forest Classifier
- XGBoost Classifier
- Naive Bayes Classifier
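
One way the clustering and classification steps could fit together is sketched below. The workflow (cluster first, then train one classifier per cluster) is an assumption, since it is not spelled out here. The data is synthetic, `k=3` is a hypothetical value standing in for the one kneed would select, and `positive_proba` is a helper introduced only for this illustration. Random Forest is used for brevity; the other classifiers listed could be swapped in the same way.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the credit card features (not the real dataset).
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Step 1: cluster the training rows. k=3 is a hypothetical choice.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X_train)
train_clusters = kmeans.labels_

# Step 2: fit one classifier per cluster.
models = {}
for c in np.unique(train_clusters):
    mask = train_clusters == c
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    models[int(c)] = clf.fit(X_train[mask], y_train[mask])


def positive_proba(model, rows):
    """Probability of class 1, even if the model saw only one class."""
    classes = list(model.classes_)
    if 1 not in classes:
        return np.zeros(len(rows))
    return model.predict_proba(rows)[:, classes.index(1)]


# Step 3: send each test row to the model of its nearest cluster.
test_clusters = kmeans.predict(X_test)
proba = np.zeros(len(X_test))
for c, model in models.items():
    mask = test_clusters == c
    if mask.any():
        proba[mask] = positive_proba(model, X_test[mask])

print(proba[:5])
```

Training per cluster lets each model specialise on a more homogeneous group of customers, at the cost of less training data per model.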
Accuracy Metric:
- AUC Score
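
To make the metric concrete, the AUC score can be read as the fraction of (defaulter, non-defaulter) pairs in which the defaulter receives the higher predicted score, with ties counted as half. The pure-Python sketch below illustrates that definition on made-up labels and scores. The function name `auc_score` is hypothetical, and in practice `sklearn.metrics.roc_auc_score` would normally be used instead.

```python
def auc_score(y_true, y_score):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up labels (1 = default) and predicted default probabilities.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
print(auc_score(y_true, y_score))  # 3 of 4 pairs are ordered correctly -> 0.75
```

A score of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which makes AUC less sensitive than plain accuracy to the imbalance between defaulters and non-defaulters.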