Welcome to the Medical Insurance Project Repository! 🏥💉
This project focuses on the medical insurance domain, aiming to provide predictions related to health insurance premiums. SkyHigh Shield, a fictitious healthcare company, explores a dataset containing information about individuals' demographics, lifestyle choices, and medical history.
This project involves the development and deployment of a machine learning model to predict insurance charges based on various features such as age, gender, BMI, number of children, smoking habits, and region. The model is trained on a dataset containing historical insurance data.
- Python
- Jupyter Notebook
- Pandas, NumPy, Matplotlib, Seaborn, Scikit-learn
- Statsmodels
- Pickle (for model serialization)
- Imported necessary libraries.
- Read the insurance dataset (insurance.csv) into a Pandas DataFrame.
- Checked the structure of the dataset and performed initial cleaning.
- Checked for outliers in the 'bmi' column using the IQR method.
- Removed outliers from the 'bmi' column.
- Removed duplicate rows from the dataset.
- Separated input and output columns.
- Split the dataset into training and testing sets.
- Removed instances where insurance charges were greater than 50,000.
- Applied a log transformation to the insurance charges to handle skewness.
- Created a data processing pipeline using sklearn to handle imputation, encoding, polynomial features, scaling, and model fitting.
- Trained a linear regression model using the pipeline.
- Evaluated the model on the test set, achieving an R-squared score of approximately 0.83.
- Exported the trained model using Pickle for future use.
- Loaded the deployed model and tested it with a sample input to predict insurance charges.
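
The cleaning and modelling steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code. The column names other than `bmi` (`age`, `sex`, `children`, `smoker`, `region`, `charges`), the helper names, the imputation strategies, the polynomial degree, and the choice to apply the 50,000 cap to the training split only are all assumptions. Synthetic data stands in for `insurance.csv` so the snippet runs on its own, which means its score will not match the 0.83 reported above.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, PolynomialFeatures, StandardScaler

NUMERIC = ["age", "bmi", "children"]        # assumed column names
CATEGORICAL = ["sex", "smoker", "region"]   # assumed column names


def remove_iqr_outliers(df, column="bmi", k=1.5):
    """Keep rows whose value lies within [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = df[column].quantile([0.25, 0.75])
    iqr = q3 - q1
    return df[df[column].between(q1 - k * iqr, q3 + k * iqr)]


def build_pipeline(degree=2):
    """Imputation, encoding, polynomial features, scaling, linear regression."""
    numeric = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("poly", PolynomialFeatures(degree=degree, include_bias=False)),
        ("scale", StandardScaler()),
    ])
    categorical = Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ])
    preprocess = ColumnTransformer([
        ("num", numeric, NUMERIC),
        ("cat", categorical, CATEGORICAL),
    ])
    return Pipeline([("preprocess", preprocess), ("model", LinearRegression())])


def make_synthetic_data(n=500, seed=0):
    """Synthetic stand-in for insurance.csv (not the real dataset)."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "age": rng.integers(18, 65, n),
        "sex": rng.choice(["female", "male"], n),
        "bmi": rng.normal(30, 6, n),
        "children": rng.integers(0, 5, n),
        "smoker": rng.choice(["yes", "no"], n, p=[0.2, 0.8]),
        "region": rng.choice(["northeast", "northwest", "southeast", "southwest"], n),
    })
    log_charges = (7.0 + 0.035 * df["age"] + 0.01 * df["bmi"]
                   + 1.5 * (df["smoker"] == "yes") + rng.normal(0, 0.2, n))
    df["charges"] = np.exp(log_charges)
    return df


df = make_synthetic_data()
df = remove_iqr_outliers(df, "bmi").drop_duplicates()

X, y = df[NUMERIC + CATEGORICAL], df["charges"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Drop very large charges (assumption: applied to the training split only).
keep = y_train <= 50_000
X_train, y_train = X_train[keep], y_train[keep]

# Fit on log-charges to reduce skew; R-squared is measured on the log scale.
pipeline = build_pipeline()
pipeline.fit(X_train, np.log(y_train))
r2_log = pipeline.score(X_test, np.log(y_test))

# Invert the log transform to get predictions in the original units.
predicted_charges = np.exp(pipeline.predict(X_test))
print(f"R-squared on log-charges (synthetic data): {r2_log:.2f}")
```

Keeping every preprocessing step inside one `Pipeline` means the exported object carries its own imputation, encoding, and scaling, so new data can be passed to it as a raw DataFrame.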
The insurance model has been developed, trained, and deployed. On the held-out test set it reaches an R-squared score of approximately 0.83. To predict insurance charges for new data, load the deployed model and pass it the same input features used in training.
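
As a rough sketch of how a pickled model might be loaded and used on a new record: the helpers `load_model` and `predict_charges` are hypothetical, the input column names are assumed, and the example assumes the model predicts the natural log of the charges (if `log1p` was used instead, invert with `np.expm1`). So that the snippet runs on its own, it pickles a small stand-in model fitted on made-up toy values to a temporary file. With the real file you would call `load_model("insurance_model.pkl")` instead.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder


def load_model(path):
    """Load a pickled model. Only unpickle files from a source you trust."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


def predict_charges(model, record):
    """Predict charges for one record, undoing the assumed log transform."""
    frame = pd.DataFrame([record])
    return float(np.exp(model.predict(frame)[0]))


# Stand-in model fitted on made-up toy values (NOT the repository's model).
toy = pd.DataFrame({
    "age": [19, 28, 33, 45, 52, 61],
    "sex": ["female", "male", "male", "female", "male", "female"],
    "bmi": [27.9, 33.0, 22.7, 30.2, 35.1, 29.4],
    "children": [0, 3, 0, 2, 1, 0],
    "smoker": ["yes", "no", "no", "yes", "no", "yes"],
    "region": ["southwest", "southeast", "northwest",
               "northeast", "southeast", "northwest"],
})
toy_log_charges = np.log([17000.0, 4500.0, 5000.0, 30000.0, 11000.0, 42000.0])

stand_in = Pipeline([
    ("preprocess", ColumnTransformer(
        [("cat", OneHotEncoder(handle_unknown="ignore"),
          ["sex", "smoker", "region"])],
        remainder="passthrough",
    )),
    ("model", LinearRegression()),
]).fit(toy, toy_log_charges)

sample = {"age": 40, "sex": "male", "bmi": 28.5,
          "children": 2, "smoker": "no", "region": "southeast"}

# Round-trip through a pickle file, as the exported model would be.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "insurance_model.pkl"
    with open(path, "wb") as fh:
        pickle.dump(stand_in, fh)
    loaded = load_model(path)

estimate = predict_charges(loaded, sample)
print(f"Estimated charges: {estimate:.2f}")
```

A pickled scikit-learn model is generally only safe to load with the same scikit-learn version that created it, so it helps to recreate the original environment before loading.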
Note:
- The model file (insurance_model.pkl) is available in the repository for future use.
- To create a new environment, see the commands in the 'cmd note.txt' file in this repository.