This repository contains an analysis of a worldwide hotel booking dataset. Our primary goal is to explore the dataset through Exploratory Data Analysis (EDA) and to develop models that predict the likelihood of booking cancellations. By identifying trends and patterns, we aim to provide insights that help optimize booking strategies and minimize cancellations.
The dataset, Booking_Data.xlsx, includes the following columns:
- Booking ID
- Hotel
- Booking Date
- Arrival Date
- Lead Time
- Nights
- Guests
- Distribution Channel
- Customer Type
- Country
- Deposit Type
- Avg Daily Rate
- Status
- Status Update
- Cancelled (0/1)
- Revenue
- Revenue Loss
It encompasses a wide array of data points from hotel bookings, including booking details, customer information, and financial aspects.
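As a starting point for exploring these columns, the following is a minimal sketch rather than the notebook's actual code. It assumes the spreadsheet headers match the column names listed above exactly (in particular `Cancelled (0/1)`), that the data sits in the first worksheet, and that `openpyxl` is installed so pandas can read `.xlsx` files. The helper names `load_bookings` and `cancellation_rate_by` are hypothetical.

```python
import pandas as pd

# Assumed header of the target column, copied from the column list above.
TARGET = "Cancelled (0/1)"


def load_bookings(path: str = "Booking_Data.xlsx") -> pd.DataFrame:
    """Read the first worksheet of the Excel file into a DataFrame."""
    # pandas relies on the openpyxl package to parse .xlsx files.
    return pd.read_excel(path)


def cancellation_rate_by(df: pd.DataFrame, column: str) -> pd.Series:
    """Return the share of cancelled bookings per value of `column`."""
    # The mean of a 0/1 flag is the fraction of rows where the flag is 1.
    return df.groupby(column)[TARGET].mean().sort_values(ascending=False)
```

Calling `cancellation_rate_by(load_bookings(), "Deposit Type")` would then list deposit types from the highest to the lowest cancellation rate; any other categorical column from the list can be passed instead.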
- Booking_Data.xlsx: The dataset file containing information on hotel bookings worldwide.
- case_study_competition.ipynb: A Jupyter notebook that contains all the Python code used for data analysis and modeling.
To run the notebook and analyze the dataset, ensure you have Python installed along with the following libraries:
- pandas
- numpy
- matplotlib
- seaborn
- scikit-learn
- plotly
You can install all required libraries using the following command:
pip install pandas numpy matplotlib seaborn scikit-learn plotly
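Once the libraries are installed, a baseline cancellation model could look like the sketch below. This is an illustration under stated assumptions, not necessarily the approach taken in the notebook: it swaps in a plain logistic regression, assumes the column headers match the list above, and assumes the selected columns contain no missing values. The function name `fit_cancellation_model` and the feature selection are hypothetical.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Assumed column headers, taken from the column list in this README.
NUMERIC = ["Lead Time", "Nights", "Guests", "Avg Daily Rate"]
CATEGORICAL = ["Hotel", "Distribution Channel", "Customer Type", "Deposit Type"]
TARGET = "Cancelled (0/1)"


def fit_cancellation_model(df: pd.DataFrame, seed: int = 0):
    """Fit a baseline classifier and return (model, held-out accuracy)."""
    X = df[NUMERIC + CATEGORICAL]
    y = df[TARGET].astype(int)

    # Hold out 25% of the rows, keeping the class balance in both splits.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed, stratify=y
    )

    # Scale numeric features and one-hot encode categorical ones.
    preprocess = ColumnTransformer([
        ("num", StandardScaler(), NUMERIC),
        ("cat", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL),
    ])
    model = Pipeline([
        ("preprocess", preprocess),
        ("clf", LogisticRegression(max_iter=1000)),
    ])
    model.fit(X_train, y_train)
    return model, model.score(X_test, y_test)
```

The sketch deliberately leaves out Status, Status Update, Revenue, and Revenue Loss, on the assumption that these are only known after a booking's outcome and would leak the target into the features.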