Geospatial Analysis Project (Zomato Case Study) on Restaurants and Hotels in Bangalore.
Download the dataset here: https://drive.google.com/drive/u/0/folders/11-efbrZ5FZRPtcHaytKbtqwA20DZiORn
To find the percentage of missing values, use a list comprehension to iterate over the columns and return all the features that contain null/NA values. To calculate the percentage of missing values, take the sum of nulls in each of those features, divide it by the length of df, and multiply by 100: df[features].isnull().sum()/len(df)*100
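The steps above can be sketched as follows. This is a minimal example on a toy frame with made-up values; the variable names features_with_na and missing_pct are illustrative and do not come from the project.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the Zomato data (values are made up).
df = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "rate": ["4.1/5", np.nan, np.nan, "3.8/5"],
    "location": ["BTM", "HSR", np.nan, "BTM"],
})

# List comprehension: keep only the columns with at least one null.
features_with_na = [col for col in df.columns if df[col].isnull().sum() > 0]

# Percentage of missing values in each of those columns.
missing_pct = df[features_with_na].isnull().sum() / len(df) * 100
print(missing_pct)
```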
To deal with the missing values, the column with the highest number of missing values (rate) was chosen, and manipulations were done to remove and replace values that are effectively missing but are not stored as null/NA.
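One possible way to do this cleaning is sketched below. It assumes the rate column holds strings such as "4.1/5" and uses placeholder strings like "NEW" and "-" for unrated restaurants; both the placeholders and the helper clean_rate are assumptions for illustration, so check them against the actual data.

```python
import numpy as np
import pandas as pd

# Toy ratings (made up); "NEW" and "-" are assumed placeholder strings.
df = pd.DataFrame({"rate": ["4.1/5", "NEW", "-", np.nan, "3.8 /5"]})

def clean_rate(value):
    """Return the rating as a float, or NaN when there is no real rating."""
    if pd.isna(value):
        return np.nan
    text = str(value).strip()
    if text in ("NEW", "-"):
        return np.nan
    # Keep the part before "/5" and convert it to a number.
    return float(text.split("/")[0])

df["rate"] = df["rate"].apply(clean_rate)

# Drop the rows that still have no usable rating.
df = df.dropna(subset=["rate"]).copy()
print(df)
```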
Call the groupby function on the name and rate columns to return the mean rating of each restaurant.
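A minimal sketch of this groupby, assuming rate has already been converted to a number. The data is made up and the name avg_rating is illustrative.

```python
import pandas as pd

# Toy data (made up): restaurant "A" has two outlets with different ratings.
df = pd.DataFrame({
    "name": ["A", "A", "B", "C"],
    "rate": [4.0, 4.5, 3.5, 4.8],
})

# Mean rating per restaurant name, highest first.
avg_rating = df.groupby("name")["rate"].mean().sort_values(ascending=False)
print(avg_rating)
```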
Problem Statement => Get the distribution of the rating column and try to find out which distribution this feature follows.
Use displot from seaborn to visualize the average ratings of the restaurants.
Use value_counts() on the name column to return the restaurants and their number of outlets. You can visualize the result using a bar plot.
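As a small illustration on made-up data, value_counts() returns one row per restaurant name with its outlet count, already sorted from most to fewest outlets. The names outlet_counts and top are illustrative.

```python
import pandas as pd

# Toy data (made up): each row is one outlet.
df = pd.DataFrame({
    "name": ["Cafe X", "Cafe X", "Cafe X", "Dosa Y", "Dosa Y", "Tea Z"],
})

# Number of outlets per restaurant name, sorted descending by default.
outlet_counts = df["name"].value_counts()

# The top entries are what would typically go into the bar plot.
top = outlet_counts.head(2)
print(top)
```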
Do a value_counts on the online_order column and visualize it using a pie chart.
Problem Statement => Ratio Between Restaurants that Provide Table and Restaurants that do not Provide Table.
Do a value_counts on book_table and visualize it using graph_objs from plotly.
Different plotly interfaces:
- import plotly.express as px: plot with px.pie(); the pie is lowercase.
- import plotly.graph_objs as go: plot with go.Pie(); the Pie is uppercase (graph_objs names are capitalized).
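The two interfaces can be compared side by side with a short sketch. The data is made up, and the wrapper functions pie_with_express and pie_with_graph_objs are hypothetical helpers; plotly is imported inside them so the pandas part runs even where plotly is not installed.

```python
import pandas as pd

# Toy data (made up) for the book_table column.
df = pd.DataFrame({"book_table": ["Yes", "No", "No", "No"]})

counts = df["book_table"].value_counts()
labels = counts.index.tolist()
values = counts.values.tolist()

def pie_with_express(labels, values):
    """Build the pie with plotly.express (lowercase pie)."""
    import plotly.express as px
    return px.pie(names=labels, values=values, title="Table booking")

def pie_with_graph_objs(labels, values):
    """Build the pie with plotly.graph_objs (uppercase Pie)."""
    import plotly.graph_objs as go
    return go.Figure(data=[go.Pie(labels=labels, values=values)])

print(labels, values)
```

Calling pie_with_graph_objs(labels, values).show() would display the chart in a notebook or browser.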
To discover the most common type of restaurant, drop the null values in the rest_type column, do a value_counts, and visualize using a bar plot.
Group the name and votes of the restaurants and visualize using a bar plot.
Create two lists to store the grouped restaurant locations and the count of restaurant names in each. Apply the sort_values() function to return the locations with the highest number of restaurants.
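A rough sketch of the two-list approach is shown below. It assumes the count means the number of distinct restaurant names per location, which is one reading of the step; the data is made up and the column name names_count is illustrative.

```python
import pandas as pd

# Toy data (made up).
df = pd.DataFrame({
    "location": ["BTM", "BTM", "HSR", "BTM", "HSR", "Indiranagar"],
    "name": ["A", "B", "A", "C", "C", "D"],
})

locations = []
name_counts = []

# One entry per location, with the number of distinct names found there.
for location, group in df.groupby("location"):
    locations.append(location)
    name_counts.append(group["name"].nunique())

rest_per_location = pd.DataFrame(
    {"location": locations, "names_count": name_counts}
).sort_values("names_count", ascending=False)
print(rest_per_location)
```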
Do a value_counts on the cuisines column and visualize using a bar plot.
We need to remove the ',' from the unique values before the datatype can be converted from object to int, and also before seaborn can visualize it.
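A minimal sketch of the conversion, assuming the affected column is the cost column, named approx_cost(for two people) here, and that it stores values such as "1,200" as strings. The data is made up.

```python
import pandas as pd

cost_col = "approx_cost(for two people)"  # assumed column name

# Toy data (made up): costs stored as strings, some with a thousands comma.
df = pd.DataFrame({cost_col: ["800", "1,200", "300", None]})

# Drop missing costs first, since NaN cannot be cast to int.
df = df.dropna(subset=[cost_col]).copy()

# Remove the comma, then convert from object to int.
df[cost_col] = df[cost_col].str.replace(",", "", regex=False).astype(int)
print(df[cost_col].tolist())
```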
Use a scatter plot to find out their relationship, and apply hue over online_order to come to a conclusion about the top-rated restaurants that accept online orders.
Problem Statement => Is there any difference b/w Votes of Restaurants Accepting and not Accepting Online_Order.
Do a box plot of online_order vs votes to see the difference.
Problem Statement => Is there any difference b/w Price of Restaurants Accepting and not Accepting Online_Order.
Do a box plot of online_order vs approx_cost(for two people) to see the difference.
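As a numeric companion to the box plots (a swapped-in check, not something the project itself describes), the group medians can be compared directly. The data is made up and draw_boxplot is a hypothetical helper that imports seaborn only when it is called.

```python
import pandas as pd

# Toy data (made up).
df = pd.DataFrame({
    "online_order": ["Yes", "Yes", "No", "No", "Yes"],
    "votes": [120, 300, 40, 10, 180],
})

# Median votes per group: the line drawn inside each box of the box plot.
votes_summary = df.groupby("online_order")["votes"].median()
print(votes_summary)

def draw_boxplot(frame, y):
    """Draw online_order vs the given numeric column with seaborn."""
    import seaborn as sns
    return sns.boxplot(x="online_order", y=y, data=frame)
```

The same helper can be reused for the cost comparison by passing the cost column as y.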
Find the max price for two people and filter the df to return the restaurants with that max price.
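A short sketch of this filter on made-up data, assuming the cost column has already been converted to int and is named approx_cost(for two people).

```python
import pandas as pd

cost_col = "approx_cost(for two people)"  # assumed column name

# Toy data (made up).
df = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    cost_col: [500, 6000, 1200, 6000],
})

# Highest cost for two people in the data.
max_price = df[cost_col].max()

# All restaurants charging exactly that price.
most_expensive = df[df[cost_col] == max_price]
print(most_expensive)
```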
Make a copy of the df and set name as the index. Find the nlargest and visualize. Similarly, for the cheapest restaurants, find the nsmallest and visualize.
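These steps might look like the following on made-up data. The cost column name is an assumption, and the names top_expensive and cheapest are illustrative.

```python
import pandas as pd

cost_col = "approx_cost(for two people)"  # assumed column name

# Toy data (made up).
df = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    cost_col: [500, 6000, 1200, 150],
})

# Work on a copy so the original df keeps its default index.
data = df.copy().set_index("name")

# Two most expensive and two cheapest restaurants, labelled by name.
top_expensive = data[cost_col].nlargest(2)
cheapest = data[cost_col].nsmallest(2)
print(top_expensive)
print(cheapest)
```

With name as the index, calling .plot.bar() on either Series labels the bars by restaurant name.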
Filter the restaurants whose cost for two people is less than 500.
Filter on the rate (> 4) and approx_cost(for two people) (<= 500) columns and return the len of the unique names.
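A minimal sketch of the combined filter, on made-up data with numeric rate and cost columns. The name affordable_good is illustrative.

```python
import pandas as pd

cost_col = "approx_cost(for two people)"  # assumed column name

# Toy data (made up): restaurant "A" appears twice.
df = pd.DataFrame({
    "name": ["A", "A", "B", "C", "D"],
    "rate": [4.2, 4.3, 4.5, 3.9, 4.6],
    cost_col: [400, 450, 800, 300, 500],
})

# Each condition needs its own parentheses when combined with &.
affordable_good = df[(df["rate"] > 4) & (df[cost_col] <= 500)]

# Number of distinct restaurants that pass both conditions.
n_restaurants = len(affordable_good["name"].unique())
print(n_restaurants)
```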
Filter on the rate (> 4) and approx_cost(for two people) (<= 500) columns and group the total number of hotels by location.
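Grouping the filtered rows by location could be sketched like this. The data is made up, and counting distinct names per location is an assumption about what "total number" means here.

```python
import pandas as pd

cost_col = "approx_cost(for two people)"  # assumed column name

# Toy data (made up).
df = pd.DataFrame({
    "name": ["A", "B", "C", "D", "E"],
    "location": ["BTM", "BTM", "HSR", "HSR", "BTM"],
    "rate": [4.2, 4.5, 4.3, 3.8, 4.6],
    cost_col: [400, 300, 500, 200, 900],
})

filtered = df[(df["rate"] > 4) & (df[cost_col] <= 500)]

# Distinct well-rated, affordable restaurants per location, highest first.
per_location = (
    filtered.groupby("location")["name"].nunique().sort_values(ascending=False)
)
print(per_location)
```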
Filter the df on cost for two people and rate, and return the location and restaurant lists that pass the filter.
Do a value_counts on the location column and visualize the most foodie areas.
In this project, get the location names and each location's latitude and longitude, merge them with the per-location restaurant count, and convert the latitude and longitude to an array. Generate a base map from the default_location and zoom_start of the restaurant locations. To generate a HeatMap, apply the function to the lat, lon, and count of the restaurant locations, use values.tolist() to convert them to an array, and add it back to the base map in order to visualize the HeatMap over the restaurant locations.
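The pipeline above can be sketched as follows, assuming folium is the mapping library (the HeatMap plugin suggests it). The coordinates and counts are illustrative placeholders rather than geocoded results, and generate_base_map and add_heatmap are hypothetical helpers that import folium only when called.

```python
import pandas as pd

# Toy inputs (illustrative values, not real geocoding output).
rest_count = pd.DataFrame({
    "Name": ["BTM", "HSR", "Indiranagar"],
    "count": [120, 80, 95],
})
coords = pd.DataFrame({
    "Name": ["BTM", "HSR", "Indiranagar"],
    "lat": [12.9166, 12.9116, 12.9784],
    "lon": [77.6101, 77.6389, 77.6408],
})

# Merge counts with coordinates and drop locations that lack either.
restaurant_locations = rest_count.merge(coords, on="Name").dropna()

# HeatMap expects rows of [lat, lon, weight].
heat_data = restaurant_locations[["lat", "lon", "count"]].values.tolist()

def generate_base_map(default_location=(12.97, 77.59), default_zoom_start=12):
    """Create an empty map centred on an assumed default location."""
    import folium
    return folium.Map(location=list(default_location),
                      zoom_start=default_zoom_start)

def add_heatmap(base_map, heat_data):
    """Overlay the weighted points on the base map."""
    from folium.plugins import HeatMap
    HeatMap(heat_data).add_to(base_map)
    return base_map

print(heat_data)
```

In a notebook, add_heatmap(generate_base_map(), heat_data) returns a map object that renders inline.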
Filter the north_indian restaurant count, merge it with the locations df, and apply the HeatMap to the lat and lon on the map from the base map function to visualize.
Group the name of the restaurants by their rest_type and agg count, then group again and sort by url to get a total count, filtering down to the name, url, and rest_type columns. To get the most popular Casual Dining restaurant chains, filter on rest_type.
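One way to read these steps is a single groupby on rest_type and name that counts url values, so the url column ends up holding the outlet count. This is a simplified sketch on made-up data, not necessarily the exact sequence used in the project.

```python
import pandas as pd

# Toy data (made up): one row per outlet, each with its own url.
df = pd.DataFrame({
    "name": ["A", "A", "A", "B", "B", "C"],
    "rest_type": ["Casual Dining"] * 5 + ["Quick Bites"],
    "url": ["u1", "u2", "u3", "u4", "u5", "u6"],
})

# Count outlets per (rest_type, name); the url column now holds the count.
chains = (
    df.groupby(["rest_type", "name"])["url"]
    .count()
    .reset_index()
    .sort_values("url", ascending=False)
)

# Keep only the Casual Dining chains, most outlets first.
casual_dining = chains[chains["rest_type"] == "Casual Dining"]
print(casual_dining[["name", "url", "rest_type"]])
```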