This repository contains a Jupyter notebook for analyzing the famous Iris dataset using the Seaborn library. The goal is to demonstrate how to load, visualize, and analyze data with Seaborn and pandas.
Make sure you have the following libraries installed:
- pandas
- seaborn
- matplotlib
- numpy
- scikit-learn
You can install these libraries using pip:
pip install pandas
pip install seaborn
pip install matplotlib
pip install numpy
pip install scikit-learn
This Jupyter notebook contains various code blocks that perform different tasks for data analysis and visualization. Below, we explain each of the code blocks present in the Seaborniris.ipynb file.
First, we import the necessary libraries for data analysis and visualization.
# For working with DataFrames and data manipulation
import pandas as pd
# For statistical visualizations
import seaborn as sns
# For creating plots
import matplotlib.pyplot as plt
# For numerical operations
import numpy as np
# To load the Iris dataset from scikit-learn
from sklearn.datasets import load_iris
We load the Iris dataset using the load_iris function from Scikit-learn and convert it into a pandas DataFrame.
# Load the Iris dataset
iris_data = load_iris()
# Convert to DataFrame
iris = pd.DataFrame(data=iris_data.data, columns=iris_data.feature_names)
iris['target'] = iris_data.target
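The target column holds the numeric class codes 0, 1, and 2. As a minimal sketch, assuming you want readable labels in plots and tables, the codes can be mapped to species names through the target_names attribute of the loaded dataset. The species column is a hypothetical addition and is not part of the notebook.

```python
import pandas as pd
from sklearn.datasets import load_iris

# Rebuild the DataFrame exactly as described above
iris_data = load_iris()
iris = pd.DataFrame(data=iris_data.data, columns=iris_data.feature_names)
iris['target'] = iris_data.target

# Hypothetical extra column: index the array of species names with the class codes
iris['species'] = iris_data.target_names[iris_data.target]

print(iris.shape)
print(sorted(iris['species'].unique()))
```

With this column in place, hue='species' or x='species' could be used in the Seaborn calls instead of the numeric target.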
We visualize the data using the Seaborn library. First, we configure the style of the plots.
# Configure the style of the plots
sns.set(style="whitegrid")
We visualize the distribution of the features with a pairplot.
sns.pairplot(iris, hue='target')
plt.show()
We create a boxplot of the sepal length by species.
plt.figure(figsize=(10, 6))
sns.boxplot(x='target', y='sepal length (cm)', data=iris, color='b')
plt.title('Boxplot of Sepal Length by Species')
plt.xlabel('Species')
plt.ylabel('Sepal Length (cm)')
plt.show()
We perform exploratory data analysis to better understand the features and the distribution of the classes.
Descriptive Statistics
# Descriptive statistics
print(iris.describe())
# Count of each class
print(iris['target'].value_counts())
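The overall statistics can hide differences between classes. The following sketch, which is an assumption about a useful next step rather than something in the notebook, groups the rows by target and computes the mean of each feature per class.

```python
import pandas as pd
from sklearn.datasets import load_iris

# Rebuild the DataFrame exactly as described above
iris_data = load_iris()
iris = pd.DataFrame(data=iris_data.data, columns=iris_data.feature_names)
iris['target'] = iris_data.target

# Mean of every feature within each class (one row per class code)
per_class_mean = iris.groupby('target').mean()
print(per_class_mean)
```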
We create a sequence of x values ranging from -5 to 5, with 100 equally spaced points.
x = np.linspace(-5, 5, 100)
We evaluate the derivative of a function f4 at each point in x in two ways: with the function derivada and with the function f4_prime_exato. The results are stored in the lists y2 and y3, respectively.
y2 = []
y3 = []
for xx in x:
    y2.append(derivada(f4, xx))
    y3.append(f4_prime_exato(xx))
We use the matplotlib library to plot the results of the calculated derivatives. The solid line (-) represents the values calculated by the derivada function, while the dashed line (--) represents the values calculated by the f4_prime_exato function.
plt.plot(x, y2, '-', x, y3, '--')
plt.show()
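This README does not show the definitions of f4, derivada, and f4_prime_exato. The sketch below fills them in with hypothetical stand-ins so the comparison can be run on its own: f4 is assumed to be a cubic, f4_prime_exato is its analytic derivative, and derivada is assumed to be a central-difference approximation with a step size h. The notebook's actual definitions may differ.

```python
import numpy as np

def f4(x):
    # Hypothetical test function; the notebook's f4 is not shown in this README
    return x ** 3

def f4_prime_exato(x):
    # Analytic derivative of the hypothetical f4 above
    return 3 * x ** 2

def derivada(f, x, h=1e-5):
    # Central-difference approximation (an assumption about how derivada works)
    return (f(x + h) - f(x - h)) / (2 * h)

x = np.linspace(-5, 5, 100)
y2 = [derivada(f4, xx) for xx in x]
y3 = [f4_prime_exato(xx) for xx in x]

# Largest absolute gap between the approximate and exact derivatives
max_error = max(abs(a - b) for a, b in zip(y2, y3))
print(max_error)
```

Under these assumptions the two curves should be visually indistinguishable in the plot, and max_error gives a single number for checking how closely they agree.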
## Running the Notebook
To run the notebook, you can use Jupyter Notebook or JupyterLab. Execute the following command to start Jupyter Notebook:
jupyter notebook
Open the Seaborniris.ipynb file and run the cells to see the results.