Skip to content
/ EDA Public

Exploratory data analysis and feature engineering practice

Notifications You must be signed in to change notification settings

Chirag314/EDA

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Pandas Basics Readme This document describes the basic functionalities of Pandas, a popular Python library for data analysis.

  1. Data Structure:

Pandas works with two primary data structures:

Series: One-dimensional labeled array. DataFrame: Two-dimensional labeled data structure with columns and rows. 2. Data Loading and Manipulation:

Importing data: CSV: pd.read_csv("filename.csv") Excel: pd.read_excel("filename.xlsx") JSON: pd.read_json("filename.json") Data selection and manipulation: Basic indexing and slicing. .loc and .iloc for advanced indexing. Filtering data using boolean masks. Data manipulation functions like fillna, dropna, and drop. Data manipulation with conditional statements: Using if statements and boolean operators. Applying functions to subsets of data with .apply. 3. Data Cleaning and Preprocessing:

Missing values: Detecting missing values with isnull and isna. Imputing missing values with fillna and other methods. Data type conversion: Converting data types with astype. Handling categorical data with factorize and get_dummies. 4. Data Aggregation and Statistics:

Descriptive statistics: describe method for basic statistics. mean, median, std, etc. Aggregation functions: sum, mean, min, max, etc. groupby for group-level statistics. Pivot tables for summarizing and analyzing data. 5. Data Visualization:

Matplotlib and Seaborn for creating various plots and charts. .plot method for basic visualizations. hist, bar, scatter, etc. for specific types of plots. 6. Other Useful Functions:

Merging and joining DataFrames: pd.merge for combining data from different sources. Time series analysis: to_datetime for converting dates to timestamps. Time series-specific functions like resample and rolling. 7. Resources:

Official Pandas documentation: https://pandas.pydata.org/docs/ Tutorials and examples: https://m.youtube.com/watch?v=cXP_i5-nTXg Online communities for help and support: https://latitudes.org/forums/forum/17-pans-pandas-lyme-included/ 8. Additional Notes:

This Readme provides a basic overview of Pandas functionalities. For comprehensive information, refer to the official documentation and resources listed above. Many additional functionalities and libraries can be used with Pandas for advanced data analysis. This Readme should be a helpful guide for beginners to start exploring the world of Pandas for data analysis. Feel free to experiment and explore further to enhance your skills and knowledge.

About

Exploratory data analysis and feature engineering practice

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published