This repository contains a curated list of hands-on tutorials and materials to guide you through your data-centric journey, with a focus on obtaining high-quality data for your machine learning models.
- Awesome Data Science Tools to Master in 2023: Data Profiling Edition, Feb 22, 2023. Included is the following material: YData Profiling Demo, DataPrep Demo, SweetViz Demo, AutoViz Demo, and Lux Demo
- Single line of code data profiling with Spark: Pandas Profiling in the Big Data landscape, Feb 17, 2023
- Auditing Data Quality with Pandas Profiling: The premier choice of data scientists, Jan 24, 2023
- How To Compare 2 Datasets With Pandas Profiling: A data quality use case with advanced EDA, Nov 25, 2022
- How to do an EDA for Time-Series: Pandas-profiling time-series exploratory analysis, Oct 22, 2022
- Data Quality Issues that Kill Your Machine Learning Models: Navigating the complexity of imperfect data, Jan 19, 2023
- How Can I Measure Data Quality?: An open-source package for comprehensive Data Quality, Sep 24, 2021
- The cost of poor data quality: Bad data makes data scientists work harder, not smarter!, Sep 7, 2020
- How to Validate the Quality of Your Synthetic Data: Combining ydata-synthetic with great expectations, Jan 20, 2022
- Synthetic Time-Series Data - A GAN approach: Generate synthetic sequential data with TimeGAN, Jan 27, 2021
- How to generate synthetic tabular data?: Wasserstein Loss for Generative Adversarial Networks, Sep 22, 2020
- Generating synthetic tabular data with GANs — Part 2, May 8, 2020
- Generating synthetic tabular data with GANs — Part 1, May 4, 2020
- Synthetic Data: The future standard for Data Science development, Apr 2, 2020
- Private ML with Tensorflow privacy: Building your first privacy-preserving model with TF-Privacy, Jun 1, 2020
- What is Differential Privacy?: Does it live up to the hype?, May 28, 2020
- Privacy preserving Machine Learning: A set of techniques to ensure privacy while exploring data, Mar 10, 2020
- The impact of Machine Learning in data privacy, Mar 2, 2020
If you found these resources useful, check out our Data-Centric AI Community or join our Discord server. See you on the other side 🖖