This repository contains everything you need to become proficient in Data Engineering.
Pic credits: infra
Prerequisite: Day 1 to Day 60: Quick Recap of 60 days of Data Science and ML
Data Engineers - ML Engineers -- Data Scientists
Techniques to write efficient and optimized code
- SQL Basics
- Aggregations
- Window Functions
- BigQuery
SELECT, FROM, WHERE, DATE, and EXTRACT in BigQuery
- Advanced Functions
- Performance Tuning SQL Queries
- MySQL, PostgreSQL and MongoDB
Comparison between MySQL, PostgreSQL, and MongoDB
Introduction to SQL and NoSQL Databases
- Scripting and Automation
ETL (Extract, Transform, and Load) basics
- Relational Databases and SQL
- NoSQL Databases and MapReduce
- Data Analysis
Stochastic regression imputation
Read and Process Large Datasets
Data Visualization using Plotly and Bokeh
Categorical and Numerical Features
- Data Processing Techniques
- Big Data
- Data Pipelines and Workflows
- Infrastructure
Most important Docker commands
- Power BI
Power BI — Data Analysis Expressions
- Cloud Data Engineering
Google Cloud Platform services
- Machine Learning Algorithms
Complete 60 Days of Data Science and Machine Learning Series
30 days of Machine Learning Ops
30 Days of Natural Language Processing (NLP) Series
Data Science and Machine Learning Research (papers) Simplified
30 days of Data Engineering with projects Series
60 days of Data Science and ML Series with projects
100 days: Your Data Science and Machine Learning Degree Series with projects
23 Data Science Techniques You Should Know
Tech Interview Series — Curated List of coding questions
Complete System Design with most popular Questions Series
Complete Data Visualization and Pre-processing Series with projects
Complete Python Series with Projects
Complete Advanced Python Series with Projects
Kaggle Best Notebooks that will teach you the most
Complete Developers Guide to Git
Exceptional Github Repos — Part 1
Exceptional Github Repos — Part 2
All the Data Science and Machine Learning Resources
6 Highly Recommended Data Science and Machine Learning Courses that you MUST take (with certificate):
- Complete Data Scientist: https://bit.ly/3wiIo8u
Learn to run data pipelines, design experiments, build recommendation systems, and deploy solutions to the cloud.
- Complete Data Engineering: https://bit.ly/3A9oVs5
Learn to design data models, build data warehouses and data lakes, automate data pipelines, and work with massive datasets.
- Complete Machine Learning Engineer: https://bit.ly/3Tir8ub
Learn advanced machine learning techniques and algorithms, including how to package and deploy your models to a production environment.
- Complete Data Product Manager: https://bit.ly/3QGUtwi
Leverage data to build products that deliver the right experiences, to the right users, at the right time. Lead the development of data-driven products that position businesses to win in their market.
- Complete Natural Language Processing: https://bit.ly/3T7J8qY
Build models on real data, and get hands-on experience with sentiment analysis, machine translation, and more.
- Complete Deep Learning: https://bit.ly/3T5ppIo
Learn to implement neural networks using the deep learning framework PyTorch.