Folders with all materials for a specific task or domain
- [AR VR]
- [Class Imbalance Problem]
- [Cloud Computing]
- [AWS (Amazon Web Services)]
- [Data Analysis]
- [Data Analytics]
- [Data Engineering]
- [Data Preprocessing]
- [Data Processing]
- [Data Science Life Cycle Methodologies]
- [Data Warehouse]
- [Data-Centric AI]
- [Data]
- [Graph Neural Networks]
- [Machine Learning Ops (MLOps)]
- [Optimization]
- [Overfitting]
- [Pipelines]
- [SQL]
- [Statistics]
- [Tools and Tips]
Title | Description |
---|---|
MIT OpenCourseWare |
Title | Description |
---|---|
Introduction to Computational Thinking |
Title | Description |
---|---|
MIT 18.S096 Topics in Mathematics w Applications in Finance |
The purpose of the class is to expose undergraduate and graduate students to the mathematical concepts and techniques used in the financial industry. Mathematics lectures are mixed with lectures illustrating the corresponding application in the financial industry. MIT mathematicians teach the mathematics part while industry professionals give the lectures on applications in finance. |
- Data Science for Executives Specialization
- Machine Learning Foundations
Machine Learning Foundations: Linear Algebra, Calculus, Statistics & Computer Science
Title | Description |
---|---|
Data Science for Beginners - A Curriculum | Azure Cloud Advocates at Microsoft are pleased to offer a 10-week, 20-lesson curriculum all about Data Science. Each lesson includes pre-lesson and post-lesson quizzes, written instructions to complete the lesson, a solution, and an assignment. Our project-based pedagogy allows you to learn while building, a proven way for new skills to 'stick'. |
Machine Learning for Beginners - A Curriculum | Azure Cloud Advocates at Microsoft are pleased to offer a 12-week, 26-lesson curriculum all about Machine Learning. In this curriculum, you will learn about what is sometimes called classic machine learning, using primarily Scikit-learn as a library and avoiding deep learning, which is covered in our forthcoming 'AI for Beginners' curriculum. |
start-machine-learning | A complete guide to start and improve in machine learning (ML), artificial intelligence (AI) in 2021 without ANY background in the field and stay up-to-date with the latest news and state-of-the-art techniques |
[Data Science Specialization, Johns Hopkins, Coursera](https://github.com/mGalarnyk/datasciencecoursera) | |
- Python Data Science Handbook
- Hands-On Machine Learning with Scikit-Learn and TensorFlow
- Machine Learning Notebooks: a series of Jupyter notebooks that walk you through the fundamentals of Machine Learning and Deep Learning in Python using Scikit-Learn and TensorFlow.
Title | Description |
---|---|
Awesome Artificial Intelligence (AI) | A curated list of Artificial Intelligence (AI) courses, books, video lectures and papers. |
ml-surveys | Survey papers summarizing advances in deep learning, NLP, CV, graphs, reinforcement learning, recommendations, etc. |
awesome-analytics-engineering | Awesome list of resources for analytics engineers. |
Title | Description |
---|---|
Weight Watcher | WeightWatcher (WW) is an open-source diagnostic tool for analyzing Deep Neural Networks (DNN) without needing access to training or even test data. |
Title | Description, Information |
---|---|
2021: A Year Full of Amazing AI papers - A Review 📌 (work in progress) | A curated list of the latest breakthroughs in AI by release date, each with a clear video explanation, a link to a more in-depth article, and code. |
- TensorFlow Developer Certificate
- Certified Analytics Professional (CAP)
- Cloudera Certified Associate: Data Analyst
- Cloudera Certified Professional: CCP Data Engineer
- Data Science Council of America (DASCA) Senior Data Scientist (SDS)
- Data Science Council of America (DASCA) Principal Data Scientist (PDS)
- Dell EMC Data Science Track
- Google Certified Professional Data Engineer
- Google Data and Machine Learning
- IBM Data Science Professional Certificate
- Microsoft MCSE: Data Management and Analytics
- Microsoft Certified Azure Data Scientist Associate
- Open Certified Data Scientist
- SAS Certified Advanced Analytics Professional
- SAS Certified Big Data Professional
- SAS Certified Data Scientist
- Live Webinars & On-demand Recordings by ODSC COMMUNITY
- Data Science fwdays'19 (playlist)
- Webinars 2020, Computer Science UCU
- Eastern European Machine Learning Summer School, 2020 (Deep Learning and Reinforcement Learning)
- Program
- Practical Sessions 2020, GitHub Repository
- Lex Fridman Podcast | Artificial Intelligence (AI)
- Machine Learning, Andrew Ng, Stanford
- Awesome Data Podcasts
- Software Engineering Blogs: a curated list of engineering blogs.
- AWS Machine Learning Blog
- The Netflix Tech Blog
- Uber Engineering
- NVIDIA Developer
- Towards AI
- Tutorials: AI-related tutorials.
- Data Notes
- Louis Bouchard | @What's AI - Making AI Accessible
- Michael Galarnyk
- Data Science Dojo
Title | Description |
---|---|
Coursera Community Data Science | |
Locally Optimistic | A community for current and aspiring data analytics leaders. Started in NYC in early 2018 as an outgrowth of a slack channel / extremely informal meetup group, we hope to share our thoughts / opinions / experiences / trials / tribulations with others in the community. |
Deepchecks Community | A place to talk about MLOps news, articles, conferences, and really just anything in the MLOps space. |
- DataScience Digest
- Collection of the top articles, videos, events, books and jobs on Machine Learning, Deep Learning, NLP, Computer Vision and other aspects of Data Science.
Research conducted by the Faculty of Applied Sciences at UCU; see the main article for details.
- Linear algebra. Calculus. Statistics and Probability Theory.
- Machine Learning Algorithms: regression, simulation, scenario analysis, modeling, clustering, decision trees, etc.
- Python 3, Pandas, scikit-learn, Keras, TensorFlow, NumPy, PyTorch.
- Data visualization.
- Software engineering methodologies, functional programming or object-oriented programming.
- DevOps: containerization and orchestration.
- Classic DBs (relational or object): MySQL, PostgreSQL, RDS.
- NoSQL (document, wide-column, key-value): MongoDB, Cassandra, HBase, Elasticsearch, Redis, DynamoDB.
- NewSQL (hybrid/in-memory): MemSQL, VoltDB.
- Query engines: Impala, Presto.
- Cloud platforms (GCP, AWS). Cloud computation (Dataflow, Dataproc). Streaming (Pub/Sub, Kafka). Data storage (BigQuery, Cloud SQL, Cloud Spanner, Firestore, BigTable).
- ETL Concepts / Processes.
- Data Warehouse technologies, Data Lake architecture.
- Data modeling: Bachman diagrams, Chen’s Notation, Object-relational mapping, etc.
- Processing frameworks: Apache Spark (PySpark/SparkR/sparklyr), Flink, Beam, Kafka Streams.
- Data pipeline and workflow management tools: Azkaban, Luigi, Airflow, etc.
- Python (PyCharm, Pandas, NumPy, bs4, sklearn, scipy). R.
- Linear algebra. Calculus. Statistics.
- Machine Learning techniques (Decision Trees, Random Forest, SVM, Bayesian, XGBoost, K-Nearest Neighbors) and concepts: regression and classification, clustering, feature selection, feature engineering, the curse of dimensionality, bias-variance tradeoff.
- Data visualization.
- Data Mining (Clustering, Frequent Pattern Mining, Outliers Detection).
- Neural Networks and ML Packages (sklearn, XGBoost, TensorFlow, Keras, H2O).
- Cloud platforms (GCP, AWS). Cloud computation (Dataflow, Dataproc). Streaming (Pub/Sub, Kafka). Data storage (BigQuery, Cloud SQL, Cloud Spanner, Firestore, BigTable).
- Databases: SQL and NoSQL, AWS cloud storage, GDPR data privacy.
- Processing frameworks: Hadoop, Spark.
- Business Intelligence Software (Power BI, Tableau, Qlik, Cognos Analytics).
- Computer science fundamentals, algorithms, mathematics, linear algebra, probability, and statistics.
- Python (Pandas, NumPy, scikit-learn, TensorFlow, Keras).
- Python visualization tools: matplotlib/seaborn, Plotly.
- Machine Learning techniques (Decision Trees, Random Forest, SVM, Bayesian, XGBoost, K-Nearest Neighbors) and concepts: regression and classification, clustering, feature selection, feature engineering, the curse of dimensionality, bias-variance tradeoff.
- Deep Learning: Recurrent Neural Network (LSTM/GRU units), Convolutional Neural Network.
- Machine learning frameworks (TensorFlow, Caffe2, PyTorch, Spark ML, scikit-learn) and ML techniques: GAN, ASR, RL.
- Databases: SQL and NoSQL. Hadoop ecosystem.
- Processing frameworks: Apache Spark (PySpark/SparkR/sparklyr).
- Cloud platforms (GCP, AWS).
- Math, Statistics (regression, properties of distributions, statistical tests, and proper usage, etc.) and Probability Theory.
- Statistical programming software (R, Python, SAS, Matlab).
- Predictive analytics (regression models, time-series analysis and forecasting, survival or duration analysis).
- BI tools: Google Data Studio / Microsoft PowerBI / Tableau.
- Classic DBs: MySQL.
- MS Excel.
- A/B testing.
- Python (sklearn, nltk, gensim, spaCy, TensorFlow, PyTorch, Keras) and Python Data Science toolkit: Jupyter Notebook, Pandas, NumPy, Matplotlib/Seaborn, SciPy.
- Databases: SQL and NoSQL (MySQL, MongoDB, PostgreSQL).
- NLP libraries: NLTK, spaCy, Stanford CoreNLP, etc.
- NLP techniques for text representation: (TF-IDF, Word2Vec), semantic extraction, data structures and modeling.
- Methods of Information Extraction (NER, terminology extraction, keywords extraction, etc.)
- Machine Learning techniques and concepts (regression, trees, SVM, ensembles) for NLP tasks.
- Linear Algebra. Geometry. Calculus. Statistics and Probability theory.
- Python3, numpy, pandas, seaborn, scipy.
- Computer vision / image processing libraries such as: OpenCV, Pillow.
- Neural network architectures: Convolutional Neural Networks (Inception, residual), LSTM, GAN.
- Neural network frameworks: TensorFlow, PyTorch.
- Computer vision algorithms and architectures: object detection, segmentation, face recognition, image processing, video processing.
- Real-time CV systems based on Deep Learning.
- Cloud model training (GCP, AWS), Cloud integration, Cloud Platforms.
- Performance metrics in object detection and classification, such as mAP and related.
- Big Data (Hadoop, Spark, Hive).
- Python3: numpy, scikit-learn, pandas, scipy.
- Statistics (regression, properties of distributions, statistical tests, and proper usage, etc.) and probability theory.
- Deep learning frameworks: TensorFlow, PyTorch, MXNet, Caffe, Keras.
- Deep learning architectures: VGG, ResNet, Inception, MobileNet.
- Deepnets, hyperparameter optimization, visualization, interpretation.
- Machine learning models.
- Software Engineering
- Applied Statistics
- Machine Learning
- Data Wrangling, Manipulation and Visualisation
- Descriptive statistics (What distribution does my data follow, what are the modes of the distribution, the expectation, the variance)
- Probability theory (Given my data follows a Binomial distribution, what is the probability of observing 5 paying customers in 10 click-through events)
- Hypothesis testing (forming the basis of any question on A/B testing: t-tests, ANOVA, chi-squared tests, etc.).
- Regression (Is the relationship between my variables linear, what are potential sources of bias, what are the assumptions behind the ordinary least squares solution)
- Bayesian Inference (What are some advantages/disadvantages vs frequentist methods)
- Introduction to Probability and Statistics, an open course on everything listed above including questions and an exam to help you test your knowledge.
- Machine Learning: A Bayesian and Optimization Perspective by Sergios Theodoridis. This is more a machine learning text than a specific primer on applied statistics, but the linear algebra approaches outlined here really help drive home the key statistical concepts on regression.
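
The probability theory question in the list above (5 paying customers in 10 click-through events under a Binomial model) can be worked through in a few lines of Python. This is only a minimal sketch: the conversion rate `p_convert = 0.5` is an assumed value chosen for illustration, since the list does not give one, and `binomial_pmf` is a hypothetical helper name.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials,
    each succeeding with probability p."""
    if not 0 <= k <= n:
        return 0.0
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Assumed values for illustration only (not taken from the list above).
p_convert = 0.5   # hypothetical chance that one click-through pays
n_clicks = 10     # number of click-through events

prob_five = binomial_pmf(5, n_clicks, p_convert)

# Descriptive statistics of the same Binomial distribution.
expectation = n_clicks * p_convert                  # E[X] = n * p
variance = n_clicks * p_convert * (1 - p_convert)   # Var[X] = n * p * (1 - p)

print(f"P(X = 5) = {prob_five:.4f}")                    # P(X = 5) = 0.2461
print(f"E[X] = {expectation}, Var[X] = {variance}")     # E[X] = 5.0, Var[X] = 2.5
```

With a different assumed conversion rate, only `p_convert` needs to change; the expectation and variance lines tie the example back to the descriptive statistics item in the same list.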