|
| 1 | +|License| |GitHub forks| |GitHub stars| |
| 2 | + |
| 3 | +Python Machine Learning Notebooks (Tutorial style) |
| 4 | +================================================== |
| 5 | + |
| 6 | +Authored and maintained by Dr. Tirthajyoti Sarkar, Fremont, CA. `Please |
| 7 | +feel free to add me on LinkedIn |
| 8 | +here <https://www.linkedin.com/in/tirthajyoti-sarkar-2127aa7>`__. |
| 9 | + |
| 10 | +-------------- |
| 11 | + |
| 12 | +Requirements |
| 13 | +------------ |
| 14 | + |
| 15 | +- Python 3.5 |
| 16 | +- NumPy (``pip install numpy``) |
| 17 | +- Pandas (``pip install pandas``) |
| 18 | +- Scikit-learn (``pip install scikit-learn``) |
| 19 | +- SciPy (``pip install scipy``) |
| 20 | +- Statsmodels (``pip install statsmodels``) |
| 21 | +- Matplotlib (``pip install matplotlib``) |
| 22 | +- Seaborn (``pip install seaborn``) |
| 23 | +- SymPy (``pip install sympy``) |
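
As a quick sanity check after installing the packages listed above, a short script along these lines can report which ones are importable. This is only a sketch and not part of the repository: the ``check_requirements`` helper and the ``REQUIRED_MODULES`` list are hypothetical names introduced here, and the only assumption beyond the list above is that scikit-learn is imported as ``sklearn``.

```python
from importlib import util

# Import names for the packages in the Requirements list.
# Assumption: scikit-learn is imported as "sklearn"; the other
# import names match their pip package names.
REQUIRED_MODULES = [
    "numpy", "pandas", "sklearn", "scipy",
    "statsmodels", "matplotlib", "seaborn", "sympy",
]


def check_requirements(modules=REQUIRED_MODULES):
    """Return a dict mapping each module name to True if it can be located.

    Hypothetical helper for illustration; it only looks the module up
    and does not import it.
    """
    return {name: util.find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    # str.format is used instead of f-strings to stay compatible with Python 3.5.
    for name, found in check_requirements().items():
        print("{:12s} {}".format(name, "OK" if found else "MISSING"))
```

Any package reported as ``MISSING`` can then be installed with the matching ``pip install`` command from the list.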
| 24 | + |
| 25 | +-------------- |
| 26 | + |
| 27 | +You can start with this article that I wrote for Heartbeat magazine (on |
| 28 | +the Medium platform): |
| 29 | + |
| 30 | +`“Some Essential Hacks and Tricks for Machine Learning with |
| 31 | +Python” <https://heartbeat.fritz.ai/some-essential-hacks-and-tricks-for-machine-learning-with-python-5478bc6593f2>`__ |
| 32 | + |
| 33 | +Essential tutorial-type notebooks on Pandas and Numpy |
| 34 | +----------------------------------------------------- |
| 35 | + |
| 36 | +Jupyter notebooks covering a wide range of functions and operations in |
| 37 | +NumPy, Pandas, Seaborn, Matplotlib, etc. |
| 38 | + |
| 39 | +- `Basic Numpy |
| 40 | + operations <https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Pandas%20and%20Numpy/Basics%20of%20Numpy%20arrays.ipynb>`__ |
| 41 | +- `Basic Pandas |
| 42 | + operations <https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Pandas%20and%20Numpy/Basics%20of%20Pandas%20DataFrame.ipynb>`__ |
| 43 | +- `Basics of visualization with Matplotlib and descriptive |
| 44 | + stats <https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Pandas%20and%20Numpy/Basics%20of%20Matplotlib%20and%20Descriptive%20Statistics.ipynb>`__ |
| 45 | +- `Advanced Pandas |
| 46 | + operations <https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Pandas%20and%20Numpy/Advanced%20Pandas%20Operations.ipynb>`__ |
| 47 | +- `How to read various data |
| 48 | + sources <https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Pandas%20and%20Numpy/Read_data_various_sources/How%20to%20read%20various%20sources%20in%20a%20DataFrame.ipynb>`__ |
| 49 | +- `PDF reading and table processing |
| 50 | + demo <https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Pandas%20and%20Numpy/Read_data_various_sources/PDF%20table%20reading%20and%20processing%20demo.ipynb>`__ |
| 51 | +- `How fast are Numpy operations compared to pure Python |
| 52 | + code? <https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Pandas%20and%20Numpy/How%20fast%20are%20NumPy%20ops.ipynb>`__ |
| 53 | + (Read my |
| 54 | + `article <https://towardsdatascience.com/why-you-should-forget-for-loop-for-data-science-code-and-embrace-vectorization-696632622d5f>`__ |
| 55 | + on Medium related to this topic) |
| 56 | +- `Fast reading from Numpy using .npy file |
| 57 | + format <https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Pandas%20and%20Numpy/Numpy_Reading.ipynb>`__ |
| 58 | + (Read my |
| 59 | + `article <https://towardsdatascience.com/why-you-should-start-using-npy-file-more-often-df2a13cc0161>`__ |
| 60 | + on Medium on this topic) |
| 61 | + |
| 62 | +Regression |
| 63 | +---------- |
| 64 | + |
| 65 | +- Simple linear regression with t-statistic generation (`Here is the |
| 66 | + Notebook <https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Regression/Linear_Regression_Practice.ipynb>`__) |
| 67 | + |
| 68 | +- Multiple ways to perform linear regression in Python and their speed |
| 69 | + comparison (`Here is the |
| 70 | + Notebook <https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Regression/Linear_Regression_Methods.ipynb>`__). |
| 71 | + Also `check the article I wrote on |
| 72 | + freeCodeCamp <https://medium.freecodecamp.org/data-science-with-python-8-ways-to-do-linear-regression-and-measure-their-speed-b5577d75f8b>`__ |
| 73 | + |
| 74 | +- Multi-variate regression with regularization (`Here is the |
| 75 | + Notebook <https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Regression/Multi-variate%20LASSO%20regression%20with%20CV.ipynb>`__) |
| 76 | + |
| 77 | +- Polynomial regression using **scikit-learn pipeline feature** (`Here |
| 78 | + is the |
| 79 | + Notebook <https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Regression/Regularized%20polynomial%20regression%20with%20linear%20and%20random%20sampling.ipynb>`__). |
| 80 | + Also `check the article I wrote on Towards Data |
| 81 | + Science <https://towardsdatascience.com/machine-learning-with-python-easy-and-robust-method-to-fit-nonlinear-data-19e8a1ddbd49>`__. |
| 82 | + |
| 83 | +- Decision trees and Random Forest regression (showing how a Random |
| 84 | + Forest works as a robust, regularized meta-estimator that resists |
| 85 | + overfitting) (`Here is the |
| 86 | + Notebook <https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Regression/Random_Forest_Regression.ipynb>`__). |
| 87 | + |
| 88 | +- Detailed visual analytics and goodness-of-fit diagnostic tests for a |
| 89 | + linear regression problem (`Here is the |
| 90 | + Notebook <https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Regression/Regression_Diagnostics.ipynb>`__). |
| 91 | + |
| 92 | +-------------- |
| 93 | + |
| 94 | +Classification |
| 95 | +-------------- |
| 96 | + |
| 97 | +- Logistic regression/classification (`Here is the |
| 98 | + Notebook <https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Classification/Logistic_Regression_Classification.ipynb>`__). |
| 99 | + |
| 100 | +- *k*-nearest neighbor classification (`Here is the |
| 101 | + Notebook <https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Classification/KNN_Classification.ipynb>`__). |
| 102 | + |
| 103 | +- Decision trees and Random Forest Classification (`Here is the |
| 104 | + Notebook <https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Classification/DecisionTrees_RandomForest_Classification.ipynb>`__). |
| 105 | + |
| 106 | +- Support vector machine classification (`Here is the |
| 107 | + Notebook <https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Classification/Support_Vector_Machine_Classification.ipynb>`__). |
| 108 | + Also `check the article I wrote in Towards Data Science on SVM and |
| 109 | + sorting |
| 110 | + algorithm <https://towardsdatascience.com/how-the-good-old-sorting-algorithm-helps-a-great-machine-learning-technique-9e744020254b>`__. |
| 111 | + |
| 112 | +- Naive Bayes classification (`Here is the |
| 113 | + Notebook <https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Classification/Naive_Bayes_Classification.ipynb>`__). |
| 114 | + |
| 115 | +-------------- |
| 116 | + |
| 117 | +Clustering |
| 118 | +---------- |
| 119 | + |
| 120 | +- *K*-means clustering (`Here is the |
| 121 | + Notebook <https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Clustering-Dimensionality-Reduction/K_Means_Clustering_Practice.ipynb>`__). |
| 122 | + |
| 123 | +- Affinity propagation (showing its time complexity and the effect of |
| 124 | + the damping factor) (`Here is the |
| 125 | + Notebook <https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Clustering-Dimensionality-Reduction/Affinity_Propagation.ipynb>`__). |
| 126 | + |
| 127 | +- Mean-shift technique (showing its time complexity and the effect of |
| 128 | + noise on cluster discovery) (`Here is the |
| 129 | + Notebook <https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Clustering-Dimensionality-Reduction/Mean_Shift_Clustering.ipynb>`__). |
| 130 | + |
| 131 | +- DBSCAN (showing how it can detect areas of high density |
| 132 | + irrespective of cluster shape, which *k*-means fails to do) (`Here |
| 133 | + is the |
| 134 | + Notebook <https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Clustering-Dimensionality-Reduction/DBScan_Clustering.ipynb>`__). |
| 135 | + |
| 136 | +- Hierarchical clustering with dendrograms, showing how to choose the |
| 137 | + optimal number of clusters (`Here is the |
| 138 | + Notebook <https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Clustering-Dimensionality-Reduction/Hierarchical_Clustering.ipynb>`__). |
| 139 | + |
| 140 | +-------------- |
| 141 | + |
| 142 | +Dimensionality reduction |
| 143 | +------------------------ |
| 144 | + |
| 145 | +- Principal component analysis (`Here is the |
| 146 | + Notebook <https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Clustering-Dimensionality-Reduction/Principal%20Component%20Analysis.ipynb>`__) |
| 147 | + |
| 148 | +-------------- |
| 149 | + |
| 150 | +Random data generation using symbolic expressions |
| 151 | +------------------------------------------------- |
| 152 | + |
| 153 | +- Simple script to generate a random polynomial expression/function |
| 154 | + (`Here is the |
| 155 | + Notebook <https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Random%20Function%20Generator/Random_function_generator.ipynb>`__). |
| 156 | + |
| 157 | +- How to use the `SymPy package <https://www.sympy.org/en/index.html>`__ to |
| 158 | + generate random datasets using symbolic mathematical expressions |
| 159 | + (`Here is the |
| 160 | + Notebook <https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Random%20Function%20Generator/Symbolic%20regression%20classification%20generator.ipynb>`__). |
| 161 | + Also, `here is the Python |
| 162 | + script <https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Random%20Function%20Generator/Symbolic_regression_classification_generator.py>`__ |
| 163 | + if anybody wants to use it directly in their project. |
| 164 | + |
| 165 | +- Here is my article on Medium on this topic: `Random regression and |
| 166 | + classification problem generation with symbolic |
| 167 | + expression <https://towardsdatascience.com/random-regression-and-classification-problem-generation-with-symbolic-expression-a4e190e37b8d>`__ |
| 168 | + |
| 169 | +-------------- |
| 170 | + |
| 171 | +Simple deployment examples (serving ML models on a web API) |
| 172 | +----------------------------------------------------------- |
| 173 | + |
| 174 | +- `Serving a linear regression model through a simple HTTP server |
| 175 | + interface <https://github.com/tirthajyoti/Machine-Learning-with-Python/tree/master/Deployment/Linear_regression>`__. |
| 176 | + The user requests predictions by executing a Python script. Uses |
| 177 | + ``Flask`` and ``Gunicorn``. |
| 178 | + |
| 179 | +- `Serving a recurrent neural network (RNN) through an HTTP |
| 180 | + webpage <https://github.com/tirthajyoti/Machine-Learning-with-Python/tree/master/Deployment/rnn_app>`__, |
| 181 | + complete with a web form, where users can input parameters and click |
| 182 | + a button to generate text based on the pre-trained RNN model. Uses |
| 183 | + ``Flask``, ``Jinja``, ``Keras``/``TensorFlow``, ``WTForms``. |
| 184 | + |
| 185 | +-------------- |
| 186 | + |
| 187 | +Object-oriented programming with machine learning |
| 188 | +------------------------------------------------- |
| 189 | + |
| 190 | +Implementing some of the core OOP principles in a machine learning |
| 191 | +context by `building your own Scikit-learn-like estimator, and making it |
| 192 | +better <https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/OOP_in_ML/Class_MyLinearRegression.ipynb>`__. |
| 193 | + |
| 194 | +`Here is the complete Python script with the linear regression |
| 195 | +class <https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/OOP_in_ML/Class_MyLinearRegression.py>`__, |
| 196 | +which supports fitting, prediction, computation of regression metrics, |
| 197 | +outlier plots, diagnostic plots (linearity, constant variance, etc.), |
| 198 | +and computation of variance inflation factors. |
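
To give a flavor of the Scikit-learn-like estimator pattern (state set in ``fit``, reused by ``predict`` and by metric methods), here is a minimal sketch in pure Python. It is not the class from the linked script: ``SimpleLinearEstimator`` is a hypothetical name, it handles only a single feature, and it leaves out the plotting, diagnostics, and variance inflation factors mentioned above.

```python
class SimpleLinearEstimator:
    """Hypothetical single-feature least-squares estimator (illustration only)."""

    def __init__(self):
        # Fitted attributes use a trailing underscore, following the
        # Scikit-learn naming convention.
        self.coef_ = None
        self.intercept_ = None

    def fit(self, x, y):
        """Fit y = intercept_ + coef_ * x by ordinary least squares."""
        n = len(x)
        if n != len(y) or n < 2:
            raise ValueError("x and y must have equal length of at least 2")
        x_mean = sum(x) / n
        y_mean = sum(y) / n
        sxx = sum((xi - x_mean) ** 2 for xi in x)
        if sxx == 0:
            raise ValueError("x must contain at least two distinct values")
        sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
        self.coef_ = sxy / sxx
        self.intercept_ = y_mean - self.coef_ * x_mean
        return self  # returning self allows chaining, e.g. est.fit(x, y).predict(x)

    def predict(self, x):
        """Return predictions for an iterable of feature values."""
        if self.coef_ is None:
            raise RuntimeError("call fit() before predict()")
        return [self.intercept_ + self.coef_ * xi for xi in x]

    def r_squared(self, x, y):
        """Coefficient of determination of the fitted line on (x, y)."""
        y_mean = sum(y) / len(y)
        ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, self.predict(x)))
        ss_tot = sum((yi - y_mean) ** 2 for yi in y)
        if ss_tot == 0:
            raise ValueError("R^2 is undefined when y is constant")
        return 1.0 - ss_res / ss_tot


# Usage: points lying exactly on y = 1 + 2x.
est = SimpleLinearEstimator().fit([1, 2, 3, 4], [3, 5, 7, 9])
print(est.coef_, est.intercept_)
print(est.predict([5]))
```

Keeping the fitted state on the object is the design choice that lets metric and diagnostic methods be added later without changing how ``fit`` is called.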
| 199 | + |
| 200 | +See my articles on Medium on this topic. |
| 201 | + |
| 202 | +- `Object-oriented programming for data scientists: Build your ML |
| 203 | + estimator <https://towardsdatascience.com/object-oriented-programming-for-data-scientists-build-your-ml-estimator-7da416751f64>`__ |
| 204 | + |
| 205 | +- `How a simple mix of object-oriented programming can sharpen your |
| 206 | + deep learning |
| 207 | + prototype <https://towardsdatascience.com/how-a-simple-mix-of-object-oriented-programming-can-sharpen-your-deep-learning-prototype-19893bd969bd>`__ |
| 208 | + |
| 209 | +.. |License| image:: https://img.shields.io/badge/License-BSD%202--Clause-orange.svg |
| 210 | + :target: https://opensource.org/licenses/BSD-2-Clause |
| 211 | +.. |GitHub forks| image:: https://img.shields.io/github/forks/tirthajyoti/Machine-Learning-with-Python.svg |
| 212 | + :target: https://github.com/tirthajyoti/Machine-Learning-with-Python/network |
| 213 | +.. |GitHub stars| image:: https://img.shields.io/github/stars/tirthajyoti/Machine-Learning-with-Python.svg |
| 214 | + :target: https://github.com/tirthajyoti/Machine-Learning-with-Python/stargazers |