ud120-projects - https://br.udacity.com/course/intro-to-machine-learning--ud120/
Python code for the Udacity Introduction to Machine Learning course. I wrote some of the code files myself to get results different from the given tests.
Get started (check course link)
- Download the Enron email dataset at: https://www.cs.cmu.edu/~./enron/
- Run: python tools/startup.py
- The fabiano_tutoriais folder has assorted files that helped me understand most of the initial concepts
Links:
Class imbalance problem:
- http://storm.cis.fordham.edu/~gweiss/papers/dmin07-weiss.pdf
- http://abricom.org.br/wp-content/uploads/2016/03/bricsccicbic2013_submission_20.pdf
Scikit contrib on imbalanced data
Here is a rough outline of useful approaches to class imbalance, listed approximately in order of effort:
- Do nothing. Sometimes you get lucky and nothing needs to be done. You can train on the so-called natural (or stratified) distribution and sometimes it works without need for modification.
- Balance the training set in some way:
  - Oversample the minority class.
  - Undersample the majority class.
  - Synthesize new minority-class examples.
- Throw away minority examples and switch to an anomaly detection framework.
- At the algorithm level, or after it:
  - Adjust the class weight (misclassification costs).
  - Adjust the decision threshold.
  - Modify an existing algorithm to be more sensitive to rare classes.
- Construct an entirely new algorithm to perform well on imbalanced data.
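
Some of the cheaper approaches in the outline above can be sketched in plain Python. This is only an illustration under stated assumptions, not code from the course or from any library: the helper names (`resample`, `balanced_class_weights`, `predict_with_threshold`) are hypothetical, the toy data is made up, and the weight formula `n_samples / (n_classes * class_count)` is one common "balanced" heuristic rather than the only choice.

```python
import random
from collections import Counter


def group_by_class(samples, labels):
    """Map each label to the list of samples that carry it."""
    groups = {}
    for x, y in zip(samples, labels):
        groups.setdefault(y, []).append(x)
    return groups


def resample(samples, labels, mode="over", seed=0):
    """Balance classes by random oversampling ("over") or undersampling ("under")."""
    rng = random.Random(seed)
    groups = group_by_class(samples, labels)
    sizes = [len(rows) for rows in groups.values()]
    target = max(sizes) if mode == "over" else min(sizes)
    pairs = []
    for y, rows in groups.items():
        if len(rows) >= target:
            # Sample without replacement (keeps every row if already at target).
            chosen = rng.sample(rows, target)
        else:
            # Keep every row, then duplicate random ones to reach the target.
            chosen = rows + [rng.choice(rows) for _ in range(target - len(rows))]
        pairs.extend((x, y) for x in chosen)
    rng.shuffle(pairs)
    return [x for x, _ in pairs], [y for _, y in pairs]


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    return {y: len(labels) / (len(counts) * n) for y, n in counts.items()}


def predict_with_threshold(scores, threshold=0.5):
    """Label a sample positive (1) when its score reaches the threshold."""
    return [1 if s >= threshold else 0 for s in scores]


# Toy data (hypothetical): 90 majority-class and 10 minority-class samples.
labels = [0] * 90 + [1] * 10
samples = list(range(100))

_, over = resample(samples, labels, mode="over")
_, under = resample(samples, labels, mode="under")
print(Counter(over), Counter(under))
print(balanced_class_weights(labels))
print(predict_with_threshold([0.2, 0.4, 0.7], threshold=0.3))
```

Lowering the threshold below 0.5 trades precision for recall on the rare class, which is often the cheapest adjustment to try before resampling.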
Ensemble method Machine Learning
Basic Concepts in Machine Learning
Python beginner - Codecademy
Introducing: Machine Learning in R
Your First Machine Learning Project in R Step-By-Step (tutorial and template for future projects)
Python vs R for machine learning
Pros and Cons of R vs Python scikit-learn
Should you teach Python or R for data science?
Unofficial Windows Binaries for Python Extension Packages - Wheels
Top 6 errors novice machine learning engineers make
Becoming a Machine Learning Engineer | Step 2: Pick a Process
Becoming a Machine Learning Engineer | Step 3: Pick Your Tool
Parametric and Nonparametric Machine Learning Algorithms
ML Mind Map
My Solution to the Galaxy Zoo Challenge
Object Recognition with Convolutional Neural Networks in the Keras Deep Learning Library
TODO TensorFlow Demo: MNIST for ML Beginners
TODO Your First Machine Learning Project in Python Step-By-Step
Understand Bayes' Theorem (Posterior, Likelihood, Prior and Evidence)
Awesome Machine Learning
How do I learn Machine Learning?
Embrace Randomness in Machine Learning
Artificial Neural Networks
- https://pt.stackoverflow.com/questions/192098/como-funciona-uma-rede-neural-artificial
- https://pt.stackoverflow.com/questions/61187/como-implementar-a-camada-oculta-em-uma-rede-neural-de-reconhecimento-de-caracte
- https://pt.stackoverflow.com/questions/40135/explicar-o-algoritmo-svr/40149#40149
- https://stackoverflow.com/questions/2480650/role-of-bias-in-neural-networks
Principal Component Analysis
http://iamtrask.github.io/2015/07/12/basic-python-network/
Deep Learning Book
Keras Cheat Sheet: Neural Networks in Python
- https://www.datacamp.com/community/blog/keras-cheat-sheet
- https://www.datacamp.com/community/tutorials/deep-learning-python
- https://www.datacamp.com/community/tutorials/keras-r-deep-learning
How to Setup a Python Environment for Machine Learning and Deep Learning with Anaconda
- https://machinelearningmastery.com/setup-python-environment-machine-learning-deep-learning-anaconda/
Building a Student Intervention System
Machine Learning Algorithms: Which One to Choose for Your Problem
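If you have already closed Anaconda Navigator but a notebook server is still running, open cmd and type jupyter-notebook list to see which ports are in use.
Then you can free a port by finding the PID of the process listening on it and killing that process (port 3000 and PID 3116 are the values from this example):
- netstat -o -n -a | findstr :3000
- Sample output: TCP 0.0.0.0:3000 0.0.0.0:0 LISTENING 3116
- taskkill /F /PID 3116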
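
The same lookup can be scripted. The sketch below is a hypothetical helper, not part of this repo: it assumes Windows-style `netstat -o -n -a` output with five columns (protocol, local address, foreign address, state, PID), and `kill_port` only works on Windows, where `netstat` and `taskkill` are available.

```python
import subprocess


def find_listening_pids(netstat_output, port):
    """Return PIDs of processes listening on `port`, parsed from netstat text."""
    pids = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed layout: Proto, Local Address, Foreign Address, State, PID
        if len(fields) == 5 and fields[0] == "TCP" and fields[3] == "LISTENING":
            # The port is whatever follows the last colon of the local address.
            if fields[1].rsplit(":", 1)[-1] == str(port):
                pids.add(int(fields[4]))
    return sorted(pids)


def kill_port(port):
    """Windows only: force-kill every process listening on `port`."""
    output = subprocess.run(
        ["netstat", "-o", "-n", "-a"],
        capture_output=True, text=True, check=True,
    ).stdout
    for pid in find_listening_pids(output, port):
        subprocess.run(["taskkill", "/F", "/PID", str(pid)], check=True)


# Parsing the sample line from the text above; nothing is killed here.
sample = "  TCP    0.0.0.0:3000    0.0.0.0:0    LISTENING    3116"
print(find_listening_pids(sample, 3000))  # [3116]
```

Keeping the parsing separate from the kill step lets you check which PIDs would be affected before calling `kill_port`.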
Comparison of 14 different families of classification algorithms on 115 binary datasets
Machine Learning Algorithms for Classification
In what real world applications is Naive Bayes classifier used?
Support Vector Machines and Kernel Methods