Course materials for General Assembly's Data Science course in Washington, DC (10/2/14 - 12/18/14). View student work in the student repository.
Instructors: Josiah Davis and Kevin Markham
Week | Tuesday | Thursday |
---|---|---|
0 | | 10/2: Introduction |
1 | 10/7: Git and GitHub | 10/9: Base Python |
2 | 10/14: Getting and Cleaning Data | 10/16: Exploratory Analysis |
3 | 10/21: Linear Regression (Milestone: Question and Data Set) | 10/23: Logistic Regression |
4 | 10/28: Machine Learning, KNN | 10/30: Model Evaluation |
5 | 11/4: Clustering (Milestone: Data Exploration and Analysis Plan) | 11/6: Naive Bayes and NLP |
6 | 11/11: Dimension Reduction | 11/13: Decision Trees |
7 | 11/18: Project Working Time (Milestone: First Draft Due) | 11/20: Ensembling: Random Forests |
8 | 11/25: Recommenders | Thanksgiving |
9 | 12/2: Ensembling: Boosting | 12/4: Neural Networks |
10 | 12/9: Review (Milestone: Second Draft Due) | 12/11: Project Working Time |
11 | 12/16: Project Presentations | 12/18: Project Presentations |
- Introduction to General Assembly
- Course overview and philosophy (slides)
- What is data science? (slides)
- Brief demo of Slack
Homework:
- Install Anaconda distribution of Python 2.7, Git, and Slack
- Add a photo to your Slack profile
- Create a GitHub account
- Read Analyzing the Analyzers (40 pages) and think about where you'd like to fit in!
Optional:
- Subscribe to some data-focused newsletters, to keep current: Center for Data Innovation, O'Reilly Data Newsletter, Data Community DC
- Watch Introduction to Data Science and Analysis (50 minutes) for another look at the data science workflow
- Find an open source project hosted on GitHub that interests you
- Homework discussion: Any installation issues? Find any interesting GitHub projects? Any takeaways from "Analyzing the Analyzers"?
- Introduce yourself: What's your technical background? Why did you join this course? How do you define success in this course?
- Office hours
- Git and GitHub lesson (slides)
- Create a repo on GitHub, clone it, make changes, and push up to GitHub
- Fork the DAT3-students repo, clone it, add a Markdown file (`about.md`) in your folder, push up to GitHub, and create a pull request
- Discuss the course project
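
For reference, the clone, change, and push steps above can be sketched on the command line roughly as follows. This is a minimal sketch, not the exact commands used in class: it uses a local bare repository (`remote.git`) as a stand-in for your fork on GitHub so that it runs without a network connection, and the folder name `yourname` and the commit message are hypothetical. Forking and creating the pull request happen on the GitHub website and are not shown.

```shell
#!/bin/sh
set -e

# Work in a throwaway directory so nothing outside it is touched.
work=$(mktemp -d)
cd "$work"

# Stand-in for your fork on GitHub (assumption: in class you would
# clone your fork's URL instead of this local bare repository).
git init --bare -q remote.git

# Clone the repository to get a local working copy.
git clone -q remote.git DAT3-students 2>/dev/null
cd DAT3-students

# Set an identity for this demo repository only.
git config user.name "Example Student"
git config user.email "student@example.com"

# Add a Markdown file in your own folder ("yourname" is hypothetical).
mkdir yourname
printf '# About me\n' > yourname/about.md

# Stage and commit the change locally.
git add yourname/about.md
git commit -q -m "Add about.md"

# Push the current branch up to the remote.
git push -q origin HEAD 2>/dev/null

# Confirm that the commit reached the remote.
git --git-dir="$work/remote.git" log --all --oneline
```

With a real fork, the only differences are the URL passed to `git clone` and the final step of opening a pull request from your fork on GitHub.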
Homework:
- Review the course project information, past projects from other GA students, and public data sources
Optional:
- Clone this repo (DAT3) for easy access to the course files
- Watch Introduction to Git and GitHub (36 minutes) to repeat a lot of today's presentation
- Read the first two chapters of Pro Git for a much deeper understanding of version control and the basic Git commands
- Learn some more Markdown and add it to your `about.md` file, then push those edits to GitHub and send another pull request
- Read this friendly command line tutorial if you are brand new to the command line
- For more project inspiration, browse the student projects from Andrew Ng's Machine Learning course at Stanford
Resources:
- Dillinger is a browser-based Markdown editor, useful for checking your Markdown code
- GitRef is an excellent reference guide for Git commands
- Git quick reference for beginners is a shorter reference guide with commands grouped by workflow
- Homework discussion: Any questions about Git/GitHub? What's one thing you learned from reviewing student projects?
- Why are we programming? Why are we using Python?
- Base Python lesson (code)