Instructor: Professor Richard (Rich) Vuduc
Co-creators: Vaishnavi Eleti and Rachel Wiseley
This course is your hands-on introduction to programming techniques relevant to data analysis and machine learning. Most of the programming exercises will be based on Python and SQL.
You will build, "from scratch," the basic components of a data analysis pipeline: collection, preprocessing, storage, analysis, and visualization. You will see many examples of high-level data analysis questions, concepts and techniques for formalizing those questions into mathematical or computational tasks, and methods for translating those tasks into code. Beyond programming languages and best practices, you’ll learn elementary data processing algorithms, notions of program correctness and efficiency, and numerical methods for linear algebra and mathematical optimization.
The basic philosophy of this course is that you'll learn the material best through a combination of reading, thinking, and, most importantly, actively doing. Therefore, you should make an effort to complete all assignments, including any "optional" parts.