Welcome to Week 1 of Obtaining Data! This course will prepare you to collect and clean data for downstream analysis and sharing.
One of the major components of a data scientist's job is to collect and clean data. Whether at a small organization or a major enterprise, the first step in using data is obtaining, cleaning, and understanding it. In this course we will focus on R packages and a few outside tools that can be used to collect data from a variety of sources, from Excel files to databases. We will also cover a variety of formats including JSON, XML, and flat files (.csv, .txt).
The emphasis of this course is on creating tidy data sets that can be used in downstream analyses. Once you have mastered the material in this course you will be ready to learn about the techniques for exploring, analyzing, and summarizing data offered through our course track or other Statistics, Data Science, or Machine Learning MOOCs.
Please see the course syllabus for information about the quizzes, the project, due dates, and grading. Don't forget to say hi on the message boards. The community developed around these courses is one of the best places to learn and one of the best things about taking a MOOC!
Jeff Leek and the Data Science Track Team
Welcome to Week 2 of Obtaining Data! This will be the most lecture-intensive week of the course. The primary goal is to introduce you to the most common data storage systems and the appropriate tools to extract and clean data from those formats.
Remember that the course project is open and ongoing. With the skills you learn this week you should be able to start on the basic analyses that will form the beginnings of your project.
Good luck and have a great week!
Jeff Leek and the Data Science Track Team
Welcome to Week 3 of Obtaining Data! This week we will start to reduce the number of lectures so you can spend more time focusing on your course project. The lectures you do have will focus more on preparing data for downstream analysis and data collection. Your project is due at the end of this week.
If you have trouble or want to explore issues in more depth, please seek out answers on the message boards. They are a great resource! If you happen to be a superstar who already gets it, please take the time to help your classmates by answering their questions as well. This is one of the best ways to practice using and explaining your skills to others. These are two of the key characteristics of excellent data scientists.
Good luck and have a great week!
Jeff Leek and the Data Science Track Team
Welcome to Week 4 of Obtaining Data! In this final week we will focus on peer grading of assignments.
Participating in peer grading is an amazing learning opportunity. It gives you a chance to learn from your fellow students, pick up tips for explaining key ideas, and help others learn as well. We have focused our effort on making the rubric as objective and straightforward to implement as possible. If you have any issues please report them in the forums as described in the syllabus.
Thanks again for all of your efforts in the course. We are in the last stretch. Good luck and have a great week!
Jeff Leek and the Data Science Track Team
Congratulations on finishing Obtaining Data!
We have finalized the grading and released the Statements of Accomplishment for the course. It might take a few hours or days for the statements to be disbursed to accounts.
A couple of other notes:
- The course will begin again in a couple of days. If you are still interested in keeping in touch with your fellow learners, please enroll in the new course and keep the conversation going. You may also be an invaluable resource for new course takers!
- Keep your eye on Hopkins offerings from Coursera. All announcements about future offerings will be posted at https://twitter.com/jhubiostat, http://simplystatistics.org/, and http://twitter.com/simplystats.
- If you liked this course, please consider taking some of the other course offerings through the Data Science Track. Now that you have the ability to collect and clean data, you are perfectly prepared to step into the real fun part of data science - answering questions with all the data you now have access to!
- If you have cool projects you created through the course, please Tweet them to either of the Twitter addresses above so we can see them!
Thanks again for all of your efforts during the class and best of luck in your career!
Jeff Leek and the Data Science Track Team