Merge pull request DataScienceSpecialization#11 from rdpeng/master
More RR stuff
jtleek committed Feb 13, 2014
2 parents a00aa9c + c0efdbf commit 0665a81
Showing 9 changed files with 107 additions and 50 deletions.
36 changes: 36 additions & 0 deletions 02_RProgramming/functions/firstfunction.R
@@ -0,0 +1,36 @@
add2 <- function(x, y) {
        x + y
}

above10 <- function(x) {
        use <- x > 10
        x[use]
}

above <- function(x, n) {
        use <- x > n
        x[use]
}

above <- function(x, n = 10) {
        use <- x > n
        x[use]
}

columnmean <- function(x) {
        nc <- ncol(x)
        means <- numeric(nc)
        for(i in seq_len(nc)) {
                means[i] <- mean(x[, i])
        }
        means
}

columnmean <- function(x, removeNA = TRUE) {
        nc <- ncol(x)
        means <- numeric(nc)
        for(i in seq_len(nc)) {
                means[i] <- mean(x[, i], na.rm = removeNA)
        }
        means
}
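
The functions in this file can be exercised with a short session like the one below. This is only a sketch: the definitions are copied from the final versions above so the block runs on its own, and the input vectors and the matrix `m` are made-up illustration data, not part of the commit.

```r
# Final versions of the functions from firstfunction.R, repeated here
# so that the example is self-contained.
add2 <- function(x, y) {
        x + y
}

above <- function(x, n = 10) {
        use <- x > n   # logical vector marking elements greater than n
        x[use]
}

columnmean <- function(x, removeNA = TRUE) {
        nc <- ncol(x)
        means <- numeric(nc)
        for(i in seq_len(nc)) {
                means[i] <- mean(x[, i], na.rm = removeNA)
        }
        means
}

# Illustration data (assumed, not from the course materials)
add2(3, 5)          # 8
above(1:20)         # 11 through 20, using the default n = 10
above(1:20, 12)     # 13 through 20

# matrix() fills column by column: column 1 is (1, 2, NA), column 2 is (4, 5, 6)
m <- matrix(c(1, 2, NA, 4, 5, 6), nrow = 3)
columnmean(m)                    # 1.5 5.0
columnmean(m, removeNA = FALSE)  # NA 5
```

The last two calls show why the `removeNA` argument was added: without `na.rm = TRUE`, a single missing value makes the whole column mean `NA`.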
20 changes: 14 additions & 6 deletions 04_ExploratoryAnalysis/announcements.md
@@ -1,16 +1,24 @@
## Week 1 Announcement
## Exploratory Data Analysis: Week 1

Welcome to Week 1 of Exploratory Data Analysis! This course
will focus on developing the tools and techniques for making preliminary examinations of data and for developing a "rough cut" analysis that can be the basis for more formal development.

[SOME STUFF]
I'm very excited to start Exploratory Data Analysis and I hope you are too. Exploratory data analysis (EDA) is a key element of data science because it allows you to develop a rough idea of what your data look like and what kinds of questions might be answered by them. EDA is often the "fun part" of data analysis, where you get to play around with the data and, well, explore!

Please see the course syllabus for information about the quizzes, the project, due dates, and grading. Don't forget to introduce yourself on the message boards. The community developed around these courses is one of the key benefits of taking a MOOC and is one of the places to learn.
As of now the course web site on Coursera is open and you are free to start watching lecture videos, taking the quizzes, and looking at the first programming assignment. As you browse the course web site, please make sure to read through the <b>syllabus</b>, which contains important information about the grading policy for quizzes and programming assignments as well as the course schedule.

The primary way to interact with me in this course is through the <b>discussion forums</b>. Here, you can start new threads by asking questions or you can respond to other people's questions. If you have a question about any aspect of the course, I strongly suggest that you search through the discussion boards first to see if anyone has already asked that question. If you see something similar to what you want to ask, you should up-vote that question using the up-arrow button rather than asking your question separately. The more votes a question or comment gets, the more likely it is that I will see it and be able to respond quickly. Of course, if you don't see a question similar to the one you want to ask, then you should definitely start a new thread on the appropriate forum.

This week will cover the basics of analytic graphics and the base plotting system in R. I recommend that you watch the videos in the order that they are listed on the web page, but watching the videos out of order isn't going to ruin the story. For each lecture video you can download a separate PDF document of the slides (the demonstration videos don't have slides associated with them).

Watching the videos on the Coursera web site is the best way to watch the lectures. However, there are alternative ways to view the lectures if that suits you. You can download the lecture video MP4 files and watch them locally on your computer.

I hope you enjoy the class. I anticipate a fun four weeks!

Roger Peng and the Data Science Team

Roger Peng and the Data Science Track Team

---


## Week 2 Announcement

Welcome to Week 2 of Prediction and Machine Learning! This will be the most lecture-intensive week of the course. The primary goal is to introduce you to how to build predictors in practice.
66 changes: 22 additions & 44 deletions 04_ExploratoryAnalysis/syllabus.md
@@ -1,14 +1,7 @@
## Course Title

Exploratory Data Analysis

---

## Course Instructor(s)
## Exploratory Data Analysis

[Roger D. Peng](http://www.biostat.jhsph.edu/~rpeng/)


---

## Course Description
@@ -34,31 +27,25 @@ Lecture videos will be released weekly and will be available for the week and th

---

## Weekly quizzes

### Quiz 1

Assigned: Class open (1st of Month)
Due: 7th of the Month 12:00 AM UTC


### Quiz 2

Assigned: 8th of the Month 12:01 AM UTC
Due: 14th of the Month 12:00 AM UTC
## Assessments

### Quizzes

### Quiz 3
There will be one quiz every week. The quizzes will all open on the
first day of the course but they will be due weekly. So the Week 1
Quiz will be due at the end of the first week and the Week 2 Quiz will
be due at the end of the second week, etc.

Assigned: 15th of the Month 12:01 AM UTC
Due: 21st of the Month 12:00 AM UTC
Please refer to the individual weekly Quiz deadlines to see the exact
date and time that each Quiz is due.

### Peer Assessments

### Quiz 4

Assigned: 22nd of the Month 12:01 AM UTC
Due: 28th of the Month 12:00 AM UTC

The plotting assignments will be assessed via peer assessment. In
these assignments you will be asked to construct or reproduce certain
plots. You will be evaluated on the plot that you produce and the code
that you write to construct the plot. Assignments evaluated via peer
assessment will make use of your GitHub account.

---

@@ -71,13 +58,18 @@ Background lectures about the content of the course with respect to other quanti

## Quiz Scoring

You may attempt each quiz up to 2 times. Only the score from your final attempt will count toward your grade.
You may attempt each quiz up to 2 times. Only the score from your
final attempt will count toward your grade.

---

## Hard deadlines and soft deadlines

The reported due date is the soft deadline for each quiz. You may turn in quizzes up to two days after the soft deadline. The hard deadline is the Tuesday after the Quiz is due at 23:30 UTC-5:00\. Each day late will incur a 10% penalty, but if you use a late day, the penalty will not be applied to that day.
The reported due date is the soft deadline for each quiz. You may turn
in quizzes up to two days after the soft deadline. The hard deadline
is **two days** after the Quiz is due at 23:30 UTC. Each day late
will incur a 10% penalty, but if you use a late day, the penalty will
not be applied to that day.
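
As a rough illustration of how the penalty arithmetic might work out, here is a minimal R sketch. The function name `late_adjusted_score`, its arguments, the reading of the penalty as a flat 10% of the raw score per penalized day, and the zero score after the hard deadline are all assumptions made for this example; the rule as stated above remains the authoritative one.

```r
# Hypothetical helper, not part of the course materials.
# Assumptions (not confirmed by the syllabus):
#   - each penalized day removes a flat 10% of the raw score
#   - each late day used cancels the penalty for exactly one day
#   - submissions more than two days late (past the hard deadline) earn 0
late_adjusted_score <- function(raw_score, days_late, late_days_used = 0) {
        stopifnot(days_late >= 0, late_days_used >= 0)
        if(days_late > 2) {
                return(0)
        }
        penalized_days <- max(days_late - late_days_used, 0)
        raw_score * (1 - 0.10 * penalized_days)
}

late_adjusted_score(10, days_late = 0)                      # on time, no penalty
late_adjusted_score(10, days_late = 1)                      # one penalized day
late_adjusted_score(10, days_late = 2, late_days_used = 1)  # one day covered by a late day
late_adjusted_score(10, days_late = 2, late_days_used = 2)  # both days covered
```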

---

@@ -87,20 +79,6 @@ You are permitted 5 late days for quizzes in the course. If you use a late day,

---

## Dates for the project

### Submission

Assigned: Class open (1st of Month)
Due: 21st of the Month 12:00 AM UTC

### Review

Assigned: 22nd of the Month 12:01 AM UTC
Due: 28th of the Month 12:00 AM UTC

---

## Typos

* We are prone to a typo or two - please report them and we will try to update the notes accordingly.
Binary file not shown.
Binary file not shown.
35 changes: 35 additions & 0 deletions 05_ReproducibleResearch/announcements.md
@@ -0,0 +1,35 @@
## Reproducible Research: Week 1

I'm very excited to start Reproducible Research and I hope you are too. Reproducible research is the idea that for any data analysis or research product, the data and the computer code that went into doing that analysis are available to others so that they might examine what you've done and reproduce your findings. This idea is becoming increasingly important in the era of "Big Data" where data analyses are becoming more and more complicated and difficult to describe in plain language. This course will cover some of the tools that you can use to make your data analyses reproducible.

As of now the course web site on Coursera is open and you are free to start watching lecture videos, taking the quizzes, and looking at the first programming assignment. As you browse the course web site, please make sure to read through the <b>syllabus</b>, which contains important information about the grading policy for quizzes and programming assignments as well as the course schedule.

The primary way to interact with me in this course is through the <b>discussion forums</b>. Here, you can start new threads by asking questions or you can respond to other people's questions. If you have a question about any aspect of the course, I strongly suggest that you search through the discussion boards first to see if anyone has already asked that question. If you see something similar to what you want to ask, you should up-vote that question using the up-arrow button rather than asking your question separately. The more votes a question or comment gets, the more likely it is that I will see it and be able to respond quickly. Of course, if you don't see a question similar to the one you want to ask, then you should definitely start a new thread on the appropriate forum.

This week will cover the basic ideas of reproducible research since they may be unfamiliar to some of you. We also cover structuring and organizing a data analysis to help make it more reproducible. I recommend that you watch the videos in the order that they are listed on the web page, but watching the videos out of order isn't going to ruin the story. For each lecture video you can download a separate PDF document of the slides (the demonstration videos don't have slides associated with them).

Watching the videos on the Coursera web site is the best way to watch the lectures. However, there are alternative ways to view the lectures if that suits you. You can download the lecture video MP4 files and watch them locally on your computer.

I hope you enjoy the class. I anticipate a fun four weeks!

Roger Peng and the Data Science Team

---

## Reproducible Research: Week 2

Today marks the beginning of Week 2 of Reproducible Research. This week we cover some of the core tools for developing reproducible documents. We cover the literate programming tool knitr and show how to integrate it with Markdown to publish reproducible web documents. This week also introduces the first peer assessment which will require you to write up a reproducible data analysis using knitr.

Roger Peng and the Data Science Team

---

## Reproducible Research: Week 3


---

## Reproducible Research: Week 4


---
Binary file added 05_ReproducibleResearch/knitr/knitr.pdf
Binary file not shown.
Binary file modified 05_ReproducibleResearch/knitr/knitr.pptx
Binary file not shown.
Binary file modified 05_ReproducibleResearch/lectures/CaseStudy_AP.pdf
Binary file not shown.
