learningsam20/datasciencecoursera

Data Science Coursera - Data cleansing

Files present

The repository consists of the following files:

  • run_analysis.R takes the UCI HAR dataset as input and performs the required analysis
  • readme.md provides an overview of the processing
  • codebook.md describes the variables in the output file, i.e. the tidy dataset

Analysis performed

The analysis performs the following steps in sequence:

  1. Read the train data: the X and Y variables along with the subject data
  2. Read the test data: the X and Y variables along with the subject data
  3. Read the feature names along with the activity labels
  4. Combine this data into a single dataset and retain only those columns that have std or mean in their names
  5. Aggregate the data by activity and subject, and calculate the mean of every variable within each group
  6. Finally, write the tidy dataset out to a text file
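
The steps above can be sketched in base R roughly as follows. This is a minimal illustration, not the actual contents of run_analysis.R. It assumes the standard layout of the UCI HAR Dataset archive (train/X_train.txt, train/y_train.txt, train/subject_train.txt, the matching test files, features.txt and activity_labels.txt). The helper names read_split, build_tidy and write_tidy, the column names subject, activity_id and activity, and the case-sensitive match on "mean" or "std" are all assumptions made for the example.

```r
# Hypothetical helper: read one split ("train" or "test") and bind
# subject, activity id and measurements into one data frame.
read_split <- function(data_dir, split) {
  path <- function(prefix) {
    file.path(data_dir, split, paste0(prefix, "_", split, ".txt"))
  }
  x <- read.table(path("X"))
  y <- read.table(path("y"), col.names = "activity_id")
  subject <- read.table(path("subject"), col.names = "subject")
  cbind(subject, y, x)
}

# Hypothetical helper: steps 1 to 5 of the list above.
build_tidy <- function(data_dir) {
  # Steps 1-2: read the train and test splits and stack them
  combined <- rbind(read_split(data_dir, "train"),
                    read_split(data_dir, "test"))

  # Step 3: feature names and activity labels
  features <- read.table(file.path(data_dir, "features.txt"),
                         stringsAsFactors = FALSE)[[2]]
  activities <- read.table(file.path(data_dir, "activity_labels.txt"),
                           col.names = c("activity_id", "activity"),
                           stringsAsFactors = FALSE)

  # Step 4: keep only measurement columns whose names contain
  # "mean" or "std" (case-sensitive here, which is an assumption)
  keep <- grep("mean|std", features)
  # +2 skips the subject and activity_id columns
  measurements <- combined[, keep + 2, drop = FALSE]
  names(measurements) <- features[keep]
  activity <- activities$activity[match(combined$activity_id,
                                        activities$activity_id)]

  # Step 5: mean of every retained variable per activity and subject
  aggregate(measurements,
            by = list(activity = activity, subject = combined$subject),
            FUN = mean)
}

# Hypothetical helper: step 6, write the tidy dataset to a text file.
write_tidy <- function(tidy, out_file) {
  write.table(tidy, out_file, row.names = FALSE)
}
```

With the archive unpacked next to the script, a call such as write_tidy(build_tidy("UCI HAR Dataset"), "tidy.txt") would produce one row per activity and subject pair. Both the directory name and the output file name are placeholders here.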

About

This repo consists of R code for data cleansing.
