Problem Set 4: Dimension Reduction

Submit your assignment here (see workflow in the syllabus for help).

Remember to submit a single rendered PDF (either from .Rmd or a Jupyter Notebook) to this repo by Monday, November 11 at 5 pm.

For the following questions, use the world indicators data from class (countries.csv). Be sure to prepare the data appropriately (e.g., standardize).
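As a minimal sketch of the standardization step, assuming the numeric indicator columns have already been pulled into an array: the `standardize` helper and the toy values below are hypothetical and are not taken from countries.csv.

```python
import numpy as np

def standardize(X):
    """Center each column to mean 0 and scale it to unit sample standard deviation."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    # Guard against constant columns, which would otherwise divide by zero.
    std[std == 0] = 1.0
    return (X - mean) / std

# Toy stand-in for the numeric columns of countries.csv (values are made up).
toy = np.array([[1.0, 200.0, 5.0],
                [2.0, 180.0, 5.0],
                [3.0, 260.0, 5.0],
                [4.0, 240.0, 5.0]])
Z = standardize(toy)
print(Z.mean(axis=0))
print(Z.std(axis=0, ddof=1))
```

Standardizing matters here because both factor analysis and PCA are sensitive to the scale of the inputs; without it, indicators measured in large units would dominate the solution.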

Factor Analysis

How do CFA and EFA differ?
Fit three exploratory factor analysis models with 2, 3, and 4 factors. Present the loadings from each solution and interpret them substantively. How well does each model fit? What do the results suggest about the underlying dimensionality of the data?
Rotate the 3-factor solution using any oblique method you would like and present a visual of the unrotated and rotated versions side-by-side. How do these differ and why does this matter (or not)?
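One way to think about the rotation question is to check numerically what a rotation changes and what it leaves alone. The sketch below uses made-up loadings rather than a fitted model, and it assumes one common convention for oblique rotation: rotated loadings L = A (T')^{-1} with factor correlations Phi = T'T, where T has unit-length columns. All names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical unrotated loadings: 6 indicators on 3 factors (made-up numbers).
A = rng.normal(size=(6, 3))

# Any invertible T with unit-length columns defines an oblique rotation
# under the convention assumed here.
T = rng.normal(size=(3, 3))
T /= np.linalg.norm(T, axis=0)

L = A @ np.linalg.inv(T.T)   # rotated (pattern) loadings
Phi = T.T @ T                # implied factor correlation matrix

# The common part of the model-implied covariance before and after rotation.
common_unrotated = A @ A.T
common_rotated = L @ Phi @ L.T
print(np.allclose(common_unrotated, common_rotated))
```

Under these assumptions the two matrices should agree up to floating-point error: the individual loadings change and the factors become correlated, but the implied covariance, and therefore the fit, does not.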

Principal Components Analysis

What is the statistical difference between PCA and FA? Describe the basic construction of each approach using equations and then point to differences that exist across these two widely used methods for reducing dimensionality.
Fit a PCA model. Present the proportion of explained variance for each of the first 10 components. What do these values tell you substantively (e.g., how many components likely characterize these data)?
Present a biplot of the PCA fit from the previous question. Describe what you see (e.g., which countries cluster together? Which input features do the bulk of the explaining? How do you know this?).
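The explained-variance question can be sketched with a singular value decomposition of the centered data. This is one possible approach, not a required one; the function name is hypothetical and the random matrix merely stands in for the standardized indicators.

```python
import numpy as np

def pca_explained_variance(Z):
    """Return explained-variance ratios, component scores, and loadings (as columns)."""
    Z = np.asarray(Z, dtype=float)
    Zc = Z - Z.mean(axis=0)                      # center each column
    U, s, Vt = np.linalg.svd(Zc, full_matrices=False)
    eigvals = s**2 / (Z.shape[0] - 1)            # variance along each component
    ratio = eigvals / eigvals.sum()              # proportion of total variance
    scores = U * s                               # observations projected onto components
    return ratio, scores, Vt.T

# Random stand-in for the standardized data (50 rows, 12 features; made up).
rng = np.random.default_rng(1)
Z = rng.normal(size=(50, 12))
ratio, scores, loadings = pca_explained_variance(Z)
print(np.round(ratio[:10], 3))             # proportion explained by each component
print(np.round(np.cumsum(ratio[:10]), 3))  # cumulative proportion
```

For a biplot, the first two columns of `scores` would give the points (countries) and the first two columns of `loadings` would give the arrows (input features); longer arrows indicate features that contribute more to those two components.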

Bonus Question (5 points):

Fit a sparse PCA model and a probabilistic PCA model. Compare these results substantively. What does each tell you and why do these distinctions matter in terms of inference (or not)?
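For the probabilistic PCA half of the comparison, a minimal sketch of the closed-form maximum-likelihood solution is shown below, assuming an isotropic noise model and a number of components q smaller than the number of features. The function name and the random data are hypothetical, and this does not cover sparse PCA.

```python
import numpy as np

def ppca_ml(Z, q):
    """Closed-form ML estimates for probabilistic PCA with q components (q < n_features)."""
    Z = np.asarray(Z, dtype=float)
    n, p = Z.shape
    Zc = Z - Z.mean(axis=0)
    S = Zc.T @ Zc / n                        # ML estimate of the covariance matrix
    eigvals, eigvecs = np.linalg.eigh(S)     # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]        # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    sigma2 = eigvals[q:].mean()              # noise variance: mean of discarded eigenvalues
    W = eigvecs[:, :q] * np.sqrt(eigvals[:q] - sigma2)  # loadings, up to rotation
    return W, sigma2

# Random stand-in for the standardized data (made up).
rng = np.random.default_rng(2)
Z = rng.normal(size=(40, 8))
W, sigma2 = ppca_ml(Z, q=2)
model_cov = W @ W.T + sigma2 * np.eye(8)     # covariance implied by the fitted model
print(W.shape, sigma2)
```

Unlike ordinary PCA, this formulation yields an explicit noise variance and a model-implied covariance, which is what makes likelihood-based comparison and inference possible.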
