forked from Mixtape-Sessions/Shift-Share
-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
Showing
8 changed files
with
749 additions
and
34 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -48,3 +48,4 @@ | |
|
||
## Word/Ppt Temp files | ||
**/~$* | ||
.Rproj.user |
File renamed without changes.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,191 @@ | ||
--- | ||
output: github_document | ||
--- | ||
|
||
```{r setup, include = F} | ||
# devtools::install_github("Hemken/Statamarkdown") | ||
library(Statamarkdown) | ||
``` | ||
|
||
|
||
# Mixtape SSIV Workshop: Coding Lab | ||
|
||
This lab will walk through some basic SSIV analyses using data from [Autor, Dorn, and Hanson](https://github.com/Mixtape-Sessions/Shift-Share/blob/main/Readings/Autor_Dorn_Hanson_2013) (ADH, 2013). As discussed in lecture, ADH use a shift-share instrument aggregating Chinese import shocks across 397 manufacturing instruments with exposure weights calculated as the economy-wide share of (baseline) industry employment. We will use three cleaned datasets from their setup: | ||
|
||
- `adh_shocks.dta`: an industry-by-year dataset of the shocks | ||
|
||
- `adh_shares.dta`: a location-by-industry-by-year dataset of the shares | ||
|
||
- `adh_noIV.dta`: a location-by-year dataset of the main outcome (manufacturing employment growth, `y`), treatment (local growth of China import exposure, `x`), and other useful variables -- excluding the ADH instrument | ||
|
||
|
||
## Exercises: | ||
|
||
1. Construct the ADH (location-by-year) instrument by appropriately combining the data on shocks and shares. Merge this into the `adh_noIV` dataset, and estimate an IV regression of the outcome onto the treatment which controls for year (i.e. the `post` variable) and weights by baseline total employment (the `weight` variable), clustering by `state`. Then estimate the exact same IV regression replacing the outcome `y` with the lagged outcome `y_lag`, capturing growth in manufacturing employment that took place before the ADH "China Shock" quasi-experiment. Comment on the difference in the two IV regression coefficients. | ||
|
||
```{r} | ||
## Example solutions for SSIV Lab | ||
## Written by Peter Hull & Kyle Butts | ||
## 5/16/2022 (v1) | ||
devtools::install_github("kylebutts/ssaggregate") | ||
library(tidyverse) # Data Cleaning | ||
library(fixest) # Regressions | ||
library(here) # Relative file paths | ||
library(haven) # Reading .dta | ||
library(ssaggregate) | ||
# Construct z | ||
adh_shares <- haven::read_dta(here("Lab/adh_shares.dta")) | ||
adh_shocks <- haven::read_dta(here("Lab/adh_shocks.dta")) | ||
df <- haven::read_dta(here("Lab/adh_noIV.dta")) | ||
df <- adh_shares |> | ||
# Merge shares with shocks | ||
left_join(adh_shocks, by=c("industry", "year")) |> | ||
# Create \sum (shock * share) | ||
mutate(z = ind_share * shock) |> | ||
group_by(location, year) |> | ||
summarize(z = sum(z)) |> | ||
# Merge with no IV dataset | ||
left_join(df, by = c("location", "year")) | ||
``` | ||
|
||
```{r} | ||
# Basic SSIV and balance | ||
df |> | ||
feols( | ||
y ~ post | 0 | x ~ z, weights = ~weight, cluster = ~state | ||
) | ||
``` | ||
|
||
```{r} | ||
df |> | ||
feols( | ||
y_lag ~ post | 0 | x ~ z, weights = ~weight, cluster = ~state | ||
) | ||
``` | ||
|
||
|
||
*Main IV Estimate:* | ||
*Standard Error:* | ||
|
||
*Lag Outcome IV Estimate:* | ||
*Standard Error:* | ||
|
||
*Comments:* | ||
|
||
|
||
|
||
2. Construct the "sum-of-shares" control from the `adh_shares` dataset and add this control to both of the previous IV regressions. Comment on how the main IV estimate changes. | ||
|
||
```{r} | ||
# Add sum of shares control | ||
df <- adh_shares |> | ||
group_by(location, year) |> | ||
summarize(sum_share = sum(ind_share)) |> | ||
left_join(df, by=c("location", "year")) | ||
summary(df$sum_share) | ||
``` | ||
|
||
```{r} | ||
# SSIV with Sum of Shares | ||
df |> | ||
feols( | ||
y ~ post + sum_share | 0 | x ~ z, weights = ~weight, cluster = ~state | ||
) | ||
``` | ||
|
||
```{r} | ||
# Balance Test with Sum of Shares | ||
df |> | ||
feols( | ||
y_lag ~ post + sum_share | 0 | x ~ z, weights = ~weight, cluster = ~state | ||
) | ||
``` | ||
|
||
*Main IV Estimate:* | ||
*Standard Error:* | ||
|
||
*Lag Outcome IV Estimate:* | ||
*Standard Error:* | ||
|
||
*Comments:* | ||
|
||
|
||
|
||
3. Interact the "sum-of-shares" control with year and add this control to both of the previous IV regressions. Comment on how both IV estimates change. Can you see why the interaction control is important? | ||
|
||
```{r} | ||
# SSIV with Sum of Shares x Year | ||
df |> | ||
feols( | ||
y ~ post + i(post, sum_share) | 0 | x ~ z, weights = ~weight, cluster = ~state | ||
) | ||
``` | ||
|
||
```{r} | ||
# Balance Test with Sum of Shares x Year | ||
df |> | ||
feols( | ||
y_lag ~ post + i(post, sum_share) | 0 | x ~ z, weights = ~weight, cluster = ~state | ||
) | ||
``` | ||
|
||
|
||
```{r} | ||
# Check why sum of shares matters | ||
adh_shocks |> | ||
feols(shock ~ i(year), cluster = ~industry) | ||
``` | ||
|
||
*Main IV Estimate:* | ||
*Standard Error:* | ||
|
||
*Lag Outcome IV Estimate:* | ||
*Standard Error:* | ||
|
||
*Comments:* | ||
|
||
4. Use the *ssaggregate* command to run both of the previous IV regressions at the shock level. You should control for year fixed effects in the shock-level IV regressions. The coefficients should be identical to the previous estimates, but the standard errors will be different. Comment on the change. | ||
|
||
```{r} | ||
# Get exposure-robust SEs with ssaggregate | ||
ssagg <- ssaggregate( | ||
data = df, | ||
vars = ~ y + x + y_lag, | ||
controls = ~ post + i(post, sum_share), | ||
weights = "weight", | ||
n = "industry", | ||
l = "location", | ||
t = "year", | ||
s = "ind_share", | ||
shares = adh_shares | ||
) | ||
ssagg_df <- ssagg |> | ||
left_join(adh_shocks, by = c("industry", "year")) | ||
``` | ||
|
||
```{r} | ||
ssagg_df |> | ||
feols(y ~ 1 | year | x ~ shock, weights = ~s_n, cluster = ~industry) | ||
``` | ||
|
||
```{r} | ||
ssagg_df |> | ||
feols(y_lag ~ 1 | year | x ~ shock, weights = ~s_n, cluster = ~industry) | ||
``` | ||
|
||
*Main IV Estimate:* | ||
*Standard Error:* | ||
|
||
*Lag Outcome IV Estimate:* | ||
*Standard Error:* | ||
|
||
*Comments:* | ||
|
||
|
Oops, something went wrong.