Session 10
PMAP 8521: Program evaluation
Andrew Young School of Policy Studies
Arbitrary cutoffs and causal inference
Drawing lines and measuring gaps
Main RDD concerns
Instead of using carefully adjusted DAGs,
we can use context to isolate/identify the pathway between
treatment and outcome in observational data
Diff-in-diff was one kind of quasi-experiment
Treatment/control + before/after
Regression discontinuity designs (RDD) are another
Arbitrary rules determine access to programs
Lots of policies and programs are
based on arbitrary rules and thresholds
If you're above the threshold, you're in the program;
if you're below, you're not (or vice versa)
Running / forcing variable
Index or measure that determines eligibility
Cutoff / cutpoint / threshold
Number that formally assigns access to program
Household size | Poverty line (annual) | Poverty line (monthly) | 138% (annual) | 150% (annual) | 200% (annual) |
---|---|---|---|---|---|
1 | $12,760 | $1,063 | $17,609 | $19,140 | $25,520 |
2 | $17,240 | $1,437 | $23,791 | $25,860 | $34,480 |
3 | $21,720 | $1,810 | $29,974 | $32,580 | $43,440 |
4 | $26,200 | $2,183 | $36,156 | $39,300 | $52,400 |
5 | $30,680 | $2,557 | $42,338 | $46,020 | $61,360 |
6 | $35,160 | $2,930 | $48,521 | $52,740 | $70,320 |
7 | $39,640 | $3,303 | $54,703 | $59,460 | $79,280 |
8 | $44,120 | $3,677 | $60,886 | $66,180 | $88,240 |
Program | Eligibility threshold (% of poverty line) |
---|---|
Medicaid | 138%* |
ACA subsidies | 138–400%* |
CHIP | 200% |
SNAP/Free lunch | 130% |
Reduced lunch | 130–185% |
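A minimal sketch of how a threshold like this becomes a running variable and a cutoff. It uses the annual poverty-line amounts from the table above. The helper name `pct_of_poverty_line()` and the example household are made up for illustration, and real eligibility rules have more conditions than a single percentage.

```r
# Annual poverty-line amounts by household size (from the table above)
poverty_line <- c(12760, 17240, 21720, 26200, 30680, 35160, 39640, 44120)

# Running variable: household income as a percent of the poverty line
# (hypothetical helper name, for illustration only)
pct_of_poverty_line <- function(income, household_size) {
  100 * income / poverty_line[household_size]
}

# Hypothetical household: 4 people earning $36,000 per year
running <- pct_of_poverty_line(36000, 4)

# Cutoff: 138% of the poverty line
below_cutoff <- running <= 138

round(running, 1)
below_cutoff
```

A household earning a few hundred dollars more would land on the other side of the line, which is the kind of near-identical comparison RDD relies on.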
Students take an entrance exam
Those who score 70 or lower
get a free tutor for the year
Students then take an exit exam
at the end of the year
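To make the setup concrete, here is a sketch that simulates data with this structure. Every number in it (sample size, score range, the 10-point tutoring boost, the noise) is an assumption chosen for illustration and is not the course's actual tutoring dataset; only the column names and the rule "70 or lower gets a tutor" come from the slides.

```r
set.seed(1234)
n <- 1000

# Made-up entrance exam scores between 30 and 100 (the running variable)
entrance_exam <- round(runif(n, min = 30, max = 100))

# Sharp assignment rule: a score of 70 or lower means a free tutor
tutoring <- entrance_exam <= 70

# Made-up exit exam scores: a trend in the entrance score, an assumed
# 10-point boost from tutoring, and random noise
exit_exam <- 40 + 0.5 * entrance_exam + 10 * tutoring + rnorm(n, sd = 5)

fake_tutoring <- data.frame(id = 1:n, exit_exam, entrance_exam, tutoring)
head(fake_tutoring)

# Everyone at or below the cutoff is tutored; nobody above it is
table(tutoring = fake_tutoring$tutoring,
      at_or_below_70 = fake_tutoring$entrance_exam <= 70)
```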
The people right before and right after the threshold are essentially the same
Pseudo treatment and control groups!
Compare outcomes for those
right before/after, calculate difference
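The comparison described above can be sketched as a plain difference in means inside a narrow window. The data are simulated with an assumed 10-point tutoring effect, and the 2-point window is an arbitrary choice for illustration.

```r
set.seed(1234)
n <- 2000

# Made-up data: tutoring for scores of 70 or lower, assumed 10-point boost
entrance_exam <- round(runif(n, 30, 100))
tutoring <- entrance_exam <= 70
exit_exam <- 40 + 0.5 * entrance_exam + 10 * tutoring + rnorm(n, sd = 5)

# Keep only students within 2 points of the cutoff (scores 68 to 72)
near <- abs(entrance_exam - 70) <= 2

mean_tutored   <- mean(exit_exam[near & tutoring])
mean_untutored <- mean(exit_exam[near & !tutoring])

# Naive gap: difference in average exit scores right around the cutoff
naive_gap <- mean_tutored - mean_untutored
naive_gap
```

This naive difference still mixes in a little of the underlying trend in entrance scores, which is one reason to fit lines on each side of the cutoff instead of only comparing means.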
Lower turnout in counties on the eastern side of the boundary
Election schedules cause fluctuations in turnout
California requires that insurance cover two days of post-partum hospitalization
Does extra time in the hospital improve health outcomes?
Delivering at 12:01 AM makes you stay longer in the hospital…
…but delivering at 12:01 AM has no effect on readmission rates or mortality rates
Does going to the main state university (e.g. UGA) make you earn more money?
SAT scores are an arbitrary cutoff for accessing the university
Cutoff seems rule-based
Earnings are slightly higher
People love these things!
They're intuitive, compelling, and highly graphical
RDD is less susceptible to p-hacking and selective publication than DID or IV
Measure the gap in outcome for
people on both sides of the cutpoint
Gap = δ =
local average treatment effect (LATE)
The size of the gap depends on how
you draw the lines on each side of the cutoff
The type of lines you choose can
change the estimate of δ—sometimes by a lot!
There's no one right way to draw lines!
Parametric vs. non-parametric lines
Measuring the gap
Bandwidths
Kernels
Formulas with parameters
$y = mx + b$
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2$
$y = 10 + 4x$
Not just for straight lines!
Make curvy with exponents or trigonometry
$y = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^7$
$y = \beta_0 + \beta_1 x + \beta_2 \sin(x)$
$y = 120 - 3x + 0.07x^2$
$y = 300 - 25x + 0.65x^2 - 0.004x^3$
$y = 10 + 4x + 50 \times \sin\left(\frac{x}{4}\right)$
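A small sketch of what "formulas with parameters" means in practice. The data are generated exactly from the quadratic 120 − 3x + 0.07x² with no noise (an assumption made so the result is easy to check), and `lm()` is asked to find the parameters under two different functional forms.

```r
x <- seq(0, 100, by = 1)

# Data generated exactly from y = 120 - 3x + 0.07x^2 (no noise)
y <- 120 - 3 * x + 0.07 * x^2

# A straight line has the wrong functional form for these data
fit_line <- lm(y ~ x)

# A quadratic has the right form; I() keeps ^ as arithmetic in the formula
fit_quad <- lm(y ~ x + I(x^2))

# Approximately 120, -3, and 0.07, up to floating-point error
coef(fit_quad)

# Largest miss for each line: big for the straight line, tiny for the curve
max(abs(resid(fit_line)))
max(abs(resid(fit_quad)))
```

With real data you never know the true formula, so choosing between a line, a quadratic, or something curvier is a judgment call about fit.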
It's important to get the parameters right!
Line should fit the data pretty well
Lines without parameters
Use the data to find the best line,
often with windows and moving averages
Locally estimated/weighted scatterplot smoothing (LOESS/LOWESS)
is a common method (but not the only one!)
y=who knows?
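A sketch of a line without parameters, using base R's `loess()`. The wiggly data are made up for illustration, and the two `span` values are arbitrary choices meant to show how the window size changes the line.

```r
set.seed(1234)
x <- seq(0, 100, length.out = 300)

# Made-up wiggly data; the smoother is never told this formula
y <- 10 + 4 * x + 50 * sin(x / 10) + rnorm(length(x), sd = 10)

# span controls the window: the share of points used for each local fit
smooth_wide   <- loess(y ~ x, span = 0.75)
smooth_narrow <- loess(y ~ x, span = 0.2)

# No single set of coefficients to report; you can only ask for predictions
predictions <- predict(smooth_narrow, newdata = data.frame(x = c(25, 50, 75)))
predictions

# A narrower window follows the data more closely
sd(resid(smooth_wide))
sd(resid(smooth_narrow))
```

There is no formula to write down afterwards, which is the sense in which y = who knows.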
Easiest way: center the running variable around the threshold
id | exit_exam | entrance_exam | entrance_centered | tutoring |
---|---|---|---|---|
1 | 78 | 92 | 22 | FALSE |
2 | 58 | 73 | 3 | FALSE |
3 | 62 | 54 | -16 | TRUE |
4 | 67 | 98 | 28 | FALSE |
5 | 54 | 70 | 0 | TRUE |
$y = \beta_0 + \beta_1 \text{Running variable (centered)} + \beta_2 \text{Indicator for treatment}$
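library(dplyr)  # mutate() and %>%
library(broom)  # tidy()
# Center the running variable at the cutoff (70)
program_data <- tutoring %>% mutate(entrance_centered = entrance_exam - 70)
# The tutoringTRUE coefficient is the gap (δ) at the cutoff
model1 <- lm(exit_exam ~ entrance_centered + tutoring, data = program_data)
tidy(model1)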
## # A tibble: 3 × 3
##   term              estimate std.error
##   <chr>                <dbl>     <dbl>
## 1 (Intercept)         59.3      0.440
## 2 entrance_centered    0.514    0.0268
## 3 tutoringTRUE        11.0      0.802
With nonparametric lines you can't use a single lm() regression; use the rdrobust R package instead
library(rdrobust)
# Nonparametric estimate of the gap at the cutoff (c = 70)
summary(rdrobust(y = tutoring$exit_exam, x = tutoring$entrance_exam, c = 70))
## =============================================================================
##         Method     Coef. Std. Err.         z     P>|z|      [ 95% C.I. ]
## =============================================================================
##   Conventional    -9.992     1.708    -5.852     0.000   [-13.339 , -6.646]
##         Robust         -         -    -4.992     0.000   [-14.244 , -6.212]
## =============================================================================
The coefficient is negative because rdrobust reports the value just above the cutoff minus the value just below it, and here the tutored students are the ones at or below 70.
Read it as a gap of about 10 points in favor of tutoring.
All you really care about is the
area right around the cutoff
Observations far away don't matter
because they're not comparable
Bandwidth = window around cutoff
Algorithms exist to choose optimal width
Also use common sense
Maybe ±5 for the entrance exam?
For robustness, check what happens
if you double and halve the bandwidth
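The double-and-halve check can be sketched as below. The data are simulated with an assumed 10-point tutoring effect, and `gap_within()` is a hypothetical helper written for this example; the baseline bandwidth of 5 is the common-sense guess from the slide, not an algorithmically chosen one.

```r
set.seed(1234)
n <- 5000

# Made-up data: tutoring for scores of 70 or lower, assumed 10-point boost
entrance_exam <- round(runif(n, 30, 100))
tutoring <- entrance_exam <= 70
exit_exam <- 40 + 0.5 * entrance_exam + 10 * tutoring + rnorm(n, sd = 5)
fake <- data.frame(exit_exam, tutoring,
                   entrance_centered = entrance_exam - 70)

# Estimate the gap using only observations within +/- bw of the cutoff
# (hypothetical helper, for illustration only)
gap_within <- function(bw) {
  window <- fake[abs(fake$entrance_centered) <= bw, ]
  model <- lm(exit_exam ~ entrance_centered + tutoring, data = window)
  c(bandwidth = bw,
    n = nrow(window),
    gap = unname(coef(model)["tutoringTRUE"]))
}

# Double the baseline bandwidth, the baseline itself, and half of it
results <- t(sapply(c(10, 5, 2.5), gap_within))
results
```

Narrower windows keep only the most comparable students but leave fewer observations, so the estimate gets noisier. If the gap changes a lot across the three rows, the result is sensitive to the bandwidth.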
Because we care the most about
observations right by the cutoff,
give more distant ones less weight
Kernel = method for assigning importance to
observations based on distance to the cutoff
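A sketch of two common kernels written as plain weight functions. The function names are hypothetical helpers for this example, and the bandwidth of 5 is an arbitrary choice.

```r
# Triangular kernel: weight 1 at the cutoff, falling linearly to 0 at the
# edge of the bandwidth (hypothetical helper, written for illustration)
triangular_weight <- function(distance, bandwidth) {
  pmax(0, 1 - abs(distance) / bandwidth)
}

# Uniform kernel: everything inside the bandwidth counts equally
uniform_weight <- function(distance, bandwidth) {
  as.numeric(abs(distance) <= bandwidth)
}

# Distances from the cutoff, e.g. centered entrance exam scores
distance <- c(-10, -5, -2, 0, 2, 5, 10)

kernel_weights <- data.frame(
  distance,
  triangular = triangular_weight(distance, bandwidth = 5),
  uniform = uniform_weight(distance, bandwidth = 5)
)
kernel_weights
```

Weights like these can be passed to `lm()` through its `weights` argument, so observations closer to the cutoff count for more when the lines are fit.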
Your estimate of δ depends on all these:
Line type (parametric vs. nonparametric)
Bandwidth (wide vs. narrow)
Kernel weighting
Try lots of different combinations!
You need lots of data,
since you're throwing most of it away
You're only measuring the ATE
for people in the bandwidth
Local Average Treatment Effect (LATE)
You can't make population-level
claims with a LATE
(But can you really do that with RCTs or diff-in-diff?)
"The realistic conclusion to draw is that
all quantitative empirical results
that we encounter are 'local'"
Angrist and Pischke, Mostly Harmless Econometrics, pp. 23–24
Super clear breaks are uncommon
Make graphs,
but also find the
actual δ value
People might know about the cutoff
and change their behavior
People might fudge numbers or work to
cross the threshold to get in/out of program
If so, those right next to the cutoff are
no longer comparable treatment/control groups
Check with a McCrary density test,
e.g. with rddensity::rdplotdensity() in R
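The intuition behind a density check can be sketched with base R alone. This is a much cruder check than McCrary's test, not an implementation of it: it only asks whether observations in a narrow window are split about evenly across the cutoff. The data, the 60% manipulation rate, and the `bunching_check()` helper are all made up for illustration.

```r
set.seed(1234)
n <- 2000

# Made-up entrance scores with no manipulation
honest <- round(runif(n, 30, 100))

# Same scores, but 60% of students scoring 71-73 (made-up share) manage to
# report 70 instead so that they qualify for tutoring
manipulated <- honest
movers <- honest %in% 71:73 & runif(n) < 0.6
manipulated[movers] <- 70

# Crude check (not McCrary's test): within a narrow window, are observations
# split about evenly between the two sides of the cutoff?
bunching_check <- function(score, cutoff = 70, window = 3) {
  below <- sum(score > cutoff - window & score <= cutoff)   # 68 to 70
  above <- sum(score > cutoff & score <= cutoff + window)   # 71 to 73
  binom.test(c(below, above), p = 0.5)$p.value
}

p_honest <- bunching_check(honest)
p_manipulated <- bunching_check(manipulated)
p_honest
p_manipulated
```

A very small p-value for the manipulated scores reflects the pile-up just below the cutoff, which is the pattern a formal density test looks for.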
People on the margin of the cutoff
might end up in/out of the program
The ACA, subsidies, Medicaid, and 138% of the poverty line
Sharp vs. fuzzy discontinuities
Sharp: perfect compliance
Fuzzy: imperfect compliance
Address noncompliance with
instrumental variables
(more on this later!)
Use an instrument for which side
of the cutoff people should be on
Effect is only for compliers near the cutoff
(complier LATE; doubly local effect)
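A sketch of the idea with simulated data, using the ratio of two jumps at the cutoff (the Wald form of the instrumental variables estimate) instead of a dedicated IV function. The compliance rates, the 10-point effect, and the bandwidth of 10 are all assumptions made for illustration.

```r
set.seed(1234)
n <- 20000
entrance_exam <- round(runif(n, 30, 100))
centered <- entrance_exam - 70

# Instrument: which side of the cutoff a student should be on
below_cutoff <- centered <= 0

# Imperfect compliance (made-up rates): 80% of eligible students use the
# tutor, and 10% of ineligible students get one anyway
tutoring <- as.numeric(runif(n) < ifelse(below_cutoff, 0.8, 0.1))

# Made-up outcome with an assumed 10-point effect of actually being tutored
exit_exam <- 40 + 0.5 * entrance_exam + 10 * tutoring + rnorm(n, sd = 5)

fake <- data.frame(exit_exam, tutoring, below_cutoff, centered)
near <- fake[abs(fake$centered) <= 10, ]

# First stage: jump in the share tutored at the cutoff
first_stage <- lm(tutoring ~ centered + below_cutoff, data = near)
# Reduced form: jump in exit exam scores at the cutoff
reduced_form <- lm(exit_exam ~ centered + below_cutoff, data = near)

jump_in_tutoring <- unname(coef(first_stage)["below_cutoffTRUE"])
jump_in_outcome  <- unname(coef(reduced_form)["below_cutoffTRUE"])

# Wald / IV estimate: effect for compliers near the cutoff
complier_effect <- jump_in_outcome / jump_in_tutoring
complier_effect
```

The jump in outcomes alone understates the effect because not everyone below the cutoff is tutored; dividing by the jump in tutoring rescales it to the students whose treatment actually changed at the cutoff.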