Package 'rddtools'

Title:	Toolbox for Regression Discontinuity Design ('RDD')
Description:	Set of functions for Regression Discontinuity Design ('RDD'), for data visualisation, estimation and testing.
Authors:	Matthieu Stigler [aut] , Bastiaan Quast [aut, cre]
Maintainer:	Bastiaan Quast <[email protected]>
License:	GPL (>= 2)
Version:	2.0.0
Built:	2025-03-20 04:01:51 UTC
Source:	https://github.com/bquast/rddtools

Help Index

Convert a rdd object to lm
Convert an rdd_reg object to a npreg object
Post-inference for clustered data
Testing for balanced covariates: equality of distribution
Testing for balanced covariates: equality of means with t-test
Run the McCracy test for manipulation of the forcing variable
Generate Monte Carlo simulations of Imbens and Kalyanaraman
Dataset used in Lee (2008)
INDH data set
Plot rdd_data
Bin plotting
Draw a (density) plot of placebo tests
Plot the sensitivity to the bandwidth
Bandwidth selection for Regression Discontinuity estimators, CTT 2014
Bandwidth selection for Regression Discontinuity visualisation, CTT 2015
Imbens-Kalyanaraman Optimal Bandwidth Calculation
Global bandwidth selector of Ruppert, Sheather and Wand (1995) from package KernSmooth
RDD coefficient
Construct rdd_data
General polynomial estimator of the regression discontinuity
RDD coefficient prediction
Parametric polynomial estimator of the regression discontinuity
Parametric polynomial estimator of the regression discontinuity
Regression Discontinuity Design
Bandwidth selector
Transformation of the STAR dataset as used in Angrist and Pischke (2008)
Cluster Heteroskedasticity-consistent estimation of the covariance matrix.
Confint allowing vcov

Convert a rdd object to lm

Description

Convert a rdd object to lm

Usage

as.lm(x)
as.lm(x)

Arguments

`x`	An object to convert to lm

Value

An object of class lm

Examples

data(house)
house_rdd <- rdd_data(y=house$y, x=house$x, cutpoint=0)
reg_para <- rdd_reg_lm(rdd_object=house_rdd)
reg_para_lm <- as.lm(reg_para)
reg_para_lm
plot(reg_para_lm, which=4)
data(house)
house_rdd <- rdd_data(y=house$y, x=house$x, cutpoint=0)
reg_para <- rdd_reg_lm(rdd_object=house_rdd)
reg_para_lm <- as.lm(reg_para)
reg_para_lm
plot(reg_para_lm, which=4)

Convert an rdd_reg object to a `npreg` object

Description

Convert an rdd_object to a non-parametric regression npreg from package np

Usage

as.npregbw(x, ...)

as.npreg(x, ...)
as.npregbw(x, ...)

as.npreg(x, ...)

Arguments

`x`	Object of class `rdd_reg` created by `rdd_reg_np` or `rdd_reg_lm`
`...`	Further arguments passed to the `npregbw` or `npreg`

Details

This function converts an rdd_reg object into an npreg object from package np Note that the output won't be the same, since npreg does not offer a triangular kernel, but a Gaussian or Epanechinkov one. Another reason why estimates might differ slightly is that npreg implements a multivariate kernel, while rdd_reg proceeds as if the kernel was univariate. A simple solution to make the multivariate kernel similar to the univariate one is to set the bandwidth for x and Dx to a large number, so that they converge towards a constant, and one obtains back the univariate kernel.

Value

An object of class npreg or npregbw

Examples

# Estimate ususal rdd_reg:
 data(house)
 house_rdd <- rdd_data(y=house$y, x=house$x, cutpoint=0)
 reg_nonpara <- rdd_reg_np(rdd_object=house_rdd)

## Convert to npreg:
 reg_nonpara_np <- as.npreg(reg_nonpara)
 reg_nonpara_np
 rdd_coef(reg_nonpara_np, allCo=TRUE, allInfo=TRUE)

## Compare with result obtained with a Gaussian kernel:
 bw_lm <- dnorm(house_rdd$x, sd=rddtools:::getBW(reg_nonpara))
 reg_nonpara_gaus <- rdd_reg_lm(rdd_object=house_rdd, w=bw_lm)
 all.equal(rdd_coef(reg_nonpara_gaus),rdd_coef(reg_nonpara_np)) 
# Estimate ususal rdd_reg:
 data(house)
 house_rdd <- rdd_data(y=house$y, x=house$x, cutpoint=0)
 reg_nonpara <- rdd_reg_np(rdd_object=house_rdd)

## Convert to npreg:
 reg_nonpara_np <- as.npreg(reg_nonpara)
 reg_nonpara_np
 rdd_coef(reg_nonpara_np, allCo=TRUE, allInfo=TRUE)

## Compare with result obtained with a Gaussian kernel:
 bw_lm <- dnorm(house_rdd$x, sd=rddtools:::getBW(reg_nonpara))
 reg_nonpara_gaus <- rdd_reg_lm(rdd_object=house_rdd, w=bw_lm)
 all.equal(rdd_coef(reg_nonpara_gaus),rdd_coef(reg_nonpara_np))

Post-inference for clustered data

Description

Correct standard-errors to account for clustered data, doing either a degrees of freedom correction or using a heteroskedasticidty-cluster robust covariance matrix possibly on the range specified by bandwidth

Usage

clusterInf(object, clusterVar, vcov. = NULL, type = c("df-adj", "HC"), ...)
clusterInf(object, clusterVar, vcov. = NULL, type = c("df-adj", "HC"), ...)

Arguments

`object`	Object of class lm, from which rdd_reg also inherits.
`clusterVar`	The variable containing the cluster attributions.
`vcov.`	Specific covariance function to pass to coeftest. See help of sandwich
`type`	The type of cluster correction to use: either the degrees of freedom, or a HC matrix.
`...`	Further arguments passed to coeftest

Value

The output of the coeftest function, which is itself of class coeftest

References

Wooldridge (2003) Cluster-sample methods in applied econometrics. AmericanEconomic Review, 93, p. 133-138

Examples

data(house)
house_rdd <- rdd_data(y=house$y, x=house$x, cutpoint=0)
reg_para <- rdd_reg_lm(rdd_object=house_rdd)

# here we just generate randomly a cluster variable:
nlet <- sort(c(outer(letters, letters, paste, sep='')))
clusRandom <- sample(nlet[1:60], size=nrow(house_rdd), replace=TRUE)

# now do post-inference:
clusterInf(reg_para, clusterVar=clusRandom)
clusterInf(reg_para, clusterVar=clusRandom, type='HC')
data(house)
house_rdd <- rdd_data(y=house$y, x=house$x, cutpoint=0)
reg_para <- rdd_reg_lm(rdd_object=house_rdd)

# here we just generate randomly a cluster variable:
nlet <- sort(c(outer(letters, letters, paste, sep='')))
clusRandom <- sample(nlet[1:60], size=nrow(house_rdd), replace=TRUE)

# now do post-inference:
clusterInf(reg_para, clusterVar=clusRandom)
clusterInf(reg_para, clusterVar=clusRandom, type='HC')

Testing for balanced covariates: equality of distribution

Description

Tests equality of distribution with a Kolmogorov-Smirnov for each covariates, between the two full groups or around the discontinuity threshold

Usage

covarTest_dis(
  object,
  bw,
  exact = NULL,
  p.adjust = c("none", "holm", "BH", "BY", "hochberg", "hommel", "bonferroni")
)

## S3 method for class 'rdd_data'
covarTest_dis(
  object,
  bw = NULL,
  exact = FALSE,
  p.adjust = c("none", "holm", "BH", "BY", "hochberg", "hommel", "bonferroni")
)

## S3 method for class 'rdd_reg'
covarTest_dis(
  object,
  bw = NULL,
  exact = FALSE,
  p.adjust = c("none", "holm", "BH", "BY", "hochberg", "hommel", "bonferroni")
)
covarTest_dis(
  object,
  bw,
  exact = NULL,
  p.adjust = c("none", "holm", "BH", "BY", "hochberg", "hommel", "bonferroni")
)

## S3 method for class 'rdd_data'
covarTest_dis(
  object,
  bw = NULL,
  exact = FALSE,
  p.adjust = c("none", "holm", "BH", "BY", "hochberg", "hommel", "bonferroni")
)

## S3 method for class 'rdd_reg'
covarTest_dis(
  object,
  bw = NULL,
  exact = FALSE,
  p.adjust = c("none", "holm", "BH", "BY", "hochberg", "hommel", "bonferroni")
)

Arguments

`object`	object of class rdd_data
`bw`	a bandwidth
`exact`	Argument of the `ks.test` function: NULL or a logical indicating whether an exact p-value should be computed.
`p.adjust`	Whether to adjust the p-values for multiple testing. Uses the `p.adjust` function

Value

A data frame with, for each covariate, the K-S statistic and its p-value.

Author(s)

Matthieu Stigler <[email protected]>

Examples

data(house)

## Add randomly generated covariates
set.seed(123)
n_Lee <- nrow(house)
Z <- data.frame(z1 = rnorm(n_Lee, sd=2), 
                z2 = rnorm(n_Lee, mean = ifelse(house<0, 5, 8)), 
                z3 = sample(letters, size = n_Lee, replace = TRUE))
house_rdd_Z <- rdd_data(y = house$y, x = house$x, covar = Z, cutpoint = 0)

## Kolmogorov-Smirnov test of equality in distribution:
covarTest_dis(house_rdd_Z, bw=0.3)

## Can also use function covarTest_dis() for a t-test for equality of means around cutoff:
covarTest_mean(house_rdd_Z, bw=0.3)
## covarTest_dis works also on regression outputs (bw will be taken from the model)
reg_nonpara <- rdd_reg_np(rdd_object=house_rdd_Z)
covarTest_dis(reg_nonpara)
data(house)

## Add randomly generated covariates
set.seed(123)
n_Lee <- nrow(house)
Z <- data.frame(z1 = rnorm(n_Lee, sd=2), 
                z2 = rnorm(n_Lee, mean = ifelse(house<0, 5, 8)), 
                z3 = sample(letters, size = n_Lee, replace = TRUE))
house_rdd_Z <- rdd_data(y = house$y, x = house$x, covar = Z, cutpoint = 0)

## Kolmogorov-Smirnov test of equality in distribution:
covarTest_dis(house_rdd_Z, bw=0.3)

## Can also use function covarTest_dis() for a t-test for equality of means around cutoff:
covarTest_mean(house_rdd_Z, bw=0.3)
## covarTest_dis works also on regression outputs (bw will be taken from the model)
reg_nonpara <- rdd_reg_np(rdd_object=house_rdd_Z)
covarTest_dis(reg_nonpara)

Testing for balanced covariates: equality of means with t-test

Description

Tests equality of means by a t-test for each covariate, between the two full groups or around the discontinuity threshold

Usage

covarTest_mean(
  object,
  bw = NULL,
  paired = FALSE,
  var.equal = FALSE,
  p.adjust = c("none", "holm", "BH", "BY", "hochberg", "hommel", "bonferroni")
)

## S3 method for class 'rdd_data'
covarTest_mean(
  object,
  bw = NULL,
  paired = FALSE,
  var.equal = FALSE,
  p.adjust = c("none", "holm", "BH", "BY", "hochberg", "hommel", "bonferroni")
)

## S3 method for class 'rdd_reg'
covarTest_mean(
  object,
  bw = NULL,
  paired = FALSE,
  var.equal = FALSE,
  p.adjust = c("none", "holm", "BH", "BY", "hochberg", "hommel", "bonferroni")
)
covarTest_mean(
  object,
  bw = NULL,
  paired = FALSE,
  var.equal = FALSE,
  p.adjust = c("none", "holm", "BH", "BY", "hochberg", "hommel", "bonferroni")
)

## S3 method for class 'rdd_data'
covarTest_mean(
  object,
  bw = NULL,
  paired = FALSE,
  var.equal = FALSE,
  p.adjust = c("none", "holm", "BH", "BY", "hochberg", "hommel", "bonferroni")
)

## S3 method for class 'rdd_reg'
covarTest_mean(
  object,
  bw = NULL,
  paired = FALSE,
  var.equal = FALSE,
  p.adjust = c("none", "holm", "BH", "BY", "hochberg", "hommel", "bonferroni")
)

Arguments

`object`	object of class rdd_data
`bw`	a bandwidth
`paired`	Argument of the `t.test` function: logical indicating whether you want paired t-tests.
`var.equal`	Argument of the `t.test` function: logical variable indicating whether to treat the two variances as being equal
`p.adjust`	Whether to adjust the p-values for multiple testing. Uses the `p.adjust` function

Value

A data frame with, for each covariate, the mean on each size, the difference, t-stat and ts p-value.

Author(s)

Matthieu Stigler <[email protected]>

Examples

data(house)

## Add randomly generated covariates
set.seed(123)
n_Lee <- nrow(house)
Z <- data.frame(z1 = rnorm(n_Lee, sd=2), 
                z2 = rnorm(n_Lee, mean = ifelse(house<0, 5, 8)), 
                z3 = sample(letters, size = n_Lee, replace = TRUE))
house_rdd_Z <- rdd_data(y = house$y, x = house$x, covar = Z, cutpoint = 0)

## test for equality of means around cutoff:
covarTest_mean(house_rdd_Z, bw=0.3)

## Can also use function covarTest_dis() for Kolmogorov-Smirnov test:
covarTest_dis(house_rdd_Z, bw=0.3)

## covarTest_mean works also on regression outputs (bw will be taken from the model)
reg_nonpara <- rdd_reg_np(rdd_object=house_rdd_Z)
covarTest_mean(reg_nonpara)
data(house)

## Add randomly generated covariates
set.seed(123)
n_Lee <- nrow(house)
Z <- data.frame(z1 = rnorm(n_Lee, sd=2), 
                z2 = rnorm(n_Lee, mean = ifelse(house<0, 5, 8)), 
                z3 = sample(letters, size = n_Lee, replace = TRUE))
house_rdd_Z <- rdd_data(y = house$y, x = house$x, covar = Z, cutpoint = 0)

## test for equality of means around cutoff:
covarTest_mean(house_rdd_Z, bw=0.3)

## Can also use function covarTest_dis() for Kolmogorov-Smirnov test:
covarTest_dis(house_rdd_Z, bw=0.3)

## covarTest_mean works also on regression outputs (bw will be taken from the model)
reg_nonpara <- rdd_reg_np(rdd_object=house_rdd_Z)
covarTest_mean(reg_nonpara)

Run the McCracy test for manipulation of the forcing variable

Description

Calls the DCdensity test from package rdd on a rdd_object.

Usage

dens_test(rdd_object, bin = NULL, bw = NULL, plot = TRUE, ...)
dens_test(rdd_object, bin = NULL, bw = NULL, plot = TRUE, ...)

Arguments

`rdd_object`	object of class rdd_data
`bin`	Argument of the `DCdensity` function, the binwidth
`bw`	Argument of the `DCdensity` function, the bandwidth
`plot`	Whether to return a plot. Logical, default ot TRUE.
`...`	Further arguments passed to `DCdensity`.

Examples

data(house)
house_rdd <- rdd_data(y=house$y, x=house$x, cutpoint=0)
dens_test(house_rdd)
data(house)
house_rdd <- rdd_data(y=house$y, x=house$x, cutpoint=0)
dens_test(house_rdd)

Generate Monte Carlo simulations of Imbens and Kalyanaraman

Description

Generate the simulations reported in Imbens and Kalyanaraman (2012)

Usage

gen_mc_ik(
  n = 200,
  version = 1,
  sd = 0.1295,
  output = c("data.frame", "rdd_data"),
  size
)
gen_mc_ik(
  n = 200,
  version = 1,
  sd = 0.1295,
  output = c("data.frame", "rdd_data"),
  size
)

Arguments

`n`	The size of sampel to generate
`version`	The MC version of Imbens and Kalnayaraman (between 1 and 4).
`sd`	The standard deviation of the error term.
`output`	Whether to return a data-frame, or already a rdd_data
`size`	The size of the effect, this depends on the specific version, defaults are as in ik: 0.04, NULL, 0.1, 0.1

Value

An data frame with x and y variables.

Examples

mc1_dat <- gen_mc_ik()
MC1_rdd <- rdd_data(y=mc1_dat$y, x=mc1_dat$x, cutpoint=0)

## Use np regression:
reg_nonpara <- rdd_reg_np(rdd_object=MC1_rdd)
reg_nonpara

# Represent the curves:
plotCu <- function(version=1, xlim=c(-0.1,0.1)){
  res <- gen_mc_ik(sd=0.0000001, n=1000, version=version)
  res <- res[order(res$x),]
  ylim <- range(subset(res, x>=min(xlim) & x<=max(xlim), 'y'))
  plot(res, type='l', xlim=xlim, ylim=ylim, main=paste('DGP', version))
  abline(v=0)
  xCut <- res[which(res$x==min(res$x[res$x>=0]))+c(0,-1),]
  points(xCut, col=2)
}
layout(matrix(1:4,2, byrow=TRUE))
plotCu(version=1)
plotCu(version=2)
plotCu(version=3)
plotCu(version=4)
layout(matrix(1))
mc1_dat <- gen_mc_ik()
MC1_rdd <- rdd_data(y=mc1_dat$y, x=mc1_dat$x, cutpoint=0)

## Use np regression:
reg_nonpara <- rdd_reg_np(rdd_object=MC1_rdd)
reg_nonpara

# Represent the curves:
plotCu <- function(version=1, xlim=c(-0.1,0.1)){
  res <- gen_mc_ik(sd=0.0000001, n=1000, version=version)
  res <- res[order(res$x),]
  ylim <- range(subset(res, x>=min(xlim) & x<=max(xlim), 'y'))
  plot(res, type='l', xlim=xlim, ylim=ylim, main=paste('DGP', version))
  abline(v=0)
  xCut <- res[which(res$x==min(res$x[res$x>=0]))+c(0,-1),]
  points(xCut, col=2)
}
layout(matrix(1:4,2, byrow=TRUE))
plotCu(version=1)
plotCu(version=2)
plotCu(version=3)
plotCu(version=4)
layout(matrix(1))

Dataset used in Lee (2008)

Description

Randomized experiments from non-random selection in U.S. House elections

Dataset described used in Imbens and Kalyamaran (2012), and probably the same dataset used in Lee (2008),

Format

A data frame with 6558 observations and two variables:

x: Vote at election t-1
y: Vote at election t

Source

Guido Imbens webpage: https://scholar.harvard.edu/imbens/scholar_software/regression-discontinuity

References

Imbens, Guido and Karthik Kalyanaraman. (2012) 'Optimal Bandwidth Choice for the regression discontinuity estimator,' Review of Economic Studies (2012) 79, 933-959

Lee, D. (2008) Randomized experiments from non-random selection in U.S. House elections, Journal of Econometrics, 142, 675-697

Examples

data(house)
rdd_house <- rdd_data(x=x, y=y, data=house, cutpoint=0)
summary(rdd_house)
plot(rdd_house)
data(house)
rdd_house <- rdd_data(x=x, y=y, data=house, cutpoint=0)
summary(rdd_house)
plot(rdd_house)

INDH data set

Description

Data from the Initiative Nationale du Development Humaine, collected as the part of the SNSF project "Development Aid and Social Dynamics"

Format

A data frame with two variables with 720 observations each

References

Arcand, Rieger, and Nguyen (2015) 'Development Aid and Social Dyanmics Data Set'

Examples

# load the data
data(indh)

# construct rdd_data frame
rdd_dat_indh <- rdd_data(y=choice_pg, x=poverty, data=indh, cutpoint=30)

# inspect data frame
summary(rdd_dat_indh)

# perform non-parametric regression
( reg_np_indh <- rdd_reg_np(rdd_dat_indh) )
# load the data
data(indh)

# construct rdd_data frame
rdd_dat_indh <- rdd_data(y=choice_pg, x=poverty, data=indh, cutpoint=30)

# inspect data frame
summary(rdd_dat_indh)

# perform non-parametric regression
( reg_np_indh <- rdd_reg_np(rdd_dat_indh) )

Plot rdd_data

Description

Binned plot of the forcing and outcome variable

Usage

## S3 method for class 'rdd_data'
plot(
  x,
  h = NULL,
  nbins = NULL,
  xlim = range(object$x, na.rm = TRUE),
  cex = 0.7,
  nplot = 1,
  device = c("base", "ggplot"),
  ...
)
## S3 method for class 'rdd_data'
plot(
  x,
  h = NULL,
  nbins = NULL,
  xlim = range(object$x, na.rm = TRUE),
  cex = 0.7,
  nplot = 1,
  device = c("base", "ggplot"),
  ...
)

Arguments

`x`	Object of class rdd_data
`h`	The binwidth parameter (note this differs from the bandwidth parameter!)
`nbins`	Alternative to h, the total number of bins in the plot.
`xlim`	The range of the x data
`cex`	Size of the points, see `par`
`nplot`	Number of plot to draw
`device`	Type of device used. Currently not used.
`...`	Further arguments passed to the `plot` function.

Details

Produces a simple binned plot averaging values within each interval. The length of the intervals is specified with the argument h, specifying the whole binwidth (contrary to the usual bandwidth argument, that gives half of the length of the kernel window. When no bandwidth is given, the bandwidth of Ruppert et al is used, see rdd_bw_rsw.

Value

A plot

Author(s)

Matthieu Stigler <[email protected]>

Examples

data(house)
house_rdd <- rdd_data(y=house$y, x=house$x, cutpoint=0)
plot(house_rdd)

## Specify manually the bandwidth:
plot(house_rdd, h=0.2)

## Show three plots with different bandwidth:
plot(house_rdd, h=c(0.2,0.3,0.4), nplot=3)

## Specify instead of the bandwidth, the final number of bins:
plot(house_rdd, nbins=22)

## If the specified number of bins is odd, the larger number is given to side with largest range
plot(house_rdd, nbins=21)
data(house)
house_rdd <- rdd_data(y=house$y, x=house$x, cutpoint=0)
plot(house_rdd)

## Specify manually the bandwidth:
plot(house_rdd, h=0.2)

## Show three plots with different bandwidth:
plot(house_rdd, h=c(0.2,0.3,0.4), nplot=3)

## Specify instead of the bandwidth, the final number of bins:
plot(house_rdd, nbins=22)

## If the specified number of bins is odd, the larger number is given to side with largest range
plot(house_rdd, nbins=21)

Bin plotting

Description

Do a 'scatterplot bin smoothing'

Usage

plotBin(
  x,
  y,
  h = NULL,
  nbins = NULL,
  cutpoint = 0,
  plot = TRUE,
  type = c("value", "number"),
  xlim = range(x, na.rm = TRUE),
  cex = 0.9,
  main = NULL,
  xlab,
  ylab,
  ...
)
plotBin(
  x,
  y,
  h = NULL,
  nbins = NULL,
  cutpoint = 0,
  plot = TRUE,
  type = c("value", "number"),
  xlim = range(x, na.rm = TRUE),
  cex = 0.9,
  main = NULL,
  xlab,
  ylab,
  ...
)

Arguments

`x`	Forcing variable
`y`	Output
`h`	the bandwidth (defaults to `2sd(runvar)length(runvar)^(-.5)`)
`nbins`	number of Bins
`cutpoint`	Cutpoint
`plot`	Logical. Whether to plot or only returned silently
`type`	Whether returns the y averages, or the x frequencies
`xlim`, `cex`, `main`, `xlab`, `ylab`	Usual parameters passed to plot(), see `par`
`...`	further arguments passed to plot.

Value

Returns silently values

References

McCrary, Justin.

Draw a (density) plot of placebo tests

Description

Draw a plot of placebo tests, estimating the impact on fake cutpoints

Usage

plotPlacebo(object, device = c("ggplot", "base"), ...)

## S3 method for class 'rdd_reg'
plotPlacebo(
  object,
  device = c("ggplot", "base"),
  from = 0.25,
  to = 0.75,
  by = 0.1,
  level = 0.95,
  same_bw = FALSE,
  vcov. = NULL,
  plot = TRUE,
  output = c("data", "ggplot"),
  ...
)

plotPlaceboDens(object, device = c("ggplot", "base"), ...)

## S3 method for class 'rdd_reg'
plotPlaceboDens(
  object,
  device = c("ggplot", "base"),
  from = 0.25,
  to = 0.75,
  by = 0.1,
  level = 0.95,
  same_bw = FALSE,
  vcov. = NULL,
  ...
)

computePlacebo(
  object,
  from = 0.25,
  to = 0.75,
  by = 0.1,
  level = 0.95,
  same_bw = FALSE,
  vcov. = NULL
)
plotPlacebo(object, device = c("ggplot", "base"), ...)

## S3 method for class 'rdd_reg'
plotPlacebo(
  object,
  device = c("ggplot", "base"),
  from = 0.25,
  to = 0.75,
  by = 0.1,
  level = 0.95,
  same_bw = FALSE,
  vcov. = NULL,
  plot = TRUE,
  output = c("data", "ggplot"),
  ...
)

plotPlaceboDens(object, device = c("ggplot", "base"), ...)

## S3 method for class 'rdd_reg'
plotPlaceboDens(
  object,
  device = c("ggplot", "base"),
  from = 0.25,
  to = 0.75,
  by = 0.1,
  level = 0.95,
  same_bw = FALSE,
  vcov. = NULL,
  ...
)

computePlacebo(
  object,
  from = 0.25,
  to = 0.75,
  by = 0.1,
  level = 0.95,
  same_bw = FALSE,
  vcov. = NULL
)

Arguments

`object`	the output of an RDD regression
`device`	Whether to draw a base or a ggplot graph.
`from`	Starting point of the fake cutpoints sequence. Refers ot the quantile of each side of the true cutpoint
`to`	Ending point of the fake cutpoints sequence. Refers ot the quantile of each side of the true cutpoint
`by`	Increments of the from-to sequence
`level`	Level of the confidence interval shown
`same_bw`	Whether to re-estimate the bandwidth at each point
`vcov.`	Specific covariance function to pass to coeftest. See help of package `sandwich`.
`plot`	Whether to actually plot the data.
`output`	Whether to return (invisibly) the data frame containing the bandwidths and corresponding estimates, or the ggplot object
`...`	Further arguments passed to specific methods.

Value

A data frame containing the cutpoints, their corresponding estimates and confidence intervals.

Author(s)

Matthieu Stigler <[email protected]>

Examples

data(house)
house_rdd <- rdd_data(y=house$y, x=house$x, cutpoint=0)
reg_nonpara <- rdd_reg_np(rdd_object=house_rdd)
plotPlacebo(reg_nonpara)

# Use with another vcov function; cluster case
reg_nonpara_lminf <- rdd_reg_np(rdd_object=house_rdd, inference='lm')
# need to be a function applied to updated object!
vc <- function(x) vcovCluster(x, clusterVar=model.frame(x)$x)
plotPlacebo(reg_nonpara_lminf, vcov. = vc)
data(house)
house_rdd <- rdd_data(y=house$y, x=house$x, cutpoint=0)
reg_nonpara <- rdd_reg_np(rdd_object=house_rdd)
plotPlacebo(reg_nonpara)

# Use with another vcov function; cluster case
reg_nonpara_lminf <- rdd_reg_np(rdd_object=house_rdd, inference='lm')
# need to be a function applied to updated object!
vc <- function(x) vcovCluster(x, clusterVar=model.frame(x)$x)
plotPlacebo(reg_nonpara_lminf, vcov. = vc)

Plot the sensitivity to the bandwidth

Description

Draw a plot showing the LATE estimates depending on multiple bandwidths

Usage

plotSensi(
  rdd_regobject,
  from,
  to,
  by = 0.01,
  level = 0.95,
  output = c("data", "ggplot"),
  plot = TRUE,
  ...
)

## S3 method for class 'rdd_reg_np'
plotSensi(
  rdd_regobject,
  from,
  to,
  by = 0.05,
  level = 0.95,
  output = c("data", "ggplot"),
  plot = TRUE,
  device = c("ggplot", "base"),
  vcov. = NULL,
  ...
)

## S3 method for class 'rdd_reg_lm'
plotSensi(
  rdd_regobject,
  from,
  to,
  by = 0.05,
  level = 0.95,
  output = c("data", "ggplot"),
  plot = TRUE,
  order,
  type = c("colour", "facet"),
  ...
)
plotSensi(
  rdd_regobject,
  from,
  to,
  by = 0.01,
  level = 0.95,
  output = c("data", "ggplot"),
  plot = TRUE,
  ...
)

## S3 method for class 'rdd_reg_np'
plotSensi(
  rdd_regobject,
  from,
  to,
  by = 0.05,
  level = 0.95,
  output = c("data", "ggplot"),
  plot = TRUE,
  device = c("ggplot", "base"),
  vcov. = NULL,
  ...
)

## S3 method for class 'rdd_reg_lm'
plotSensi(
  rdd_regobject,
  from,
  to,
  by = 0.05,
  level = 0.95,
  output = c("data", "ggplot"),
  plot = TRUE,
  order,
  type = c("colour", "facet"),
  ...
)

Arguments

`rdd_regobject`	object of a RDD regression, from either `rdd_reg_lm` or `rdd_reg_np`
`from`	First bandwidth point. Default value is max(1e-3, bw-0.1)
`to`	Last bandwidth point. Default value is bw+0.1
`by`	Increments in the `from` `to` sequence
`level`	Level of the confidence interval
`output`	Whether to return (invisibly) the data frame containing the bandwidths and corresponding estimates, or the ggplot object
`plot`	Whether to actually plot the data.
`device`	Whether to draw a base or a ggplot graph.
`vcov.`	Specific covariance function to pass to coeftest. See help of package `sandwich`
`order`	For parametric models (from `rdd_reg_lm`), the order of the polynomial.
`type`	For parametric models (from `rdd_reg_lm`) whether different orders are represented as different colour or as different facets.
`...`	Further arguments passed to specific methods

Value

A data frame containing the bandwidths and corresponding estimates and confidence intervals.

Author(s)

Matthieu Stigler <[email protected]>

Examples

data(house)
house_rdd <- rdd_data(y=house$y, x=house$x, cutpoint=0)

#Non-parametric estimate 
bw_ik <- rdd_bw_ik(house_rdd)
reg_nonpara <- rdd_reg_np(rdd_object=house_rdd, bw=bw_ik)
plotSensi(reg_nonpara)
plotSensi(reg_nonpara, device='base')

#Parametric estimate:
reg_para_ik <- rdd_reg_lm(rdd_object=house_rdd, order=4, bw=bw_ik)
plotSensi(reg_para_ik)
plotSensi(reg_para_ik, type='facet')
data(house)
house_rdd <- rdd_data(y=house$y, x=house$x, cutpoint=0)

#Non-parametric estimate 
bw_ik <- rdd_bw_ik(house_rdd)
reg_nonpara <- rdd_reg_np(rdd_object=house_rdd, bw=bw_ik)
plotSensi(reg_nonpara)
plotSensi(reg_nonpara, device='base')

#Parametric estimate:
reg_para_ik <- rdd_reg_lm(rdd_object=house_rdd, order=4, bw=bw_ik)
plotSensi(reg_para_ik)
plotSensi(reg_para_ik, type='facet')

Bandwidth selection for Regression Discontinuity estimators, CTT 2014

Description

Simple wrapper of the Calonico-Cattaneo-Titiunik (2014) bandwidth selection procedures for RDD estimators rdbwselect.

Usage

rdd_bw_cct_estim(
  rdd_object,
  method = c("mserd", "msetwo", "msesum", "msecomb1", "msecomb2", "cerrd", "certwo",
    "cersum", "cercomb1"),
  kernel = c("Triangular", "Uniform", "Epanechnikov"),
  ...
)
rdd_bw_cct_estim(
  rdd_object,
  method = c("mserd", "msetwo", "msesum", "msecomb1", "msecomb2", "cerrd", "certwo",
    "cersum", "cercomb1"),
  kernel = c("Triangular", "Uniform", "Epanechnikov"),
  ...
)

Arguments

`rdd_object`	of class rdd_data created by `rdd_data`
`method`	The type of method used. See
`kernel`	The type of kernel used: either `Triangular`, `Uniform` or `Epanechnikov`.
`...`	further arguments passed to `rdbwselect`.

Value

See documentation of rdbwselect

Author(s)

Original code written by Calonico, Cattaneo, Farrell and Titiuni, see rdbwselect

References

Calonico, S., M. D. Cattaneo, and R. Titiunik. 2014a. Robust Nonparametric Confidence Intervals for Regression-Discontinuity Designs. Econometrica 82(6): 2295-2326. https://www.tandfonline.com/doi/abs/10.1080/01621459.2015.1017578.

Examples

data(house)
rd<- rdd_data(x=house$x, y=house$y, cutpoint=0)
rdd_bw_cct_estim(rd)

data(house)
rd<- rdd_data(x=house$x, y=house$y, cutpoint=0)
rdd_bw_cct_estim(rd)

Bandwidth selection for Regression Discontinuity visualisation, CTT 2015

Description

Simple wrapper of the Calonico-Cattaneo-Titiunik (2015) bandwidth selection procedures for RDD visualisation rdplot.

Usage

rdd_bw_cct_plot(
  rdd_object,
  method = c("esmv", "es", "espr", "esmvpr", "qs", "qspr", "qsmv", "qsmvpr"),
  ...
)
rdd_bw_cct_plot(
  rdd_object,
  method = c("esmv", "es", "espr", "esmvpr", "qs", "qspr", "qsmv", "qsmvpr"),
  ...
)

Arguments

`rdd_object`	of class rdd_data created by `rdd_data`
`method`	The type of method used. See `rdplot`. Default is `esmv`, the variance mimicking evenly-spaced method.
`...`	further arguments passed to `rdplot`.

Value

See documentation of rdplot

Author(s)

Original code written by Calonico, Cattaneo, Farrell and Titiuni, see rdplot

References

Calonico, S., M. D. Cattaneo, and R. Titiunik. 2015a. Optimal Data-Driven Regression Discontinuity Plots. Journal of the American Statistical Association 110(512): 1753-1769. https://www.tandfonline.com/doi/abs/10.1080/01621459.2015.1017578.

Examples

data(house)
rd<- rdd_data(x=house$x, y=house$y, cutpoint=0)
rdd_bw_cct_plot(rd)

data(house)
rd<- rdd_data(x=house$x, y=house$y, cutpoint=0)
rdd_bw_cct_plot(rd)

Imbens-Kalyanaraman Optimal Bandwidth Calculation

Description

Imbens-Kalyanaraman optimal bandwidth for local linear regression in Regression discontinuity designs.

Usage

rdd_bw_ik(rdd_object, kernel = c("Triangular", "Uniform", "Normal"))
rdd_bw_ik(rdd_object, kernel = c("Triangular", "Uniform", "Normal"))

Arguments

`rdd_object`	of class rdd_data created by `rdd_data`
`kernel`	The type of kernel used: either `triangular` or `uniform`.

Value

The optimal bandwidth

Author(s)

Matthieu Stigler <[email protected]>

References

Imbens, Guido and Karthik Kalyanaraman. (2012) 'Optimal Bandwidth Choice for the regression discontinuity estimator,' Review of Economic Studies (2012) 79, 933-959

Examples

data(house)
rd<- rdd_data(x=house$x, y=house$y, cutpoint=0)
rdd_bw_ik(rd)
data(house)
rd<- rdd_data(x=house$x, y=house$y, cutpoint=0)
rdd_bw_ik(rd)

Global bandwidth selector of Ruppert, Sheather and Wand (1995) from package KernSmooth

Description

Uses the global bandwidth selector of Ruppert, Sheather and Wand (1995) either to the whole function, or to the functions below and above the cutpoint.

Usage

rdd_bw_rsw(object, type = c("global", "sided"))
rdd_bw_rsw(object, type = c("global", "sided"))

Arguments

`object`	object of class rdd_data created by `rdd_data`
`type`	Whether to choose a global bandwidth for the whole function (`global`) or for each side (`sided`)

Value

One (or two for sided) bandwidth value.

References

See dpill

Examples

data(house)
rd<- rdd_data(x=house$x, y=house$y, cutpoint=0)
rdd_bw_rsw(rd)
data(house)
rd<- rdd_data(x=house$x, y=house$y, cutpoint=0)
rdd_bw_rsw(rd)

RDD coefficient

Description

Function to access the RDD coefficient in the various regressions

Usage

rdd_coef(object, allInfo = FALSE, allCo = FALSE, ...)

## Default S3 method:
rdd_coef(object, allInfo = FALSE, allCo = FALSE, ...)

## S3 method for class 'rdd_reg_np'
rdd_coef(object, allInfo = FALSE, allCo = FALSE, ...)
rdd_coef(object, allInfo = FALSE, allCo = FALSE, ...)

## Default S3 method:
rdd_coef(object, allInfo = FALSE, allCo = FALSE, ...)

## S3 method for class 'rdd_reg_np'
rdd_coef(object, allInfo = FALSE, allCo = FALSE, ...)

Arguments

`object`	A RDD regression object
`allInfo`	whether to return just the coefficients (allInfo=FALSE) or also the se/t stat/pval.
`allCo`	Whether to give only the RDD coefficient (allCo=FALSE) or all coefficients
`...`	Further arguments passed to/from specific methods

Value

Either a numeric value of the RDD coefficient estimate, or a data frame with the estimate, its standard value, t test and p-value and

Construct rdd_data

Description

Construct the base RDD object, containing x, y and the cutpoint, eventuallay covariates.

Usage

rdd_data(y, x, covar, cutpoint, z, labels, data)
rdd_data(y, x, covar, cutpoint, z, labels, data)

Arguments

`y`	Output
`x`	Forcing variable
`covar`	Exogeneous variables
`cutpoint`	Cutpoint
`z`	Assignment variable for the fuzzy case. Should be 0/1 or TRUE/FALSE variable.
`labels`	Additional labels to provide as list (with entries `x`, `y`, and eventually vector `covar`). Unused currently.
`data`	A data-frame for the `x` and `y` variables. If this is provided, the column names can be entered directly for argument `x`, `y` and `covar`. For `covar`, should be a character vector.

Details

Arguments x, y (and eventually covar) can be either given as:

vectors (eventually data-frame for covar)
quote/character when data is also provided. For multiple covar, use a vector of characters

Value

Object of class rdd_data, inheriting from data.frame

Author(s)

Matthieu Stigler [email protected]

Examples

data(house)
rd <- rdd_data(x=house$x, y=house$y, cutpoint=0)
rd2 <- rdd_data(x=x, y=y, data=house, cutpoint=0)

# The print() function is the same as the print.data.frame:
rd

# The summary() and plot() function are specific to rdd_data
summary(rd)
plot(rd)

# for the fuzzy case, you need to specify the assignment variable z:
rd_dat_fakefuzzy <- rdd_data(x=house$x, y=house$y, 
                             z=ifelse(house$x>0+rnorm(nrow(house), sd=0.05),1,0), 
                             cutpoint=0)
summary(rd_dat_fakefuzzy)
data(house)
rd <- rdd_data(x=house$x, y=house$y, cutpoint=0)
rd2 <- rdd_data(x=x, y=y, data=house, cutpoint=0)

# The print() function is the same as the print.data.frame:
rd

# The summary() and plot() function are specific to rdd_data
summary(rd)
plot(rd)

# for the fuzzy case, you need to specify the assignment variable z:
rd_dat_fakefuzzy <- rdd_data(x=house$x, y=house$y, 
                             z=ifelse(house$x>0+rnorm(nrow(house), sd=0.05),1,0), 
                             cutpoint=0)
summary(rd_dat_fakefuzzy)

General polynomial estimator of the regression discontinuity

Description

Compute RDD estimate allowing a locally kernel weighted version of any estimation function possibly on the range specified by bandwidth

Usage

rdd_gen_reg(
  rdd_object,
  fun = glm,
  covariates = NULL,
  order = 1,
  bw = NULL,
  slope = c("separate", "same"),
  covar.opt = list(strategy = c("include", "residual"), slope = c("same", "separate"), bw
    = NULL),
  weights,
  ...
)
rdd_gen_reg(
  rdd_object,
  fun = glm,
  covariates = NULL,
  order = 1,
  bw = NULL,
  slope = c("separate", "same"),
  covar.opt = list(strategy = c("include", "residual"), slope = c("same", "separate"), bw
    = NULL),
  weights,
  ...
)

Arguments

`rdd_object`	Object of class rdd_data created by `rdd_data`
`fun`	The function to estimate the parameters
`covariates`	Formula to include covariates
`order`	Order of the polynomial regression.
`bw`	A bandwidth to specify the subset on which the kernel weighted regression is estimated
`slope`	Whether slopes should be different on left or right (separate), or the same.
`covar.opt`	Options for the inclusion of covariates. Way to include covariates, either in the main regression (`include`) or as regressors of y in a first step (`residual`).
`weights`	Optional weights to pass to the lm function. Note this cannot be entered together with `bw`
`...`	Further arguments passed to fun. See the example.

Details

This function allows the user to use a custom estimating function, instead of the traditional lm(). It is assumed that the custom funciton has following behaviour:

A formula interface, together with a data argument
A weight argument
A coef(summary(x)) returning a data-frame containing a column Estimate

Note that for the last requirement, this can be accomodated by writing a specific rdd_coef function for the class of the object returned by fun.

Value

An object of class rdd_reg_lm and class lm, with specific print and plot methods

References

TODO

Examples

## Step 0: prepare data
data(house)
house_rdd <- rdd_data(y=house$y, x=house$x, cutpoint=0)

## Estimate a local probit:
house_rdd$y <- with(house_rdd, ifelse(y<quantile(y, 0.25), 0,1))
reg_bin_glm <- rdd_gen_reg(rdd_object=house_rdd, fun= glm, family=binomial(link='probit'))
print(reg_bin_glm)
summary(reg_bin_glm)

## Step 0: prepare data
data(house)
house_rdd <- rdd_data(y=house$y, x=house$x, cutpoint=0)

## Estimate a local probit:
house_rdd$y <- with(house_rdd, ifelse(y<quantile(y, 0.25), 0,1))
reg_bin_glm <- rdd_gen_reg(rdd_object=house_rdd, fun= glm, family=binomial(link='probit'))
print(reg_bin_glm)
summary(reg_bin_glm)

RDD coefficient prediction

Description

Function to predict the RDD coefficient in presence of covariate (without covariates, returns the same than rdd_coef)

Usage

rdd_pred(
  object,
  covdata,
  se.fit = TRUE,
  vcov. = NULL,
  newdata,
  stat = c("identity", "sum", "mean"),
  weights
)
rdd_pred(
  object,
  covdata,
  se.fit = TRUE,
  vcov. = NULL,
  newdata,
  stat = c("identity", "sum", "mean"),
  weights
)

Arguments

`object`	A RDD regression object
`covdata`	New data.frame specifying the values of the covariates, can have multiple rows.
`se.fit`	A switch indicating if standard errors are required.
`vcov.`	Specific covariance function (see package sandwich ), by default uses the `vcov`
`newdata`	Another data on which to evaluate the x/D variables. Useful in very few cases.
`stat`	The statistic to use if there are multiple predictions, 'identity' just returns the single values, 'mean' averages them
`weights`	Eventual weights for the averaging of the predicted values.

Details

The function rdd_pred does a simple prediction of the RDD effect

$RDDeffect= \mu(x, z, D=1) - \mu(x, z, D=0)$

When there are no covariates (and z is irrelevant in the equation above), this amounts exactly to the usual RDD coefficient, shown in the outputs, or obtained with rdd_coef. If there were covariates, and if these covariates were estimated using the “include” strategy and with different coefficients left and right to the cutoff (i.e. had argument slope = “separate”), than the RDD effect is also dependent on the value of the covariate(s). rdd_pred allows to set the value of the covariate(s) at which to evaluate the RDD effect, by providing a data.frame with the values for the covariates. Note that the effect can be evaluated at multiple points, if you provide multiple rows of covdata.

In pressence of covariate-specific RDD effect, one may wish to estimate an average effect. This can be done by setting the argument stat='mean'. Weights can additionally be added, with the argument weights, to obtain a weighted-average of the predictions. Note however that in most cases, this will be equivalent to provide covariates at their (weighted) mean value, which will be much faster also!

Standard errors, obtained setting the argument se.fit=TRUE, are computed using following formula:

$x_i \Omega x_i^{'}$

where $\Omega$ is the estimated variance-covariance matrix ( by default $\sigma^2(X^{'}X)^{-1}$ using vcov) and $x_i$ is the input data (a mix of covdata and input data). If one wishes individual predictions, standard errors are simply obtained as the square of that diagonal matrix, whereas for mean/sum, covariances are taken into account.

Value

Returns the predicted value(s), and, if se.fit=TRUE, their standard errors.

References

Froehlich (2007) Regression discontinuity design with covariates, IZA discussion paper 3024

Examples

# Load data, add (artificial) covariates:
data(house)
n_Lee <- nrow(house)
z1 <- runif(n_Lee)
house_rdd <- rdd_data(y=y, x=x, data=house, covar=z1, cutpoint=0)

# estimation without covariates: rdd_pred is the same than rdd_coef:
reg_para <- rdd_reg_lm(rdd_object=house_rdd)

rdd_pred(reg_para)
rdd_coef(reg_para, allInfo=TRUE)

# estimation with covariates: 
reg_para_cov <- rdd_reg_lm(rdd_object=house_rdd,
                          covariates='z1',
                          covar.opt=list(slope='separate') )

# should obtain same result as with RDestimate                             
rdd_pred(reg_para_cov, covdata=data.frame(z1=0)) 
  
# evaluate at mean of z1 (as comes from uniform)
rdd_pred(reg_para_cov, covdata=data.frame(z1=0.5))
# Load data, add (artificial) covariates:
data(house)
n_Lee <- nrow(house)
z1 <- runif(n_Lee)
house_rdd <- rdd_data(y=y, x=x, data=house, covar=z1, cutpoint=0)

# estimation without covariates: rdd_pred is the same than rdd_coef:
reg_para <- rdd_reg_lm(rdd_object=house_rdd)

rdd_pred(reg_para)
rdd_coef(reg_para, allInfo=TRUE)

# estimation with covariates: 
reg_para_cov <- rdd_reg_lm(rdd_object=house_rdd,
                          covariates='z1',
                          covar.opt=list(slope='separate') )

# should obtain same result as with RDestimate                             
rdd_pred(reg_para_cov, covdata=data.frame(z1=0)) 
  
# evaluate at mean of z1 (as comes from uniform)
rdd_pred(reg_para_cov, covdata=data.frame(z1=0.5))

Parametric polynomial estimator of the regression discontinuity

Description

Compute a parametric polynomial regression of the ATE, possibly on the range specified by bandwidth

Usage

rdd_reg_lm(
  rdd_object,
  covariates = NULL,
  order = 1,
  bw = NULL,
  slope = c("separate", "same"),
  covar.opt = list(strategy = c("include", "residual"), slope = c("same", "separate"), bw
    = NULL),
  covar.strat = c("include", "residual"),
  weights
)
rdd_reg_lm(
  rdd_object,
  covariates = NULL,
  order = 1,
  bw = NULL,
  slope = c("separate", "same"),
  covar.opt = list(strategy = c("include", "residual"), slope = c("same", "separate"), bw
    = NULL),
  covar.strat = c("include", "residual"),
  weights
)

Arguments

`rdd_object`	Object of class rdd_data created by `rdd_data`
`covariates`	Formula to include covariates
`order`	Order of the polynomial regression.
`bw`	A bandwidth to specify the subset on which the parametric regression is estimated
`slope`	Whether slopes should be different on left or right (separate), or the same.
`covar.opt`	Options for the inclusion of covariates. Way to include covariates, either in the main regression (`include`) or as regressors of y in a first step (`residual`).
`covar.strat`	DEPRECATED, use covar.opt instead.
`weights`	Optional weights to pass to the lm function. Note this cannot be entered together with `bw`

Details

This function estimates the standard discontinuity regression:

$Y=\alpha+\tau D+\beta_{1}(X-c)+\beta_{2}D(X-c)+\epsilon$

with $\tau$ the main parameter of interest. Several versions of the regression can be estimated, either restricting the slopes to be the same, i.e $\beta_{1}=\beta_{2}$ (argument slope). The order of the polynomial in $X-c$ can also be adjusted with argument order. Note that a value of zero can be used, which corresponds to the simple difference in means, that one would use if the samples were random. Covariates can also be added in the regression, according to the two strategies discussed in Lee and Lemieux (2010, sec 4.5), through argument covar.strat:

include: Covariates are simply added as supplementary regressors in the RD equation
residual: The dependent variable is first regressed on the covariates only, then the RDD equation is applied on the residuals from this first step

The regression can also be estimated in a neighborhood of the cutpoint with the argument bw. This make the parametric regression resemble the non-parametric local kernel rdd_reg_np. Similarly, weights can also be provided (but not simultaneously to bw).

The returned object is a classical lm object, augmented with a RDDslot, so usual methods can be applied. As is done in general in R, heteroskeadsticity-robust inference can be done later on with the usual function from package sandwich. For the case of clustered observations a specific function clusterInf is provided.

Value

An object of class rdd_reg_lm and class lm, with specific print and plot methods

Examples

## Step 0: prepare data
data(house)
house_rdd <- rdd_data(y=house$y, x=house$x, cutpoint=0)
## Step 2: regression
# Simple polynomial of order 1:
reg_para <- rdd_reg_lm(rdd_object=house_rdd)
print(reg_para)
plot(reg_para)

# Simple polynomial of order 4:
reg_para4 <- rdd_reg_lm(rdd_object=house_rdd, order=4)
reg_para4
plot(reg_para4)

# Restrict sample to bandwidth area:
bw_ik <- rdd_bw_ik(house_rdd)
reg_para_ik <- rdd_reg_lm(rdd_object=house_rdd, bw=bw_ik, order=4)
reg_para_ik
plot(reg_para_ik)
## Step 0: prepare data
data(house)
house_rdd <- rdd_data(y=house$y, x=house$x, cutpoint=0)
## Step 2: regression
# Simple polynomial of order 1:
reg_para <- rdd_reg_lm(rdd_object=house_rdd)
print(reg_para)
plot(reg_para)

# Simple polynomial of order 4:
reg_para4 <- rdd_reg_lm(rdd_object=house_rdd, order=4)
reg_para4
plot(reg_para4)

# Restrict sample to bandwidth area:
bw_ik <- rdd_bw_ik(house_rdd)
reg_para_ik <- rdd_reg_lm(rdd_object=house_rdd, bw=bw_ik, order=4)
reg_para_ik
plot(reg_para_ik)

Parametric polynomial estimator of the regression discontinuity

Description

Compute a parametric polynomial regression of the ATE, possibly on the range specified by bandwidth

Usage

rdd_reg_np(
  rdd_object,
  covariates = NULL,
  bw = rdd_bw_ik(rdd_object),
  slope = c("separate", "same"),
  inference = c("np", "lm"),
  covar.opt = list(slope = c("same", "separate"), bw = NULL)
)
rdd_reg_np(
  rdd_object,
  covariates = NULL,
  bw = rdd_bw_ik(rdd_object),
  slope = c("separate", "same"),
  inference = c("np", "lm"),
  covar.opt = list(slope = c("same", "separate"), bw = NULL)
)

Arguments

`rdd_object`	Object of class rdd_data created by `rdd_data`
`covariates`	TODO
`bw`	A bandwidth to specify the subset on which the parametric regression is estimated
`slope`	Whether slopes should be different on left or right (separate), or the same.
`inference`	Type of inference to conduct: non-parametric one (`np`) or standard (`lm`). See details.
`covar.opt`	Options for the inclusion of covariates. Way to include covariates, either in the main regression (`include`) or as regressors of y in a first step (`residual`).

Value

An object of class rdd_reg_np and class lm, with specific print and plot methods

References

TODO

Examples

## Step 0: prepare data
data(house)
house_rdd <- rdd_data(y=house$y, x=house$x, cutpoint=0)
## Step 2: regression
# Simple polynomial of order 1:
reg_nonpara <- rdd_reg_np(rdd_object=house_rdd)
print(reg_nonpara)
plot(reg_nonpara)
## Step 0: prepare data
data(house)
house_rdd <- rdd_data(y=house$y, x=house$x, cutpoint=0)
## Step 2: regression
# Simple polynomial of order 1:
reg_nonpara <- rdd_reg_np(rdd_object=house_rdd)
print(reg_nonpara)
plot(reg_nonpara)

Regression Discontinuity Design

Description

Set of functions for Regression Discontinuity Design ('RDD'), for data visualisation, estimation and testing.

Bandwidth selector

Description

implements dpill

Usage

rot_bw(object)
rot_bw(object)

Arguments

object

object of class rdd_data

References

McCrary, Justin. (2008) 'Manipulation of the running variable in the regression discontinuity design: A density test,' Journal of Econometrics. 142(2): 698-714.

Examples

#No discontinuity
#No discontinuity

Transformation of the STAR dataset as used in Angrist and Pischke (2008)

Description

Transformation of the STAR dataset as used in Table 8.2.1 of Angrist and Pischke (2008)

Format

A data frame containing 5743 observations and 6 variables. The first variable is from the original dataset, all other are created by Angrist and Pischke STAT code.

schidkn: School ID in kindergarden (original variable, schoolidk in STAR)
pscore: The propensity score (computed by A & P)
classid: The id of the class (computed by A & P)
cs: Class size (computed by A & P)
female, nwhite: Various covariates (computed by A & P)

Details

). This is a transformation of the dataset from the project STAR (Student/Teacher Achievement Ratio. The full dataset is described and available in package AER, STAR. The transformed data was obtained using the STATA script krueger.do, obtained from Joshua Angrist website, on the webstar.dta.

Source

Data obtained using the script krueger.do on data webstar.rda, found on J. Angrist website

References

Krueger, A. (1999) 'Experimental Estimates Of Education Production Functions,' The Quarterly Journal of Economics, Vol. 114(2), pages 497-532, May.

Angrist, A. ad Pischke J-S (2008) Mostly Harmless Econometrics: An Empiricist's Companion, Princeton University press

Examples

data(STAR_MHE)

# Compute the group means:
STAR_MHE_means <- aggregate(STAR_MHE[, c('classid', 'pscore', 'cs')],
                            by=list(STAR_MHE$classid), mean)

# Regression of means, with weighted average:
reg_krug_gls <- lm(pscore~cs, data=STAR_MHE_means, weights=cs)
coef(summary(reg_krug_gls))[2,2]
data(STAR_MHE)

# Compute the group means:
STAR_MHE_means <- aggregate(STAR_MHE[, c('classid', 'pscore', 'cs')],
                            by=list(STAR_MHE$classid), mean)

# Regression of means, with weighted average:
reg_krug_gls <- lm(pscore~cs, data=STAR_MHE_means, weights=cs)
coef(summary(reg_krug_gls))[2,2]

Cluster Heteroskedasticity-consistent estimation of the covariance matrix.

Description

Offer a cluster variant of the usual Heteroskedasticity-consistent

Usage

vcovCluster(object, clusterVar)

vcovCluster2(object, clusterVar1, clusterVar2)
vcovCluster(object, clusterVar)

vcovCluster2(object, clusterVar1, clusterVar2)

Arguments

`object`	Object of class lm, from which rdd_reg also inherits.
`clusterVar`	The variable containing the cluster attributions.
`clusterVar1`, `clusterVar2`	The two cluster variables for the 2-cluster case.

Value

A matrix containing the covariance matrix estimate.

Author(s)

Mahmood Arai, see http://people.su.se/~ma/econometrics.html

References

Cameron, C., Gelbach, J. and Miller, D. (2011) Robust Inference With Multiway Clustering, Journal of Business and Economic Statistics, vol. 29(2), pages 238-249. #' @references Wooldridge (2003) Cluster-sample methods in applied econometrics. American Economic Review, 93, p. 133-138

Arai, M. (2011) Cluster-robust standard errors using R, Note available http://people.su.se/~ma/clustering.pdf.

Examples

data(STAR_MHE)
if(all(c(require(sandwich), require(lmtest)))){

# Run simple regression:
reg_krug <- lm(pscore~cs, data=STAR_MHE)

# Row 1 of Table 8.2.1, inference with standard vcovHC:
coeftest(reg_krug,vcov.=vcovHC(reg_krug, 'HC1'))[2,2]

# Row 4 of Table 8.2.1, inference with cluster vcovHC:
coeftest(reg_krug,vcov.=vcovCluster(reg_krug, clusterVar=STAR_MHE$classid))[2,2]
}
data(STAR_MHE)
if(all(c(require(sandwich), require(lmtest)))){

# Run simple regression:
reg_krug <- lm(pscore~cs, data=STAR_MHE)

# Row 1 of Table 8.2.1, inference with standard vcovHC:
coeftest(reg_krug,vcov.=vcovHC(reg_krug, 'HC1'))[2,2]

# Row 4 of Table 8.2.1, inference with cluster vcovHC:
coeftest(reg_krug,vcov.=vcovCluster(reg_krug, clusterVar=STAR_MHE$classid))[2,2]
}

Confint allowing vcov

Description

Version of vcov allowing for confint

Usage

waldci(x, parm = NULL, level = 0.95, vcov. = NULL, df = NULL, ...)
waldci(x, parm = NULL, level = 0.95, vcov. = NULL, df = NULL, ...)

Arguments

`x`	Object of class lm or else
`parm`	specification of which parameters are to be given confidence intervals, see confint
`level`	the confidence level required, see confint()
`vcov.`	Specific covariance function to pass to coeftest. See help of sandwich
`df`	Degrees of freedom
`...`	Further arguments

Package 'rddtools'

Help Index

Convert a rdd object to lm

Description

Usage

Arguments

Value

See Also

Examples

Convert an rdd_reg object to a npreg object

Description

Usage

Arguments

Details

Value

See Also

Examples

Post-inference for clustered data

Description

Usage

Arguments

Value

References

See Also

Examples

Testing for balanced covariates: equality of distribution

Description

Usage

Arguments

Value

Author(s)

See Also

Examples

Testing for balanced covariates: equality of means with t-test

Description

Usage

Arguments

Value

Author(s)

See Also

Examples

Run the McCracy test for manipulation of the forcing variable

Description

Usage

Arguments

Examples

Generate Monte Carlo simulations of Imbens and Kalyanaraman

Description

Usage

Arguments

Value

Examples

Dataset used in Lee (2008)

Description

Format

Source

References

Examples

INDH data set

Description

Format

References

Examples

Plot rdd_data

Description

Usage

Arguments

Details

Value

Author(s)

Examples

Bin plotting

Description

Usage

Arguments

Value

References

Draw a (density) plot of placebo tests

Description

Usage

Convert an rdd_reg object to a `npreg` object