Simulates data from a linear instrumental variables model with many instruments, specifically for designs in which the instruments are mutually exclusive group indicators (e.g., judges). The function allows for treatment effect heterogeneity (random coefficients) and for correlation between heterogeneity and selection.

GenData_nocov(
  S = 3,
  Het = 3,
  sigee = 1,
  sigvv = 1,
  sigexi = 0,
  sigev = 0,
  ConTE = FALSE,
  beta = 0,
  beta0 = 0,
  K = 40,
  c = 5
)

Arguments

S

Numeric. Concentration parameter \(\mu^2\), determining instrument strength.

Het

Numeric. Heterogeneity parameter, scaling the variance of the random slope \(\xi\).

sigee

Numeric. Variance of the structural error \(\varepsilon\).

sigvv

Numeric. Variance of the first-stage error \(v\).

sigexi

Numeric. Covariance between structural error \(\varepsilon\) and heterogeneity \(\xi\).

sigev

Numeric. Covariance between the structural error \(\varepsilon\) and the first-stage error \(v\), which governs endogeneity.

ConTE

Logical. If TRUE, simulates constant treatment effects (\(\xi = 0\)).

beta

Numeric. The true average treatment effect (ATE).

beta0

Numeric. The null hypothesis value for \(\beta\) (used to compute residuals \(e\)).

K

Integer. The number of instruments (groups) minus one. Total groups \(J = K + 1\).

c

Integer. The number of observations per group (balanced block size).
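
As an illustration of how K and c determine the sample layout, the following base R sketch builds the balanced group identifier and the matching indicator matrix. It is not the function's internal code: the names c_size, Z_full, and Z are hypothetical, and dropping the first indicator to obtain K instruments is an assumption about the normalization.

```r
# Hypothetical sketch of the balanced group layout implied by K and c.
K <- 40          # number of instruments (groups minus one)
c_size <- 5      # observations per group (named c_size to avoid masking base::c)

J <- K + 1                               # total number of groups
n <- J * c_size                          # total sample size
group <- rep(seq_len(J), each = c_size)  # group identifier for each observation

# One indicator column per group; each row has exactly one entry equal to 1.
Z_full <- model.matrix(~ factor(group) - 1)

# Assumed normalization: drop the first group so that K instruments remain.
Z <- Z_full[, -1, drop = FALSE]

dim(Z)        # 205 rows, 40 columns
table(group)  # every group has 5 observations
```

With the defaults K = 40 and c = 5, this gives 41 groups and 205 observations.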

Value

A data frame containing:

group

Group identifier (1 to \(J\)).

pi

True first-stage mean \(\pi_{g(i)}\) of the observation's group.

eps

Structural error \(\varepsilon\).

xi

Heterogeneity term \(\xi\).

v

First-stage error \(v\).

X

Endogenous regressor.

Y

Outcome variable.

e

Residual under the null (\(Y - X\beta_0\)).

MX

Leverage-adjusted regressor \(M X\).

Me

Leverage-adjusted residual \(M e\).

Details

The data generating process (DGP) is:

$$X_i = \pi_{g(i)} + v_i$$

$$Y_i = X_i (\beta + \xi_i) + \varepsilon_i$$

where \(g(i)\) denotes the group of observation \(i\).

The instrument coefficients \(\pi\) are drawn from a balanced design taking values \(\{-s, 0, s\}\).
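
The two equations and the three-valued design for \(\pi\) can be sketched in base R as follows. This is only an approximation of the setup under stated assumptions: the value of s, the way the values \(-s, 0, s\) are assigned to groups, and the independent standard normal errors are all hypothetical choices, since the mapping from S to s and the exact covariance structure are not spelled out here.

```r
set.seed(1)

J <- 41; c_size <- 5                     # groups and observations per group
n <- J * c_size
group <- rep(seq_len(J), each = c_size)

s <- 0.5                                 # hypothetical value; how s relates to S is not stated
pi_group <- rep(c(-s, 0, s), length.out = J)  # assumed assignment of values to groups
pi_i <- pi_group[group]                  # pi_{g(i)} for each observation

beta <- 0; beta0 <- 0
eps <- rnorm(n)                          # structural error (independent draws: an assumption)
v   <- rnorm(n)                          # first-stage error
xi  <- rnorm(n)                          # random slope deviation; set to 0 for constant effects

X <- pi_i + v                            # first stage
Y <- X * (beta + xi) + eps               # outcome with heterogeneous slope
e <- Y - X * beta0                       # residual under the null

sim <- data.frame(group, pi = pi_i, eps, xi, v, X, Y, e)
head(sim)
```

The resulting columns mirror the first eight entries of the Value section; MX and Me are omitted because the matrix \(M\) is not defined in this sketch.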

If ConTE = FALSE, the error vector \((\varepsilon, \xi, v)\) is drawn from a trivariate normal distribution. The covariance structure varies across blocks so that the correlation between \(v\) and \(\xi\) depends on the strength and direction of the instrument.
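
One common way to draw such a trivariate normal vector in base R is through a Cholesky factor of the covariance matrix, sketched below for a single block. The numeric values, the names sigxixi and sigxiv, and the ordering \((\varepsilon, \xi, v)\) are assumptions made for illustration; the block-specific covariances used by the function are not reproduced.

```r
set.seed(2)
n <- 1000

# Hypothetical parameter values for one block.
sigee <- 1; sigvv <- 1; sigxixi <- 1
sigexi <- 0.2; sigev <- 0.3; sigxiv <- 0.4

# Covariance matrix ordered as (eps, xi, v).
Sigma <- matrix(c(sigee,  sigexi,  sigev,
                  sigexi, sigxixi, sigxiv,
                  sigev,  sigxiv,  sigvv),
                nrow = 3, byrow = TRUE,
                dimnames = list(c("eps", "xi", "v"), c("eps", "xi", "v")))

# chol() returns an upper-triangular R with t(R) %*% R equal to Sigma,
# so the rows of Z %*% R have covariance Sigma when Z has iid N(0, 1) entries.
R <- chol(Sigma)
Z <- matrix(rnorm(n * 3), ncol = 3)
errs <- Z %*% R
colnames(errs) <- colnames(Sigma)

round(cov(errs), 2)   # sample covariance, close to Sigma for large n
```

Letting the covariance vary across blocks would amount to repeating this draw with a different Sigma for each set of groups.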