Exercise FOL2 Probit model of union membership of females, Stewart (2006)

This exercise uses the union data for young US women from the National
Longitudinal Survey of Youth (NLSY), taken from the Stata manual
(http://www.stata-press.com/data/r9/union.dta). We use the same subsample that
Stewart (2006) used to illustrate his Stata program (redprob). To form this
subsample, Stewart (2006) uses only data from 1978 onwards, drops the data for
1983, and keeps only those individuals observed in each of the remaining 6
waves. This gives a balanced panel of N = 799 individuals, each observed in 6
waves. The observations for 1985 and 1987 are implicitly treated as if they
were for 1984 and 1986 respectively, which gives 6 waves at regular 2-year
intervals. Trade union membership is determined by the question of whether or
not the sampled individual had her wage set in a collective bargaining
agreement.
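
The subsample selection described above can be sketched in a few lines of Python. This is only an illustration, not Stewart's code: the function name balanced_subsample is hypothetical, the rows are assumed to be dictionaries with idcode and year keys, and year is assumed to be stored as a two-digit number (78 for 1978), which is consistent with t0=year-70 but is not confirmed here.

```python
from collections import defaultdict


def balanced_subsample(rows, first_year=78, dropped_year=83):
    """Keep waves from first_year onwards except dropped_year, then keep
    only individuals observed in every remaining wave.

    Assumes two-digit years (78 means 1978); this coding is an assumption.
    """
    # Step 1: restrict to 1978 onwards and drop the 1983 wave.
    kept = [r for r in rows
            if r["year"] >= first_year and r["year"] != dropped_year]

    # Step 2: the remaining waves are whatever years are left in the data.
    waves = {r["year"] for r in kept}

    # Step 3: record which waves each individual was observed in.
    seen = defaultdict(set)
    for r in kept:
        seen[r["idcode"]].add(r["year"])

    # Step 4: keep individuals observed in all remaining waves.
    complete = {i for i, years in seen.items() if years == waves}
    return sorted((r for r in kept if r["idcode"] in complete),
                  key=lambda r: (r["idcode"], r["year"]))
```

Applied to the full union data, a selection of this kind should leave 799 individuals with 6 rows each if it matches the construction described above.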

References

Stewart, M.B. (2006), -redprob-: A Stata program for the Heckman estimator of
the random effects dynamic probit model,
http://www2.warwick.ac.uk/fac/soc/economics/staff/faculty/stewart/stata/redprobnote.pdf

Conditional analysis

Data description (unionred1.dat)

Number of observations = 3995

Number of cases = 799

The variables include the following:

idcode=NLSY subject identifier code

year=interview year

age=age in current year

grade=years of schooling completed

not_smsa=1 if living outside a standard metropolitan statistical area, 0 otherwise

south=1 if south, 0 otherwise

union=1 if wage is collectively negotiated, 0 otherwise

t0=year-70

southXt=1 if resident in south, 0 otherwise

black=1
if race black, 0 otherwise

tper=panel wave

lagunion=the value of union in the previous interval

d=2 for all responses, as all responses are post baseline

d1=0 for all responses, as all responses are post baseline

d2=1 for all responses, as all responses are post baseline

baseunion=1 if union=1 in 1978, 0 otherwise
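
As an illustration of how the derived variables relate to the raw panel, the following Python sketch builds tper, lagunion, baseunion and the constant indicators d, d1 and d2 from a balanced panel, and drops the baseline row for each individual. The function name add_dynamic_variables is hypothetical, and numbering the waves so that the baseline (1978) wave is tper=1 is an assumption; the data file may number them differently.

```python
def add_dynamic_variables(rows):
    """Return the post-baseline rows of a balanced panel with tper,
    lagunion, baseunion, d, d1 and d2 added.

    rows: list of dicts with idcode, year and union keys.
    Wave numbering (baseline wave = 1) is an assumption.
    """
    # Group each individual's rows in time order.
    by_id = {}
    for r in sorted(rows, key=lambda r: (r["idcode"], r["year"])):
        by_id.setdefault(r["idcode"], []).append(r)

    out = []
    for person in by_id.values():
        base = person[0]["union"]  # union status in the first (1978) wave
        # Pair every wave with the one before it; the baseline wave has no
        # predecessor, so it does not appear as a response row.
        for wave, (prev, cur) in enumerate(zip(person, person[1:]), start=2):
            out.append({**cur,
                        "tper": wave,
                        "lagunion": prev["union"],
                        "baseunion": base,
                        "d": 2, "d1": 0, "d2": 1})
    return out
```

With 6 waves per individual this leaves 5 post-baseline rows each, which agrees with the 799 x 5 = 3995 observations reported for unionred1.dat.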

The first few rows and columns of unionred1.dat look like

Exercise

1) Estimate a heterogeneous probit model (level-2 with idcode, mass 24) of
trade union membership (union), with a constant and the regressors lagged
union membership (lagunion), age, grade and southXt.

2) Add the initial condition of trade union membership in 1978 (baseunion) to
the previous model. How does the inference on the lagged response (lagunion)
and the scale effects differ between the two models?
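
For orientation, the conditional model in these two questions can be written in the usual random effects probit form. The notation below is an assumption introduced for illustration and is not taken from the software documentation: gamma is the coefficient on the lagged response (state dependence), sigma is the scale of the individual random effect u_i (unobserved heterogeneity), and delta is the coefficient on the initial condition added in question 2. The setting "mass 24" presumably refers to the number of quadrature mass points used to integrate over u_i.

```latex
% Question 1: heterogeneous probit with lagged response
\Pr(\mathit{union}_{it} = 1 \mid u_i)
  = \Phi\bigl(\beta_0 + \gamma\,\mathit{lagunion}_{it}
      + \beta_1\,\mathit{age}_{it} + \beta_2\,\mathit{grade}_{it}
      + \beta_3\,\mathit{southXt}_{it} + \sigma u_i\bigr),
  \qquad u_i \sim N(0, 1)

% Question 2: the same linear predictor with the initial condition added
\Pr(\mathit{union}_{it} = 1 \mid u_i)
  = \Phi\bigl(\beta_0 + \gamma\,\mathit{lagunion}_{it}
      + \delta\,\mathit{baseunion}_{i}
      + \beta_1\,\mathit{age}_{it} + \beta_2\,\mathit{grade}_{it}
      + \beta_3\,\mathit{southXt}_{it} + \sigma u_i\bigr)
```

Comparing the estimates of gamma and sigma across the two fits is what question 2 asks for.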

Joint analysis of the initial condition and subsequent responses

Data description (unionred.dat)

Number of observations = 4794

Number of cases = 799

The variables are the same as in unionred1.dat, except that this time the
variables d, d1 and d2 take more than one value.

d=1 for the initial response, 2 for a subsequent response

d1=1 if d=1, 0 otherwise

d2=1 if d=2, 0 otherwise
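
The coding of d, d1 and d2 in the stacked data can be sketched as follows. The function name add_response_indicators is hypothetical, and identifying the initial response by year equal to 78 (a two-digit coding of 1978) is an assumption.

```python
def add_response_indicators(rows, baseline_year=78):
    """Add d, d1 and d2 to every row of the stacked panel.

    The initial response is assumed to be the row with
    year == baseline_year (two-digit year coding assumed).
    """
    out = []
    for r in rows:
        d = 1 if r["year"] == baseline_year else 2
        out.append({**r,
                    "d": d,
                    "d1": int(d == 1),   # 1 for the initial response
                    "d2": int(d == 2)})  # 1 for a subsequent response
    return out
```

Since unionred.dat keeps the baseline row, each individual contributes 6 rows, which agrees with the 799 x 6 = 4794 observations reported above.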

The first few rows and columns of unionred.dat look like

Exercise

1) Estimate a common random effect, common scale joint probit model (mass 24)
of trade union membership (union). Include a constant in both linear
predictors, and use the d1 and d2 dummy variables to set up the linear
predictors. For the initial response use the regressors age, grade, southXt
and not_smsa. For the subsequent responses use the regressors lagged union
membership (lagunion), age, grade and southXt. What does this model suggest
about state dependence and unobserved heterogeneity?

2) Re-estimate the model allowing the scale parameters for the initial and
subsequent responses to be different. Is this a significant improvement over
the common scale parameter model?

3) To the different scale model add the initial or baseline response
(baseunion). Does this make a significant improvement to the model?
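
One common way to judge whether a more general model is a significant improvement, as asked in questions 2) and 3), is a likelihood ratio test on the two maximised log-likelihoods. The sketch below assumes the two models are nested and differ by exactly one parameter (one extra scale parameter, or one extra coefficient for baseunion), so that the statistic is referred to a chi-squared distribution with 1 degree of freedom. The function name lr_test_1df is hypothetical, and the log-likelihood values have to be read from the fitted model output; none are given here.

```python
import math


def lr_test_1df(loglik_restricted, loglik_general):
    """Likelihood ratio test for nested models differing by one parameter.

    Returns (statistic, p_value), where the p-value is the upper tail of
    a chi-squared distribution with 1 degree of freedom.
    """
    # LR statistic: twice the gain in maximised log-likelihood.
    stat = 2.0 * (loglik_general - loglik_restricted)
    # Guard against tiny negative values caused by numerical error.
    stat = max(stat, 0.0)
    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value
```

A p-value below the chosen significance level (0.05, say) would suggest that the extra parameter improves the fit. The same comparison can be made from a reported estimate and standard error of the added parameter, if the output provides them.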