Example L9 Bivariate (binary and continuous) response model

We analyse a version of the NLSY data as used in various Stata manuals (to illustrate the xt commands). The data are for young women who were aged 14-26 in 1968. The women were surveyed each year from 1970 to 1988, except for 1974, 1976, 1979, 1981, 1984 and 1986. We have removed records with missing values on one or more of the response and explanatory variables we want to use in our analysis of the joint determinants of wages and trade union membership. There are 4132 women (idcode) with between 1 and 12 years of observation in which they were in employment (i.e. not in full-time education) and earning more than \$1/hour but less than \$700/hour.

Reference

StataCorp (2005), Stata Longitudinal/Panel Data Reference Manual, Release 9, Stata Press, College Station, Texas.

Data description

Number of observations (rows): 37990

Number of variables (columns): 25

The subset of variables we use are:

ln_wage=ln(wage/GNP deflator) in a particular year;

black=1 if woman is black, 0 otherwise;

msp=1 if woman is married and spouse is present, 0 otherwise;

grade=years of schooling completed (0-18);

not_smsa=1 if woman was living outside a standard metropolitan statistical area (smsa), 0 otherwise;

south=1 if the woman was living in the South, 0 otherwise;

union=1 if a member of a trade union, 0 otherwise;

tenure=job tenure in years (0-26);

age=respondent's age;

age2=age*age.
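For orientation, the model that the Sabre commands in this example appear to specify can be written as follows. This is a reading of the command file and of the "Standard probit/linear" and "Gaussian random effects" lines in the log, not a formula taken from the source. The symbols $u_{1i}$, $u_{2i}$, $\sigma_\varepsilon$, $\sigma_{u1}$, $\sigma_{u2}$ and $\rho$ are notation introduced here, and matching them to the reported sigma2, scale1, scale2 and corr is an assumption.

```latex
\begin{aligned}
\Pr(\mathrm{union}_{it}=1 \mid u_{1i})
  &= \Phi\!\left(\mathbf{x}_{1it}'\boldsymbol{\beta}_1 + u_{1i}\right), \\
\mathrm{ln\_wage}_{it}
  &= \mathbf{x}_{2it}'\boldsymbol{\beta}_2 + u_{2i} + \varepsilon_{it},
  \qquad \varepsilon_{it} \sim N(0, \sigma_\varepsilon^2), \\
\begin{pmatrix} u_{1i} \\ u_{2i} \end{pmatrix}
  &\sim N\!\left(
     \begin{pmatrix} 0 \\ 0 \end{pmatrix},
     \begin{pmatrix}
       \sigma_{u1}^2 & \rho\,\sigma_{u1}\sigma_{u2} \\
       \rho\,\sigma_{u1}\sigma_{u2} & \sigma_{u2}^2
     \end{pmatrix}
   \right).
\end{aligned}
```

Here $i$ indexes women (idcode) and $t$ survey years. Following the variable lists given to lfit and fit, $\mathbf{x}_{1it}$ contains age, age2, black, msp, grade, not_smsa, south and a constant, while $\mathbf{x}_{2it}$ contains black, msp, grade, not_smsa, south, union, tenure and a constant.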

The first few lines of union-wage.dat look like

Sabre commands

out union-wage.log

trace union-wage.trace

data ij r idcode year birth_yr age race msp nev_mar grade collgrad &

not_smsa c_city south union ttl_exp tenure ln_wage black age2 &

ttl_exp2 tenure2 y r1 r2

case idcode

yvar y

model b

rvar r

corr y

family second=g

constant first=r1 second=r2

trans r1_age r1 * age

trans r1_age2 r1 * age2

trans r1_black r1 * black

trans r1_msp r1 * msp

trans r1_grade r1 * grade

trans r1_not_smsa r1 * not_smsa

trans r1_south r1 * south

trans r2_black r2 * black

trans r2_msp r2 * msp

trans r2_grade r2 * grade

trans r2_not_smsa r2 * not_smsa

trans r2_south r2 * south

trans r2_union r2 * union

trans r2_tenure r2 * tenure

nvar first=8

lfit r1_age r1_age2 r1_black r1_msp r1_grade r1_not_smsa r1_south r1 &

r2_black r2_msp r2_grade r2_not_smsa r2_south r2_union r2_tenure r2

dis m

dis e

mass second=64

fit r1_age r1_age2 r1_black r1_msp r1_grade r1_not_smsa r1_south r1 &

r2_black r2_msp r2_grade r2_not_smsa r2_south r2_union r2_tenure r2

dis m

dis e

stop
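The trans lines above build response-specific covariates by multiplying each covariate by the indicator r1 or r2, so that a coefficient applies to one response only. The following Python sketch (not Sabre code) illustrates the idea under an assumed stacked layout: one row per woman, year and response, with r=1 for the union row and r=2 for the ln_wage row. The helper names stack_responses and trans, and the toy record, are hypothetical and are not taken from union-wage.dat.

```python
def stack_responses(record):
    """Turn one woman-year record into two stacked rows, one per response.

    Assumed layout (not confirmed by the source): r = 1 marks the binary
    union response and r = 2 marks the continuous ln_wage response.
    """
    rows = []
    for r, response in ((1, "union"), (2, "ln_wage")):
        row = dict(record)
        row["r"] = r
        row["y"] = record[response]      # stacked response variable
        row["r1"] = 1 if r == 1 else 0   # indicator for the first response
        row["r2"] = 1 if r == 2 else 0   # indicator for the second response
        rows.append(row)
    return rows


def trans(row, new_name, left, right):
    """Mimic `trans new_name left * right` for a single row."""
    row[new_name] = row[left] * row[right]


# Hypothetical record, for illustration only.
record = {"idcode": 1, "age": 30, "union": 1, "ln_wage": 1.5, "tenure": 2.0}
record["age2"] = record["age"] * record["age"]

stacked = stack_responses(record)
for row in stacked:
    trans(row, "r1_age", "r1", "age")
    trans(row, "r2_tenure", "r2", "tenure")

for row in stacked:
    print(row["r"], row["y"], row["r1_age"], row["r2_tenure"])
```

In this sketch r1_age is non-zero only on the union row and r2_tenure only on the ln_wage row, which is why age can enter the union equation and tenure the wage equation within a single stacked fit.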

Sabre log file

<S> trace union-wage.trace

<S> data ij r idcode year birth_yr age race msp nev_mar grade collgrad &

<S>      not_smsa c_city south union ttl_exp tenure ln_wage black age2 &

<S>      ttl_exp2 tenure2 y r1 r2

37990 observations in dataset

<S> case idcode

<S> yvar y

<S> model b

<S> rvar r

<S> corr y

<S> family second=g

<S> constant first=r1 second=r2

<S> trans r1_age r1 * age

<S> trans r1_age2 r1 * age2

<S> trans r1_black r1 * black

<S> trans r1_msp r1 * msp

<S> trans r1_not_smsa r1 * not_smsa

<S> trans r1_south r1 * south

<S> trans r2_black r2 * black

<S> trans r2_msp r2 * msp

<S> trans r2_not_smsa r2 * not_smsa

<S> trans r2_south r2 * south

<S> trans r2_union r2 * union

<S> trans r2_tenure r2 * tenure

<S> nvar first=8

<S> lfit r1_age r1_age2 r1_black r1_msp r1_grade r1_not_smsa r1_south r1 &

<S>      r2_black r2_msp r2_grade r2_not_smsa r2_south r2_union r2_tenure r2

Iteration       Log. lik.       Difference

__________________________________________

1          -21496.054

2          -18323.167        3173.

3          -18265.782        57.39

4          -18265.484       0.2979

5          -18265.484       0.4760E-04

<S> dis m

X-vars            Y-var

______________________________

r1                y

r1_age

r1_age2

r1_black

r1_msp

r1_not_smsa

r1_south

r2

r2_black

r2_msp

r2_not_smsa

r2_south

r2_union

r2_tenure

Correlated bivariate model

Standard probit/linear

Number of observations             =   37990

X-var df           =    16

Log likelihood =     -18265.484     on   37974 residual degrees of freedom

<S> dis e

Parameter              Estimate         Std. Err.

___________________________________________________

r1                     -1.3430          0.23760

r1_age                 0.12788E-01      0.15521E-01

r1_age2               -0.10605E-03      0.24659E-03

r1_black               0.48206          0.24334E-01

r1_msp                -0.20822E-01      0.21552E-01

r1_not_smsa           -0.75475E-01      0.24045E-01

r1_south              -0.49752          0.23085E-01

r2                     0.82027          0.16614E-01

r2_black              -0.10093          0.66150E-02

r2_msp                 0.50526E-03      0.57363E-02

r2_not_smsa           -0.18494          0.62495E-02

r2_south              -0.80056E-01      0.59837E-02

r2_union               0.13725          0.66379E-02

r2_tenure              0.32222E-01      0.67368E-03

sigma2                 0.37523

<S> mass second=64

<S> fit r1_age r1_age2 r1_black r1_msp r1_grade r1_not_smsa r1_south r1 &

<S>     r2_black r2_msp r2_grade r2_not_smsa r2_south r2_union r2_tenure r2

Initial Homogeneous Fit:

Iteration       Log. lik.       Difference

__________________________________________

1          -21496.054

2          -18323.167        3173.

3          -18265.782        57.39

4          -18265.484       0.2979

5          -18265.484       0.4760E-04

Iteration       Log. lik.         Step      End-points     Orthogonality

                                  length    0          1      criterion

________________________________________________________________________

1          -17409.734        1.0000    fixed  fixed       1400.5

2          -16643.572        1.0000    fixed  fixed       704.08

3          -15692.838        1.0000    fixed  fixed       2748.0

4          -13589.148        0.5000    fixed  fixed       10367.

5          -13048.845        0.5000    fixed  fixed       14555.

6          -12763.346        1.0000    fixed  fixed       33267.

7          -12572.870        1.0000    fixed  fixed      0.24578E+06

8          -12510.828        1.0000    fixed  fixed      0.27798E+06

9          -12509.685        1.0000    fixed  fixed      0.29546E+06

10          -12509.684        1.0000    fixed  fixed      0.31576E+06

11          -12509.684        1.0000    fixed  fixed

<S> dis m

X-vars            Y-var             Case-var

________________________________________________

r1                y                 idcode

r1_age

r1_age2

r1_black

r1_msp

r1_not_smsa

r1_south

r2

r2_black

r2_msp

r2_not_smsa

r2_south

r2_union

r2_tenure

Correlated bivariate model

Standard probit/linear

Gaussian random effects

Number of observations             =   37990

Number of cases                    =    4132

X-var df           =    16

Sigma df           =     1

Scale df           =     3

Log likelihood =     -12509.684     on   37970 residual degrees of freedom

<S> dis e

Parameter              Estimate         Std. Err.

___________________________________________________

r1                     -2.5917          0.38583

r1_age                 0.20746E-01      0.23576E-01

r1_age2               -0.19613E-03      0.37653E-03

r1_black               0.82344          0.68959E-01

r1_msp                -0.69428E-01      0.40873E-01

r1_not_smsa           -0.12465          0.59246E-01

r1_south              -0.75019          0.57893E-01

r2                     0.76063          0.27767E-01

r2_black              -0.76520E-01      0.11583E-01

r2_msp                -0.40225E-02      0.58159E-02

r2_not_smsa           -0.14113          0.87677E-02

r2_south              -0.75922E-01      0.86025E-02

r2_union               0.99590E-01      0.70841E-02

r2_tenure              0.28353E-01      0.64041E-03

sigma2                 0.26013          0.15133E-02

scale1                  1.4569          0.35410E-01

scale2                 0.28009          0.39974E-03

corr                   0.99571E-01      0.24923E-02

<S> stop
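The two fits in the log can be compared directly from the reported figures. The Python sketch below (not part of the Sabre session) recomputes the residual degrees of freedom, the likelihood ratio statistic between the homogeneous and random effects models, and Wald-type ratios for two parameters. All numbers are copied from the log above; the helper wald_z is introduced here for illustration. Treat the likelihood ratio statistic as indicative only, since the usual chi-squared reference distribution is approximate when variance parameters lie on the boundary under the null.

```python
# Figures copied from the Sabre log above.
n_obs = 37990
x_var_df, sigma_df, scale_df = 16, 1, 3

loglik_homogeneous = -18265.484   # lfit: no random effects
loglik_random = -12509.684        # fit: correlated Gaussian random effects

# Residual degrees of freedom reported for the random effects model.
residual_df = n_obs - (x_var_df + sigma_df + scale_df)

# The random effects model adds scale1, scale2 and corr (Scale df = 3).
lr_statistic = 2.0 * (loglik_random - loglik_homogeneous)


def wald_z(estimate, std_err):
    """Ratio of an estimate to its standard error."""
    return estimate / std_err


# Random effects estimates: effect of union on ln_wage, and corr.
z_union_on_wage = wald_z(0.99590e-01, 0.70841e-02)
z_corr = wald_z(0.99571e-01, 0.24923e-02)

print("residual df:", residual_df)
print("LR statistic on", scale_df, "extra parameters:", round(lr_statistic, 1))
print("z for r2_union:", round(z_union_on_wage, 2))
print("z for corr:", round(z_corr, 2))
```

Both ratios are large relative to conventional critical values, and the increase in log likelihood is substantial, which is consistent with retaining the random effects and their correlation in this example.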