Example L4 Binary response model

We analyse a version of the NLSY data as used in various Stata manuals (to illustrate the xt commands). The data are for young women who were aged 14-26 in 1968. The women were surveyed each year from 1970 to 1988, except for 1974, 1976, 1979, 1981, 1984 and 1986. We have removed records with missing values on one or more of the response and explanatory variables we want to use in our analysis of the joint determinants of wages and trade union membership. In this example we model trade union membership. There are 4132 women (idcode), each with between 1 and 12 years of observations recorded while in employment (i.e. not in full-time education) and earning more than \$1/hour but less than \$700/hour.

Reference

Stata Longitudinal/Panel Data Reference Manual, Release 9 (2005), Stata Press, StataCorp LP, College Station, Texas.

Data description

Number of observations (rows): 18995

Number of variables (columns): 20

The subset of variables we use is:

ln_wage=ln(wage/GNP deflator) in a particular year;

black=1 if woman is black, 0 otherwise;

msp=1 if woman is married and spouse is present, 0 otherwise;

grade=years of schooling completed (0-18);

not_smsa=1 if woman was living outside a standard metropolitan statistical area (smsa), 0 otherwise;

south=1 if the woman was living in the South, 0 otherwise;

union=1 if a member of a trade union, 0 otherwise;

tenure=job tenure in years (0-26);

age=respondent's age;

age2=age*age.

The first few lines of nls.dat look like

Sabre commands

out union.log

trace union.trace

data idcode year birth_yr age race msp nev_mar grade collgrad not_smsa &

c_city south union ttl_exp tenure ln_wage black age2 ttl_exp2 tenure2

case idcode

yvar union

constant cons

lfit age age2 black msp grade not_smsa south cons

dis m

dis e

fit age age2 black msp grade not_smsa south cons

dis m

dis e

stop
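As a reading aid, the two fitting commands above can be related to the models named in the log file that follows: lfit gives the homogeneous "Standard probit", while fit adds "Gaussian random effects" for each case (idcode). The notation below is ours rather than Sabre's: y_it is union membership for woman i in year t, x_it the explanatory variables, u_i the woman-specific effect, sigma_u its scale and T_i her number of observed years (between 1 and 12). Sabre's internal parameterisation of the scale may differ, so treat this as a sketch under those assumptions.

```latex
\begin{align*}
\text{lfit:}\quad & \Pr(y_{it}=1 \mid x_{it}) = \Phi(x_{it}'\beta) \\
\text{fit:}\quad  & \Pr(y_{it}=1 \mid x_{it}, u_i) = \Phi(x_{it}'\beta + u_i),
                    \qquad u_i \sim N(0, \sigma_u^2) \\
                  & L_i(\beta, \sigma_u) = \int_{-\infty}^{\infty}
                    \prod_{t=1}^{T_i} \Phi(x_{it}'\beta + u)^{y_{it}}
                    \{1 - \Phi(x_{it}'\beta + u)\}^{1 - y_{it}}
                    \, \frac{1}{\sigma_u} \phi\!\left(\frac{u}{\sigma_u}\right) du
\end{align*}
```

The single extra parameter sigma_u corresponds to the "Scale df = 1" line reported for the random effects model.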

Sabre log file

<S> trace union.trace

<S> data idcode year birth_yr age race msp nev_mar grade collgrad not_smsa &

<S>      c_city south union ttl_exp tenure ln_wage black age2 ttl_exp2 tenure2

18995 observations in dataset

<S> case idcode

<S> yvar union

<S> constant cons

<S> lfit age age2 black msp grade not_smsa south cons

Iteration       Log. lik.       Difference

__________________________________________

1          -13166.331

2          -9993.4445        3173.

3          -9936.0591        57.39

4          -9935.7612       0.2979

5          -9935.7611       0.4760E-04
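The first line of this iteration table can be checked by hand. A log likelihood of -13166.331 is what a binary response model gives when every fitted probability equals 0.5, which is consistent with all coefficients starting at zero (the log itself does not state the starting values, so this is an assumption). A minimal Python sketch of the arithmetic, with variable names of our own choosing:

```python
import math

# Number of observations reported in the Sabre log.
n_obs = 18995

# With all coefficients at zero, a probit gives Phi(0) = 0.5 for every
# observation, so each one contributes ln(0.5) to the log likelihood.
# (That Sabre starts from zero is an assumption, not stated in the log.)
start_loglik = n_obs * math.log(0.5)

# Compare with the value printed for iteration 1 in the log.
print(f"{start_loglik:.3f}")
```

The later iterations then show how much the explanatory variables improve on this uninformative starting point.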

<S> dis m

X-vars            Y-var

______________________________

cons              union

age

age2

black

msp

not_smsa

south

Univariate model

Standard probit

Number of observations             =   18995

X-var df           =     8

Log likelihood =     -9935.7611     on   18987 residual degrees of freedom

<S> dis e

Parameter              Estimate         Std. Err.

___________________________________________________

cons                   -1.3430          0.23760

age                    0.12788E-01      0.15521E-01

age2                  -0.10605E-03      0.24659E-03

black                  0.48206          0.24334E-01

msp                   -0.20820E-01      0.21552E-01

not_smsa              -0.75475E-01      0.24045E-01

south                 -0.49752          0.23085E-01
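A quick way to read the table above is to divide each estimate by its standard error. The Python sketch below does this for the parameters printed in the log; the 1.96 cut-off (two-sided 5% level for a standard normal) and the variable names are our choices, not part of the Sabre output. Note that these standard errors come from the homogeneous model, which treats repeated observations on the same woman as independent.

```python
# Estimates and standard errors copied from the homogeneous probit log above.
estimates = {
    "cons": (-1.3430, 0.23760),
    "age": (0.12788e-01, 0.15521e-01),
    "age2": (-0.10605e-03, 0.24659e-03),
    "black": (0.48206, 0.24334e-01),
    "msp": (-0.20820e-01, 0.21552e-01),
    "not_smsa": (-0.75475e-01, 0.24045e-01),
    "south": (-0.49752, 0.23085e-01),
}

# z-ratio = estimate / standard error
z_ratios = {name: est / se for name, (est, se) in estimates.items()}

# 1.96 is the usual two-sided 5% critical value for a standard normal.
significant = [name for name, z in z_ratios.items() if abs(z) > 1.96]

for name, z in z_ratios.items():
    print(f"{name:10s} {z:8.2f}")
print("significant at 5%:", significant)
```

On this rough criterion the age terms and msp are not distinguishable from zero in the homogeneous model, while black, not_smsa and south are.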

<S> fit age age2 black msp grade not_smsa south cons

Initial Homogeneous Fit:

Iteration       Log. lik.       Difference

__________________________________________

1          -13166.331

2          -9993.4445        3173.

3          -9936.0591        57.39

4          -9935.7612       0.2979

5          -9935.7611       0.4760E-04

Iteration       Log. lik.         Step      End-points     Orthogonality

                                  length    0          1      criterion

________________________________________________________________________

1          -7942.0802        1.0000    fixed  fixed       461.28

2          -7654.1516        1.0000    fixed  fixed       295.89

3          -7647.3294        1.0000    fixed  fixed       587.17

4          -7647.1214        1.0000    fixed  fixed       366.94

5          -7647.1026        1.0000    fixed  fixed       626.53

6          -7647.1002        1.0000    fixed  fixed       652.15

7          -7647.0997        1.0000    fixed  fixed       260.38

8          -7647.0997        1.0000    fixed  fixed

<S> dis m

X-vars            Y-var             Case-var

________________________________________________

cons              union             idcode

age

age2

black

msp

not_smsa

south

Univariate model

Standard probit

Gaussian random effects

Number of observations             =   18995

Number of cases                    =    4132

X-var df           =     8

Scale df           =     1

Log likelihood =     -7647.0997     on   18986 residual degrees of freedom
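The two log likelihoods reported so far can be compared directly. The Python sketch below computes the likelihood-ratio statistic for the random effects model against the homogeneous probit, and reproduces the residual degrees of freedom shown in the log; the variable names are ours.

```python
# Log likelihoods copied from the two "dis m" displays in the log.
loglik_homogeneous = -9935.7611     # lfit: standard probit
loglik_random_effects = -7647.0997  # fit: probit with Gaussian random effects

# Likelihood-ratio statistic: twice the gain in log likelihood.
lr_statistic = 2.0 * (loglik_random_effects - loglik_homogeneous)

# The residual degrees of freedom differ by one: the extra scale parameter.
n_obs = 18995
df_homogeneous = n_obs - 8         # X-var df = 8
df_random_effects = n_obs - 8 - 1  # X-var df = 8, Scale df = 1

print(f"LR statistic = {lr_statistic:.4f}")
print(df_homogeneous, df_random_effects)
```

Because the scale parameter lies on the boundary of its range under the null hypothesis of no random effect, the usual chi-square reference distribution with 1 degree of freedom is only approximate (a 50:50 mixture of chi-square(0) and chi-square(1) is often used instead). With a statistic of this size the conclusion is the same either way: the random effects model fits far better.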

<S> dis e

Parameter              Estimate         Std. Err.

___________________________________________________

cons                   -2.5916          0.38587

age                    0.22417E-01      0.23566E-01

age2                  -0.22314E-03      0.37641E-03

black                  0.82324          0.68871E-01

msp                   -0.71011E-01      0.40905E-01