Example L4 Binary response model

 

We analyse a version of the NLSY data used in various Stata manuals (to illustrate the xt commands). The data are for young women who were aged 14-26 in 1968. The women were surveyed each year from 1970 to 1988, except for 1974, 1976, 1979, 1981, 1984 and 1986. We have removed records with missing values on one or more of the response and explanatory variables we want to use in our analysis of the joint determinants of wages and trade union membership. In this example we model trade union membership. There are 4132 women (idcode), each with between 1 and 12 years of observation in which she was in employment (i.e. not in full-time education) and earning more than $1/hour but less than $700/hour.

 

Reference

 

Stata Longitudinal/Panel Data Reference Manual, Release 9 (2005). Stata Press, StataCorp LP, College Station, Texas.

 

Data description

 

Number of observations (rows): 18995

Number of variables (columns): 20

 

The subset of variables we use is:

 

ln_wage=ln(wage/GNP deflator) in a particular year;

black=1 if woman is black, 0 otherwise;

msp=1 if woman is married and spouse is present, 0 otherwise;

grade=years of schooling completed (0-18);

not_smsa=1 if woman was living outside a standard metropolitan statistical area (smsa), 0 otherwise;

south=1 if the woman was living in the South, 0 otherwise;

union=1 if a member of a trade union, 0 otherwise;

tenure=job tenure in years (0-26);

age=respondent's age;

age2=age*age.

 

 

 

 

The first few lines of nls.dat look like

 

 

 

Sabre commands

 

out union.log

trace union.trace

data idcode year birth_yr age race msp nev_mar grade collgrad not_smsa &

     c_city south union ttl_exp tenure ln_wage black age2 ttl_exp2 tenure2

read nls.dat

case idcode

yvar union

link p

constant cons

lfit age age2 black msp grade not_smsa south cons

dis m

dis e

fit age age2 black msp grade not_smsa south cons

dis m

dis e

stop
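As a rough sketch of what the two fitting commands estimate, going by the "Standard probit" and "Gaussian random effects" labels in the log: lfit fits a pooled (homogeneous) probit, and fit adds a woman-specific Gaussian random intercept, with case idcode identifying the women. The symbols x_it, beta, u_i and sigma are notation introduced here for illustration, not Sabre's own; sigma is assumed to correspond to the parameter the log reports as scale.

```latex
\begin{aligned}
\texttt{lfit}:\quad & \Pr(\mathrm{union}_{it}=1 \mid x_{it}) = \Phi(x_{it}'\beta), \\
\texttt{fit}:\quad  & \Pr(\mathrm{union}_{it}=1 \mid x_{it}, u_i) = \Phi(x_{it}'\beta + u_i),
                      \qquad u_i \sim N(0,\sigma^2), \\
                    & L_i(\beta,\sigma) = \int_{-\infty}^{\infty}
                      \prod_{t} \Phi(x_{it}'\beta + u)^{y_{it}}
                      \bigl[1-\Phi(x_{it}'\beta + u)\bigr]^{1-y_{it}}
                      \,\frac{1}{\sigma}\,\phi\!\left(\frac{u}{\sigma}\right) du .
\end{aligned}
```

Here i indexes women, t indexes the years observed for woman i, y_it is the union indicator, and x_it contains age, age2, black, msp, grade, not_smsa, south and the constant. The integral in L_i has no closed form, so it has to be evaluated numerically.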

 

 

 

Sabre log file

 

<S> trace union.trace

<S> data idcode year birth_yr age race msp nev_mar grade collgrad not_smsa &

<S>      c_city south union ttl_exp tenure ln_wage black age2 ttl_exp2 tenure2

<S> read nls.dat

 

      18995 observations in dataset

 

<S> case idcode

<S> yvar union

<S> link p

<S> constant cons

<S> lfit age age2 black msp grade not_smsa south cons

 

    Iteration       Log. lik.       Difference

    __________________________________________

        1          -13166.331

        2          -9993.4445        3173.

        3          -9936.0591        57.39

        4          -9935.7612       0.2979

        5          -9935.7611       0.4760E-04

 

<S> dis m

 

    X-vars            Y-var

    ______________________________

    cons              union

    age

    age2

    black

    msp

    grade

    not_smsa

    south

 

    Univariate model

    Standard probit

 

    Number of observations             =   18995

 

    X-var df           =     8

 

    Log likelihood =     -9935.7611     on   18987 residual degrees of freedom

 

<S> dis e

 

    Parameter              Estimate         Std. Err.

    ___________________________________________________

    cons                   -1.3430          0.23760

    age                    0.12788E-01      0.15521E-01

    age2                  -0.10605E-03      0.24659E-03

    black                  0.48206          0.24334E-01

    msp                   -0.20820E-01      0.21552E-01

    grade                  0.31364E-01      0.44733E-02

    not_smsa              -0.75475E-01      0.24045E-01

    south                 -0.49752          0.23085E-01

 

<S> fit age age2 black msp grade not_smsa south cons

 

    Initial Homogeneous Fit:

 

    Iteration       Log. lik.       Difference

    __________________________________________

        1          -13166.331

        2          -9993.4445        3173.

        3          -9936.0591        57.39

        4          -9935.7612       0.2979

        5          -9935.7611       0.4760E-04

 

 

    Iteration       Log. lik.         Step      End-points     Orthogonality

                                     length    0          1      criterion

    ________________________________________________________________________

        1          -7942.0802        1.0000    fixed  fixed       461.28

        2          -7654.1516        1.0000    fixed  fixed       295.89

        3          -7647.3294        1.0000    fixed  fixed       587.17

        4          -7647.1214        1.0000    fixed  fixed       366.94

        5          -7647.1026        1.0000    fixed  fixed       626.53

        6          -7647.1002        1.0000    fixed  fixed       652.15

        7          -7647.0997        1.0000    fixed  fixed       260.38

        8          -7647.0997        1.0000    fixed  fixed

 

<S> dis m

 

    X-vars            Y-var             Case-var

    ________________________________________________

    cons              union             idcode

    age

    age2

    black

    msp

    grade

    not_smsa

    south

 

    Univariate model

    Standard probit

    Gaussian random effects

 

    Number of observations             =   18995

    Number of cases                    =    4132

 

    X-var df           =     8

    Scale df           =     1

 

    Log likelihood =     -7647.0997     on   18986 residual degrees of freedom

 

<S> dis e

 

    Parameter              Estimate         Std. Err.

    ___________________________________________________

    cons                   -2.5916          0.38587

    age                    0.22417E-01      0.23566E-01

    age2                  -0.22314E-03      0.37641E-03

    black                  0.82324          0.68871E-01

    msp                   -0.71011E-01      0.40905E-01

    grade                  0.69085E-01      0.12453E-01

    not_smsa              -0.13402          0.59397E-01

    south                 -0.75488          0.58043E-01

    scale                   1.4571          0.35516E-01
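One way to read the table above is through estimate/standard error ratios and the intra-class correlation implied by scale. The Python sketch below is illustrative only: the numbers are copied from the table, the helper names are hypothetical, and the correlation formula assumes that scale is the standard deviation of the random intercept and that the probit error has unit variance.

```python
# Estimates and standard errors copied from the "dis e" table of the
# random effects probit (fit) above.
coefficients = {
    "cons": (-2.5916, 0.38587),
    "age": (0.22417e-01, 0.23566e-01),
    "age2": (-0.22314e-03, 0.37641e-03),
    "black": (0.82324, 0.68871e-01),
    "msp": (-0.71011e-01, 0.40905e-01),
    "grade": (0.69085e-01, 0.12453e-01),
    "not_smsa": (-0.13402, 0.59397e-01),
    "south": (-0.75488, 0.58043e-01),
}
scale = 1.4571  # reported separately; not treated as a regression coefficient


def z_ratio(estimate, std_err):
    """Estimate divided by its standard error (hypothetical helper)."""
    return estimate / std_err


def intraclass_correlation(sd_random_intercept):
    """Share of latent variance due to the woman-level effect.

    Assumption: the argument is the random-intercept standard deviation
    and the probit error variance is fixed at 1.
    """
    variance = sd_random_intercept ** 2
    return variance / (1.0 + variance)


z = {name: z_ratio(b, se) for name, (b, se) in coefficients.items()}
rho = intraclass_correlation(scale)

for name, value in z.items():
    print(f"{name:10s} z = {value:7.2f}")
print(f"rho = {rho:.3f}")
```

Under these assumptions rho is about 0.68, and the ratios for age, age2 and msp are below 1.96 in absolute value while those for black, grade, not_smsa and south are above it.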

 

<S> stop
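The two log likelihoods in the log can be compared with a likelihood-ratio statistic. The sketch below is not part of the Sabre output; it assumes the homogeneous probit is nested in the random effects probit (scale = 0), and it halves the chi-squared(1) tail probability, a common convention when the null value lies on the boundary of the parameter space.

```python
import math

# Log likelihoods copied from the two "dis m" outputs in the log.
loglik_homogeneous = -9935.7611     # lfit: standard probit
loglik_random_effects = -7647.0997  # fit: Gaussian random effects

# Likelihood-ratio statistic for the one extra parameter (scale).
lr_statistic = 2.0 * (loglik_random_effects - loglik_homogeneous)


def chi2_sf_1df(x):
    """Upper tail probability of a chi-squared variable with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))


# Assumption: halve the tail probability because scale = 0 is a
# boundary value (50:50 mixture of chi-squared(0) and chi-squared(1)).
p_value = 0.5 * chi2_sf_1df(lr_statistic)

print(f"LR statistic = {lr_statistic:.4f}")
print(f"p-value      = {p_value:.3g}")
```

The statistic is roughly 4577.32 on 1 degree of freedom, so the tail probability is too small to represent in double precision; the random effects model fits far better than the homogeneous probit.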