Exercise L7, duration data
In 1986, the ESRC funded the Social Change and Economic Life
Initiative (SCELI). Under this initiative, work and life histories were
collected for a sample of individuals from 6 different geographical areas in
the
Data description
Number of observations (rows): 6349
Number of variables (columns): 10
Variables:
case = respondent number
move = residential move (0=no; 1=yes)
dur = number of years since last move
mbu = marriage break-up during the year (0=no; 1=yes)
fm = first marriage during the year (0=no; 1=yes)
mar = married at the beginning of the year (0=no; 1=yes)
emp = employment at the beginning of the year (1=self employed; 2=employee; 3=not working)
age = (age-30) years
emp2 = 1 if employment at the beginning of the year is employee; 0 otherwise
emp3 = 1 if employment at the beginning of the year is not working; 0 otherwise
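The coding of emp2 and emp3 implies that self employed (emp=1) is the reference category: it is the only employment state with both indicators equal to 0. A minimal Python sketch of that recoding follows. The function name employment_dummies is hypothetical and is not part of Sabre or of the data file.

```python
def employment_dummies(emp):
    """Return (emp2, emp3) for an emp code of 1, 2 or 3.

    emp = 1 (self employed) is the reference category, so it maps to (0, 0).
    """
    if emp not in (1, 2, 3):
        raise ValueError("emp must be 1, 2 or 3")
    emp2 = 1 if emp == 2 else 0  # employee
    emp3 = 1 if emp == 3 else 0  # not working
    return emp2, emp3
```

In a model containing emp2 and emp3, each coefficient is therefore a contrast with the self employed.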
Note that the variable dur, which measures the number of years since
the last move, is endogenous, i.e. it is internally related to the
process of interest.
The first few lines of roch2.dat look like
Start Sabre and specify transcript file:
out roch.log
data case move dur mbu fm mar emp age emp2 emp3
read roch2.dat
Suggested Exercise
(1) Create quadratic (age2) and cubic (age3) terms in age to allow more flexibility
in modelling this variable (i.e. to allow for a non-linear relationship).
(2) Specify the binary response variable (move) and fit a cloglog
model with the explanatory variables age, dur, fm, mbu, mar, emp2 and emp3.
Add the age2 and age3 effects to this model. Are they significant? What does
this tell you about residential mobility?
(3) Add the case random effect to the model estimated in part (2). Is it
significant? How many quadrature points should we use to estimate this model?
Interpret your results. Can the model be simplified? How do things change
between the independent and random effects models?