Data preparation in Stata: Reshaping data

Sabre manual


Dunn (1992) reported data for the 12-item version of Goldberg's (1972) General Health Questionnaire for psychological distress. The questionnaire was completed by 12 students on 2 dates, 3 days apart. The data are repeated in the table below; the same data were used by Rabe-Hesketh and Skrondal (2005, exercise 1.2).


Data description


Number of observations (rows): 12
Number of variables (columns): 3




student: student identifier {1,2,...,12}
ghq1: psychological distress score at occasion 1
ghq2: psychological distress score at occasion 2




Stata dataset ghq.dta


The ghq.dta dataset contains variables ghq1 and ghq2 giving the psychological distress score for students on two separate occasions. To reshape the data from wide into long format and create a single score variable ghq, we can use


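use ghq, clear
* stack ghq1 and ghq2 into a single variable ghq; r records the occasion (1 or 2)
reshape long ghq, i(student) j(r)
* create one indicator variable per occasion (r1 and r2)
tab r, gen(r)
sort student r
rename r1 dg1
rename r2 dg2
save ghq2, replace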


These commands also create the response occasion variable r and its associated dummy variables dg1 and dg2, and save the result as ghq2.dta.
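To see what the reshape does without needing ghq.dta, the following minimal sketch applies the same steps to a toy data set. The three students and their scores are made up purely for illustration and are not Dunn's data. It also assumes that naming the dummies directly with gen(dg) is an acceptable shortcut for the tab and rename steps used on the real data.

```stata
* Toy wide-format data: made-up scores for 3 hypothetical students
clear
input student ghq1 ghq2
1 10 11
2  8  9
3 12 12
end

* One row per student-occasion; r records the occasion (1 or 2)
reshape long ghq, i(student) j(r)

* Dummy variables dg1 and dg2, one per occasion
tab r, gen(dg)

* Checks: 3 students x 2 occasions = 6 rows, and exactly one dummy is 1 in each row
assert _N == 6
assert dg1 + dg2 == 1
assert dg1 == (r == 1)

* student and r together should identify each row uniquely
isid student r

list, sepby(student)
```

If any assert or the isid check fails, Stata stops with an error, so a clean run suggests the long-format file has the layout described here.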




r: response occasion 1, 2
student: student identifier {1,2,...,12}
ghq: psychological distress score at occasion r
dg1: 1, if the response occasion is 1, 0 otherwise
dg2: 1, if the response occasion is 2, 0 otherwise


The data set was saved as ghq2.dta






This data set can now be read directly into Sabre; see, for example, Exercise L1.

