
Sabre

Data preparation in Stata: Missing values


 

Raudenbush and Bhumirat (1992) analysed data on children repeating a grade during their time at primary school. The data came from a national survey of primary education in Thailand in 1988; we use a subset of those data here.

 

Reference

 

Raudenbush, S.W., Bhumirat, C., 1992. The distribution of resources for primary education and its consequences for educational achievement in Thailand. International Journal of Educational Research, 17, 143-164.

 

Data description

 

Number of observations (rows): 8582
Number of variables (columns): 5

 

Variables

 

schoolid: school identifier
sex: 1 if the child is male, 0 otherwise
pped: 1 if the child had pre-primary experience, 0 otherwise
repeat: 1 if the child repeated a grade during primary school, 0 otherwise
msesc: mean pupil socio-economic status at the school level

 

The first few lines of thaieduc.dta
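A listing of this kind can be reproduced in Stata. The following is a minimal sketch, assuming thaieduc.dta is in the current working directory; the choice of 10 rows is arbitrary, and the misstable and count commands are added here only as a convenient way to see which variables contain missing values.

```stata
* Load the data; "clear" discards any dataset currently in memory
use thaieduc, clear

* Show the first 10 observations (10 is an arbitrary choice)
list in 1/10

* Summarise which variables contain missing values
misstable summarize

* Count the observations for which msesc is missing
count if missing(msesc)
```

Note that missing(msesc) is true for the system missing value (.) and also for the extended missing values (.a to .z), whereas msesc == . matches the system missing value only.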

 

The listing shows that thaieduc.dta contains a variable, msesc, which has missing values. For models that do not use msesc, we can simply drop this variable from the dataset as follows:

 

* Remove the msesc variable and save the result as thaieduc1.dta
use thaieduc
drop msesc
save thaieduc1, replace

The resulting dataset, thaieduc1.dta, has 8,582 observations on 4 variables. For models that do use
msesc, we need to drop every observation for which msesc is missing. To do this, we can use:

 

* Remove the observations with missing msesc and save the result as thaieduc2.dta
use thaieduc
drop if msesc == .
save thaieduc2, replace

 

The resulting dataset, thaieduc2.dta, has 7,516 observations on 5 variables. It can now be read directly into Sabre; see, for example, Example C3.
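Before passing either file to Sabre, it may be worth confirming that the saved datasets match the counts quoted above. The following is a minimal sketch, assuming thaieduc1.dta and thaieduc2.dta were created in the current working directory by the commands above; each assert stops with an error if its condition does not hold.

```stata
* Check the file without msesc: all rows kept, msesc no longer present
use thaieduc1, clear
assert _N == 8582
capture confirm variable msesc
assert _rc != 0

* Check the complete-case file: fewer rows, no missing msesc values
use thaieduc2, clear
assert _N == 7516
assert !missing(msesc)

* Brief summary of the dataset now in memory
describe, short
```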
