The Relative Performance of Different Software Packages for Estimating Multilevel Models
In this section we compare Sabre, Stata, gllamm (Stata), IGLS (MLwiN) and MCMC (MLwiN). The packages use different procedures to estimate multilevel models: Sabre, Stata and gllamm (Stata) use Gaussian quadrature, while MLwiN provides several alternative estimation procedures, including MQL/PQL (Breslow and Clayton, 1993) and MCMC Gibbs sampling. In MQL/PQL, Taylor expansions are used to linearise the relationship between the responses and the linear predictors. There are two dimensions to this comparison: (1) the time taken to estimate the model and (2) the numerical properties of the estimates.
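As a concrete illustration of the quadrature approach, the sketch below approximates the marginal log-likelihood of a two-level random-intercept logit model by ordinary Gauss-Hermite quadrature. It is a minimal sketch only, not Sabre's (or Stata's or gllamm's) actual implementation; the function and variable names are ours.

```python
# Minimal sketch (not Sabre's implementation): Gauss-Hermite quadrature
# for the marginal log-likelihood of a two-level random-intercept logit model.
import numpy as np
from numpy.polynomial.hermite import hermgauss

def marginal_loglik(beta, sigma, y, X, cluster, n_points=12):
    """Approximate sum_i log ∫ prod_j p(y_ij | x_ij, u) N(u; 0, sigma^2) du."""
    beta = np.asarray(beta, dtype=float)
    z, w = hermgauss(n_points)            # nodes/weights for ∫ e^{-z^2} f(z) dz
    u = np.sqrt(2.0) * sigma * z          # change of variable u = sqrt(2)*sigma*z
    loglik = 0.0
    for g in np.unique(cluster):
        Xg, yg = X[cluster == g], y[cluster == g]
        # linear predictor at each quadrature node: shape (n_j, Q)
        eta = Xg @ beta[:, None] + u[None, :]
        p = 1.0 / (1.0 + np.exp(-eta))
        # Bernoulli likelihood of the whole cluster at each node
        lik_q = np.prod(np.where(yg[:, None] == 1, p, 1.0 - p), axis=0)
        loglik += np.log(np.sum(w * lik_q) / np.sqrt(np.pi))
    return loglik
```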
Software comparisons
There are various comparisons of the different procedures and software packages in the literature:
1. Rabe-Hesketh et al. (2004, Table 9.2) compare gllamm (quadrature, 12 points) with MLwiN (MQL-2; PQL did not converge).
2. Browne and Draper (2006, Table 8) compare MLwiN MQL-1 and PQL-2 with MCMC Gibbs sampling under gamma and uniform priors.
3. Rodriguez and Goldman (2001, Table 1) compare MQL-1, MQL-2 and PQL-2.
However, we were unable to find a comparison of quadrature-based methods with Gibbs sampling.
For our comparisons we use a mixture of empirical examples and simulations of different sizes and model complexity from the multilevel modelling literature. They can all be downloaded from this site. In all comparisons we use the default or recommended starting values of the different procedures. We report only the serial Sabre results in these comparisons.
Computational time
The Sabre, Stata and gllamm (Stata) models were fitted on a high-performance computing (HPC) cluster, while the MLwiN models were fitted on a PC. The HPC execution nodes are 124 Sun Fire X4100 servers, each with two dual-core 2.4 GHz Opteron CPUs, giving 4 CPU cores per node. The standard memory per node is 8 GB, with a few nodes offering 16 GB. Most nodes also offer dedicated inter-processor communication in the form of SCore over gigabit Ethernet, to support message-passing (parallel) applications.
The PC we used has an AMD Athlon 64 Processor 3400+ (990 MHz) and 480 MB of RAM with Physical Address Extension, running Microsoft Windows XP Professional (SP2), with a 10.0 GB system disk (C:) and a 51.5 GB data disk (D:).
To assess the relative performance of the HPC and the PC we ran several Sabre programs on both machines; we found the HPC to be about twice as fast as the PC. The times we report in the relative performance table are the actual CPU times.
The commands, data and results for each example are in a downloadable zip file.
Sabre is always faster than MCMC and gllamm. For linear models estimated using Stata this is not a fair comparison, as Stata uses the closed form of the likelihood, i.e. it does not use quadrature, so it will always be faster. The bigger the data set, or the more complex the model, the better the relative performance of serial Sabre. In the Rodriguez and Goldman (1995, 2001) simulated data example, Sabre was 48.8 times faster than MCMC; scaling this by the relative performance of the two computers (the HPC being about twice as fast as the PC) reduces it to 24.4 times faster, as the one-line check after this paragraph shows. We regard all the examples above as small and medium sized.
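The scaling is just the observed speed ratio divided by the approximate hardware speed ratio; a minimal check, assuming the factor of two reported above:

```python
# Hardware-adjusted speedup: observed Sabre (HPC) vs MCMC (PC) ratio,
# divided by the approximate HPC-to-PC speed ratio reported above.
observed_speedup = 48.8
hpc_to_pc_ratio = 2.0
adjusted_speedup = observed_speedup / hpc_to_pc_ratio
print(adjusted_speedup)  # 24.4
```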
We are in the process of comparing the performance of the different systems for multiprocess multilevel models on several large data sets, and will report those results here in the near future.
Numerical properties of the estimates
The results for each software package (estimation procedure) are presented in a table for each example. For linear and nonlinear models, the quadrature-based methods, where available (Sabre, Stata and gllamm (Stata)), give the same estimates and standard errors.
For linear models, the MLwiN IGLS procedure gives the same answers as the quadrature-based methods. The MCMC procedure sometimes gives slightly different answers, but these are essentially the same. For nonlinear models, the MCMC and Sabre procedures often give similar results. There are, however, one or two exceptions; for instance, the MCMC estimates on the FILLED-B data set (complementary log-log link) differ rather more from those of the quadrature-based methods, as do those of IGLS. The best way to assess the numerical behaviour of the different software packages and estimation procedures is on simulated data, since in that case we know what the correct answer should be. We use the first 25 simulated data sets from Rodriguez and Goldman (2001). The sample means of the IGLS (PQL-2) estimates are much worse than those of Sabre, gllamm and MCMC. The mean squared errors of the Sabre and gllamm estimates are slightly better than those of MCMC. The coverage of Sabre and gllamm is much better than that of IGLS and just slightly better than that of MCMC. However, the slight difference between the quadrature-based methods and MCMC may disappear with a larger set of simulations or with different models.
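As a rough indication of how these simulation summaries are computed, the sketch below calculates bias, mean squared error and nominal 95% coverage from a set of replicate estimates and standard errors. It is a generic illustration, not code from any of the packages above; the function and variable names are ours.

```python
# Illustrative sketch (our own code, not from any of the packages compared):
# bias, mean squared error and 95% coverage across simulation replicates.
import numpy as np

def simulation_summary(estimates, std_errors, true_value):
    """estimates, std_errors: arrays of length R, one entry per replicate."""
    estimates = np.asarray(estimates, dtype=float)
    std_errors = np.asarray(std_errors, dtype=float)
    bias = estimates.mean() - true_value
    mse = np.mean((estimates - true_value) ** 2)
    lower = estimates - 1.96 * std_errors
    upper = estimates + 1.96 * std_errors
    coverage = np.mean((lower <= true_value) & (true_value <= upper))
    return {"mean": estimates.mean(), "bias": bias, "mse": mse, "coverage": coverage}
```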
These comparisons on empirical and simulated data suggest that Sabre is a good system to use in parallel for large and complex models. The numerical properties of Sabre's estimates compare favourably with those of the alternatives, and it has the best overall computational speed.

References
Breslow, N.E. and Clayton, D. (1993), Approximate inference in generalised linear mixed models. Journal of the American Statistical Association, 88, 9-25.
Browne, W.J. and Draper, D. (2006), A comparison of Bayesian and likelihood-based methods for fitting multilevel models. To appear (with discussion) in Bayesian Analysis.
Gelman, A., Carlin, J.B., Stern, H.S. and Rubin, D.B. (2003), Bayesian Data Analysis, 2nd Edition. Chapman and Hall/CRC.
Rabe-Hesketh, S., Skrondal, A. and Pickles, A. (2004), GLLAMM Manual. U.C. Berkeley Division of Biostatistics Working Paper Series, Working Paper 160. Downloadable from http://www.gllamm.org/docum.html
Rodriguez, G. and Goldman, N. (1995), An assessment of estimation procedures for multilevel models with binary responses. Journal of the Royal Statistical Society, A, 158, 73-89.
Rodriguez, G. and Goldman, N. (2001), Improved estimation procedures for multilevel models with binary response: a case study. Journal of the Royal Statistical Society, A, 164, 339-355.