An Introduction to Multivariate Multilevel GLMs

Hello and welcome

Page 2

Introduction

• Multilevel multiprocess models provide an extremely flexible approach to the analysis of a wide array of social science data.

• Multilevel modelling allows for the analysis of dependent or clustered data where observations are nested within groups, e.g. unemployment of individuals in the same travel-to-work area.

• Most software is limited to single-equation systems; unfortunately, the social world is not this simple.

• Multiprocess modelling allows for correlations in unobservables between different responses, e.g. educational attainment and log wages.
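
As a sketch of what "correlations in unobservables between different responses" can mean, consider two responses measured on individual i in group j. The notation below (the link functions g_r, coefficients beta_r, and random effects u_rj) is assumed for illustration and is not taken from the Sabre documentation.

```latex
\begin{align*}
g_1\!\left(\mathrm{E}[y_{1ij} \mid u_{1j}]\right) &= x_{1ij}'\beta_1 + u_{1j}, \\
g_2\!\left(\mathrm{E}[y_{2ij} \mid u_{2j}]\right) &= x_{2ij}'\beta_2 + u_{2j}, \\
\begin{pmatrix} u_{1j} \\ u_{2j} \end{pmatrix} &\sim
N\!\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix},
\begin{pmatrix} \sigma_1^2 & \rho\,\sigma_1\sigma_2 \\ \rho\,\sigma_1\sigma_2 & \sigma_2^2 \end{pmatrix} \right).
\end{align*}
```

If rho = 0 the two equations can be estimated separately; a nonzero rho is what ties the responses together and calls for joint (multiprocess) estimation.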

Page 3

Introduction

• Multilevel multiprocess analyses involve variables measured at more than one level of a hierarchy.

• An obvious hierarchy in education consists of English and maths attainment (a bivariate response) for students nested in school classes, and classes nested in schools.

• Sabre 5.0 can estimate models for up to 3 simultaneous responses for clustered or panel data. Explanatory variables for the responses can include student characteristics, class and teacher characteristics, or school characteristics.

Page 4

Introduction

• Sabre 5 uses quadrature to integrate out the random or unobserved effects.

• Quadrature is flexible: it can be used with any model, whatever its form.

• It is not limited to models with analytic results, such as Poisson~gamma (NBD) or Normal~Normal.

• It can model simultaneous equation systems with combinations of response types, e.g. binary and Poisson responses.

• In our comparisons, Sabre 5 seems to outperform a range of commercial and other software systems for the same or similar models.

• The real advantage of Sabre 5 is that it can run in parallel for the analysis of large (data/model) systems on the UK GRID.
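
To make "integrating out the random effects by quadrature" concrete, here is a minimal Python sketch for a single cluster in a random-intercept logit model. It is not Sabre's implementation: the function name `cluster_marginal_likelihood`, its arguments, and the choice of Gauss-Hermite quadrature with a normal random intercept are assumptions made for illustration.

```python
import numpy as np


def cluster_marginal_likelihood(y, eta_fixed, sigma, n_points=20):
    """Marginal likelihood of one cluster in a random-intercept logit model.

    Hypothetical helper: integrates prod_i p(y_i | u) over u ~ N(0, sigma^2)
    using Gauss-Hermite quadrature.
    y         : 0/1 responses for the cluster, shape (n_obs,)
    eta_fixed : fixed-effects linear predictor x_i' beta, shape (n_obs,)
    sigma     : standard deviation of the random intercept
    """
    # Nodes and weights for integrals of the form  int f(x) exp(-x^2) dx
    nodes, weights = np.polynomial.hermite.hermgauss(n_points)
    # Change of variable u = sqrt(2) * sigma * x turns the N(0, sigma^2)
    # density into the exp(-x^2) weight, up to a factor 1 / sqrt(pi)
    u = np.sqrt(2.0) * sigma * nodes
    eta = eta_fixed[:, None] + u[None, :]          # shape (n_obs, n_points)
    p = 1.0 / (1.0 + np.exp(-eta))                 # P(y = 1 | u) at each node
    lik_given_u = np.prod(np.where(y[:, None] == 1, p, 1.0 - p), axis=0)
    return np.sum(weights * lik_given_u) / np.sqrt(np.pi)


# Example call: three binary responses in one cluster
y = np.array([1, 0, 1])
eta_fixed = np.array([0.2, -0.5, 1.0])
print(cluster_marginal_likelihood(y, eta_fixed, sigma=1.0))
```

The same pattern works for any conditional likelihood (Poisson, ordered, and so on): only the line computing `lik_given_u` changes, which is the sense in which quadrature does not depend on an analytic result.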

Page 5

Page 6

Example and Data Set                 Model           Obs         Cases at Level 2 (3)  Explan Vars  Size         Sabre (1)  Stata   gllamm   IGLS    MCMC

2 Level Models
GHQ [ghq2.dat]                       Linear          24          12                    2            1.47Kb       00'00"     00'00"  00'13"   00'00"  00'01"
GROWTH [growth.dat]                  Linear          153         51                    4            13.2Kb       00'00"     failed  00'24"   00'00"  00'02"
HSB [hsb.dat]                        Linear          7185        160                   5            1.15Mb       00'02"     00'00"  06'05"   00'02"  00'50"
WAGEPAN [wagepan.dat]                Linear          4360        545                   15           2.02Mb       00'04"     00'00"  13'40"   00'02"  00'56"
NLS_WAGE [nls.dat]                   Linear          18995       4132                  8            3.76Mb       00'09"     00'00"  51'12"   00'03"  03'37"
THAIEDUC [thaieduc2.dat]             Binary Logit    7516        356                   4            411Kb        00'00"     00'02"  00'52"   00'01"  05'53"
UNIONPAN [wagepan.dat]               Binary Logit    4360        545                   18           2.02Mb       00'01"     00'25"  08'21"   00'05"  09'39"
ESSAYS [essays2.dat]                 Binary Probit   990         198                   11           172Kb        00'00"     00'02"  00'21"   00'01"  02'11"
NLS_UNION [nls.dat]                  Binary Probit   18995       4132                  8            3.76Mb       00'03"     01'18"  16'16"   00'19"  31'48"
RESPIRATORY [respiratory2.dat]       Ordered Logit   555         111                   6            118Kb        00'00"     NA      00'53"   00'02"  02'20"
ESSAYS_ORDERED [essays_ordered.dat]  Ordered Probit  990         198                   12           182Kb        00'00"     NA      00'40"   00'01"  03'45"
ROCH [roch2.dat]                     c-log-log       6349        348                   10           669Kb        00'01"     00'15"  03'51"   00'04"  12'22"
FILLED-B [filled-b.dat]              c-log-log       11341       2374                  12           2.99Mb       00'02"     00'25"  13'25"   00'15"  25'30"
EPILEP [epilep.dat]                  Poisson         236         59                    6            23.2Kb       00'06"     00'01"  00'18"   00'02"  00'13"
VISITS [racd.dat]                    Poisson         5190        5190                  13           1.06Mb       00'01"     00'15"  02'49"   00'21"  07'33"

3 Level Models
Rodriguez and Goldman (2001) simulation study
[s3bb1.dat-s3bb25.dat]               Binary Logit    2449 (x25)  1558 (161)            4            143Kb (x25)  01'52"     NA      239'32"  01'27"  91'15"

Notes: Uses analytic marginal likelihood

Comparison 3 on the Sabre web site

Page 7

Web site http://sabre.lancs.ac.uk/

Page 8

Sabre 5.0 (Multilevel Multivariate GLMs)

• Serial and parallel versions; source code is available for download from the Sabre site

• Sabre features

– 3 levels for univariate GLMs

– 3-dimensional 2-level GLMs

• The Sabre site is still written for the stand-alone Sabre 5.0 version; it will be augmented with sabreR material soon

• Sabre uses analytical 1st and 2nd derivatives in its Newton-Raphson optimization procedures
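
The Newton-Raphson idea with analytical first and second derivatives can be sketched in Python for the simplest case, an ordinary logit model without random effects. This is an illustrative assumption only, not Sabre's code: the function name `newton_raphson_logit` and its arguments are hypothetical, and Sabre applies the same principle to the much richer quadrature-based likelihood.

```python
import numpy as np


def newton_raphson_logit(X, y, tol=1e-10, max_iter=50):
    """Maximise a logit log-likelihood by Newton-Raphson (hypothetical helper).

    X : design matrix, shape (n_obs, n_params)
    y : 0/1 responses, shape (n_obs,)
    """
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        # Analytical first derivative (score) of the log-likelihood
        gradient = X.T @ (y - p)
        # Analytical second derivative (Hessian): -X' diag(p (1 - p)) X
        hessian = -(X.T * (p * (1.0 - p))) @ X
        # Newton-Raphson update: beta_new = beta - H^{-1} g
        step = np.linalg.solve(hessian, gradient)
        beta = beta - step
        if np.max(np.abs(step)) < tol:
            break
    return beta


# Example call: intercept plus one covariate
X = np.column_stack([np.ones(6), np.arange(6.0)])
y = np.array([0.0, 0.0, 1.0, 0.0, 1.0, 1.0])
print(newton_raphson_logit(X, y))
```

Using exact derivatives rather than numerical differences avoids extra likelihood evaluations per iteration, which matters when each evaluation involves a quadrature sum over every cluster.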

Page 9

Sabre-Stata

• We have a demo version of this (not being released)

• OK for desktop Sabre 5.0 jobs, but not easily extended to submit jobs to the Grid

• Problems with Grid submission from Stata: Stata can only have 1 data set open at a time

Page 10

sabreR

• R is free software; a community of statisticians maintains the code and continuously updates the programme on a voluntary basis.

– R is extremely flexible (it has become a de facto standard among statisticians for the development of statistical software).

– The approach is strictly object oriented: everything is an object (data, matrices, results, functions, etc.) with "properties" and "methods", and is classified in "classes".

– R is also highly extensible through the use of packages, which are user-submitted libraries for specific functions or specific areas of study (now including the sabreR library).

– We will be adding libraries to enable grid job submission and monitoring of a grid Sabre job from within your desktop R environment (an interim solution is available now).

– R is more flexible than Stata.

Page 11

Tuesday, 17th July 07                                                                     Chair

13:30-13:40  Session 0 (10 mins)   Introduction to Multilevel Multiprocess Models         Rob Crouchley
13:40-15:10  Session 1 (90 mins)   Introduction to Computing System, Software: R, SABRE   Dan Grose
15:10-15:40  Session 2 (30 mins)   Linear Models I: A Two Level Model                     Damon Berridge
15:40-16:00  Break (20 mins)
16:00-16:40  Session 3 (40 mins)   Linear Models II: Random Intercept Models              Damon Berridge
16:40-17:10  Session 4 (30 mins)   Multilevel Binary Response Models                      Damon Berridge
17:10-17:30  Session 5 (20 mins)   Multilevel Ordered Response Models                     Damon Berridge
17:30-18:00  Session 6 (30 mins)   Multilevel Poisson Models                              Damon Berridge

Wednesday, 18th July 07

09:30-09:50  Session 7 (20 mins)   Multilevel Generalised Linear Models (GLMs)            Damon Berridge
09:50-10:20  Session 8 (30 mins)   Three Level GLMs                                       Damon Berridge
10:20-11:00  Session 9 (40 mins)   Multilevel Multivariate GLMs                           Rob Crouchley
11:00-11:20  Break (20 mins)
11:20-12:50  Session 10 (90 mins)  Event History Models                                   Damon Berridge
12:50-13:50  Lunch (60 mins)
13:50-15:20  Session 11 (90 mins)  State Dependence, Heterogeneity and Nonstationarity    Rob Crouchley
15:20-15:50  Break (20 mins)
15:50-16:30  Session 12 (40 mins)  Grid Taster and Questions                              Rob Allan

Thursday, 19th July 07

09:30-10:10  Session 13 (40 mins)  Introduction to Grid and NW-GRID                       John Kewley
10:10-10:40  Session 14 (30 mins)  How to Get Access, Basic Grid Security + Certificates  John Kewley
10:40-11:00  Break (20 mins)
11:00-12:00  Session 15 (60 mins)  Sabre and R on the Grid                                Dan Grose
12:00-13:00  Lunch (60 mins)
13:00-14:00  Session 16 (60 mins)  Tutorial ... gulp! ... if ready                        ALL
14:00-15:00  Session 17 (60 mins)  Future Stuff, Questions & Close                        Rob Crouchley

Title

An Introduction to Multivariate Multilevel GLMs using Sabre 5.0