A Handbook of Statistical Analyses Using Stata
Rabe-Hesketh S, Everitt B (2004)
ISBN 1584884045; 328 pages; £28.99, $49.95
CRC; http://www.crcpress.com/

You are probably familiar with the story of the gorillas, the bananas, and the hosepipe. But for those who need reminding, if you put five gorillas in a cage with a ladder in the middle and a bunch of bananas suspended above it, and then spray the other gorillas with water when one climbs the ladder to get the bananas, the gorillas soon learn to stop any of them who tries to climb the ladder. If you then take away one of the gorillas and replace it with a new one, the new gorilla will try to get the bananas, but will soon be stopped by the others. Take out a second gorilla and put in a second new one, and the same thing happens. Even the first new gorilla joins in, having learnt by now that climbing the ladder is just not on. Keep going, and you eventually have five gorillas who have never been hosed down but who will not go near the ladder. They all know that climbing the ladder is strictly forbidden, but none of them knows why.

The story goes that this is how 80% of company policy works: ‘we do it like this because we have always done it like this’. I suspect that the same principle operates in many pharmaceutical statisticians’ choice of software: ‘we use SAS because we have always used SAS’.

SAS is, of course, not the only software that can be used for statistical analyses, and this book provides a useful introduction to using one of its main competitors, Stata. The book has a very specific purpose: to show how various types of statistical analyses can be done using Stata. Although it gives a brief description of the statistical techniques used, the main focus is on using Stata. It is therefore a substitute neither for a general statistical text nor for the official Stata manuals, but is an excellent introduction to Stata for anyone already familiar with common statistical techniques. The authors state that they hope the book ‘will provide a useful complement to the excellent but very extensive Stata manuals’, and I believe they have succeeded in that aim.

The first chapter, entitled ‘A brief introduction to Stata’, does exactly what it says. Anyone who has not used Stata before will find this chapter quickly gets them up and running. It explains the main features of Stata, such as how to enter commands either at the keyboard or in a program file (known as a ‘do file’ in Stata), saving and using datasets, creating log files, getting help, and accessing the extensive collection of user-written add-on programs for Stata that can be freely downloaded from the internet. It also covers the basics of Stata language syntax, data-management commands, general features of statistical estimation with Stata, and drawing graphs. (The graphics in Stata, incidentally, are absolutely top-notch.)

The remaining chapters cover various types of analyses, starting from producing simple descriptive statistics and ranging through linear and logistic regression, ANOVA, generalized linear models, epidemiological techniques, and survival analysis, to more sophisticated techniques such as random-effects models, generalized estimating equations, cluster analysis, and maximum-likelihood estimation based on user-specified likelihood functions. Each chapter begins by describing a specific dataset and the desired analysis, and then shows the Stata commands needed to run the analysis and the output that Stata produces. All the datasets can be downloaded from the internet, so that the reader can replicate the analyses.

The authors have a difficult balance to strike in the level of detail they use to explain the statistical techniques, and their choice will not please everyone, but in my opinion they have done a reasonable job given the intended focus of the book. They describe the main features of each technique, including giving algebraic forms of the statistical models used, but I suspect it would be hard to follow the descriptions if you are not already at least partly familiar with the statistical methods. For example, the theory of random-effects regression is described in just over two pages. This book is therefore most useful to jobbing statisticians wishing to learn about Stata, rather than to students wishing to learn about the statistical techniques described, but doubtless it would help students to reinforce their newly learned statistical techniques if the main learning has been done elsewhere.

One particularly useful feature of the book is that the output produced by Stata is described in minute detail. Whether you simply need to know where the parameter estimates are in a simple linear regression output or need to distinguish the standard deviation of individual residuals from that of random intercepts in a random-effects regression, this book shows you where to look. This would make the book invaluable for anyone who needs to read and interpret Stata output, even if they do not run the analyses themselves.

The range of material covered means that the book will be useful not only to new users of Stata, but also to more experienced users if they are attempting one of the more advanced techniques described in the book. Even as a reasonably advanced Stata user, I have found the book useful myself on more than one occasion when interpreting some of the more sophisticated types of output.

I would unhesitatingly recommend this book to any statistician who uses Stata, but especially to those who are new to Stata. I would also recommend that any pharmaceutical statistician who feels like a change from SAS gives Stata a try. You might find that you end up with the statistician’s equivalent of a delicious bunch of bananas without being sprayed with cold water.

Adam Jacobs
Dianthus Medical Limited, UK
E-mail: [email protected]
(DOI: 10.1002/pst.194)

Copyright © 2005 John Wiley & Sons, Ltd. Pharmaceut. Statist. 2005; 4: 297–300
