Regression Analysis of Count Data, Second Edition

Students in both social and natural sciences often seek regression methods to explain the frequency of events, such as visits to a doctor, auto accidents, or new patents awarded. This book provides the most comprehensive and up-to-date account of models and methods to interpret such data. The authors have conducted research in the field for more than 25 years. In this book, they combine theory and practice to make sophisticated methods of analysis accessible to researchers and practitioners working with widely different types of data and software in areas such as applied statistics, econometrics, marketing, operations research, actuarial studies, demography, biostatistics, and quantitative social sciences. The book may be used as a reference work on count models or by students seeking an authoritative overview. Complementary material in the form of data sets, template programs, and bibliographic resources can be accessed on the Internet through the authors’ homepages. This second edition is an expanded and updated version of the first, with new empirical examples and more than two hundred new references added. The new material includes new theoretical topics, an updated and expanded treatment of cross-section models, coverage of bootstrap-based and simulation-based inference, expanded treatment of time series, multivariate and panel data, expanded treatment of endogenous regressors, coverage of quantile count regression, and a new chapter on Bayesian methods.

A. Colin Cameron is Professor of Economics at the University of California, Davis. His research and teaching interests span a range of topics in microeconometrics. He is a past director of the Center on Quantitative Social Science at UC Davis and is currently an associate editor of the Stata Journal and Journal of Econometric Methods.

Pravin K. Trivedi is Professor of Economics, University of Queensland, and Distinguished Professor Emeritus and J. H. Rudy Professor Emeritus of Economics at Indiana University, Bloomington. During his academic career, he has taught undergraduate and graduate-level econometrics in the United States, Europe, and Australia. His research interests are in microeconometrics and health economics. He served as coeditor of the Econometrics Journal from 2000 to 2007 and has been on the board of Journal of Applied Econometrics since 1988.

Professors Cameron and Trivedi are coauthors of the first edition of Regression Analysis of Count Data (Cambridge University Press, 1998), Microeconometrics: Methods and Applications (Cambridge University Press, 2005), and Microeconometrics Using Stata, Revised Edition (2010).

www.cambridge.org
© in this web service Cambridge University Press

Cambridge University Press
978-1-107-01416-9 - Regression Analysis of Count Data: Second Edition
A. Colin Cameron and Pravin K. Trivedi
Frontmatter

Econometric Society Monographs

Editors:

Rosa L. Matzkin, University of California, Los Angeles
George J. Mailath, University of Pennsylvania

The Econometric Society is an international society for the advancement of economic theory in relation to statistics and mathematics. The Econometric Society Monograph series is designed to promote the publication of original research contributions of high quality in mathematical economics and theoretical and applied econometrics.

Other titles in the series:

G. S. Maddala, Limited dependent and qualitative variables in econometrics, 9780521241434, 9780521338257
Gerard Debreu, Mathematical economics: Twenty papers of Gerard Debreu, 9780521237369, 9780521335614
Jean-Michel Grandmont, Money and value: A reconsideration of classical and neoclassical monetary economics, 9780521251419, 9780521313643
Franklin M. Fisher, Disequilibrium foundations of equilibrium economics, 9780521378567
Andreu Mas-Colell, The theory of general equilibrium: A differentiable approach, 9780521265140, 9780521388702
Truman F. Bewley, Editor, Advances in econometrics – Fifth World Congress (Volume I), 9780521467261
Truman F. Bewley, Editor, Advances in econometrics – Fifth World Congress (Volume II), 9780521467254
Herve Moulin, Axioms of cooperative decision making, 9780521360555, 9780521424585
L. G. Godfrey, Misspecification tests in econometrics: The Lagrange multiplier principle and other approaches, 9780521424592
Tony Lancaster, The econometric analysis of transition data, 9780521437899
Alvin E. Roth and Marilda A. Oliviera Sotomayor, Editors, Two-sided matching: A study in game-theoretic modeling and analysis, 9780521437882
Wolfgang Hardle, Applied nonparametric regression, 9780521429504
Jean-Jacques Laffont, Editor, Advances in economic theory – Sixth World Congress (Volume I), 9780521484596
Jean-Jacques Laffont, Editor, Advances in economic theory – Sixth World Congress (Volume II), 9780521484602
Halbert White, Estimation, inference and specification, 9780521252805, 9780521574464
Christopher Sims, Editor, Advances in econometrics – Sixth World Congress (Volume I), 9780521444590, 9780521566100
Christopher Sims, Editor, Advances in econometrics – Sixth World Congress (Volume II), 9780521444606, 9780521566094
Roger Guesnerie, A contribution to the pure theory of taxation, 9780521629560
David M. Kreps and Kenneth F. Wallis, Editors, Advances in economics and econometrics – Seventh World Congress (Volume I), 9780521589833
David M. Kreps and Kenneth F. Wallis, Editors, Advances in economics and econometrics – Seventh World Congress (Volume II), 9780521589826
David M. Kreps and Kenneth F. Wallis, Editors, Advances in economics and econometrics – Seventh World Congress (Volume III), 9780521580137, 9780521589819
Donald P. Jacobs, Ehud Kalai, and Morton I. Kamien, Editors, Frontiers of research in economic theory: The Nancy L. Schwartz Memorial Lectures, 1983–1997, 9780521632225, 9780521635387
Steinar Strom, Editor, Econometrics and economic theory in the 20th century: The Ragnar Frisch Centennial Symposium, 9780521633239, 9780521633659

Continued on page following the index

Regression Analysis of Count Data

Second Edition

A. Colin Cameron
University of California, Davis

Pravin K. Trivedi
University of Queensland
Indiana University, Bloomington, Emeritus

32 Avenue of the Americas, New York NY 10013-2473, USA

Cambridge University Press is part of the University of Cambridge.

It furthers the University’s mission by disseminating knowledge in the pursuit of education, learning, and research at the highest international levels of excellence.

www.cambridge.org

Information on this title: www.cambridge.org/9781107667273

© A. Colin Cameron and Pravin K. Trivedi 1998, 2013

This publication is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.

First edition published 1998

Second edition published 2013

Reprinted 2013

A catalog record for this publication is available from the British Library.

Library of Congress Cataloging in Publication data

Cameron, Adrian Colin.
Regression analysis of count data / A. Colin Cameron, University of California, Davis, Pravin K. Trivedi, Indiana University, Bloomington. – Second edition.
pages cm. – (Econometric society monographs)

Includes bibliographical references and index.

ISBN 978-1-107-01416-9 (hardback) – ISBN 978-1-107-66727-3 (paperback)

1. Regression analysis. 2. Econometrics. I. Trivedi, P. K. II. Title.

QA278.2.C36 2013

519.5´36–dc23 2012043350

ISBN 978-1-107-01416-9 Hardback

ISBN 978-1-107-66727-3 Paperback

Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party Internet Web sites referred to in this publication, and does not guarantee that any content on such Web sites is, or will remain, accurate or appropriate.

To Michelle and Bhavna

Contents

List of Figures
List of Tables
Preface
Preface to the First Edition

1 Introduction
  1.1 Poisson Distribution and Its Characterizations
  1.2 Poisson Regression
  1.3 Examples
  1.4 Overview of Major Issues
  1.5 Bibliographic Notes

2 Model Specification and Estimation
  2.1 Introduction
  2.2 Example and Definitions
  2.3 Likelihood-Based Models
  2.4 Generalized Linear Models
  2.5 Moment-Based Models
  2.6 Testing
  2.7 Robust Inference
  2.8 Derivation of Results
  2.9 Bibliographic Notes
  2.10 Exercises

3 Basic Count Regression
  3.1 Introduction
  3.2 Poisson MLE, QMLE, and GLM
  3.3 Negative Binomial MLE and QGPMLE
  3.4 Overdispersion Tests
  3.5 Use of Regression Results
  3.6 Ordered and Other Discrete-Outcome Models
  3.7 Other Models
  3.8 Iteratively Reweighted Least Squares
  3.9 Bibliographic Notes
  3.10 Exercises

4 Generalized Count Regression
  4.1 Introduction
  4.2 Mixture Models
  4.3 Truncated Counts
  4.4 Censored Counts
  4.5 Hurdle Models
  4.6 Zero-Inflated Count Models
  4.7 Hierarchical Models
  4.8 Finite Mixtures and Latent Class Analysis
  4.9 Count Models with Cross-Sectional Dependence
  4.10 Models Based on Waiting Time Distributions
  4.11 Katz, Double Poisson, and Generalized Poisson
  4.12 Derivations
  4.13 Bibliographic Notes
  4.14 Exercises

5 Model Evaluation and Testing
  5.1 Introduction
  5.2 Residual Analysis
  5.3 Goodness of Fit
  5.4 Discriminating among Nonnested Models
  5.5 Tests for Overdispersion
  5.6 Conditional Moment Specification Tests
  5.7 Derivations
  5.8 Bibliographic Notes
  5.9 Exercises

6 Empirical Illustrations
  6.1 Introduction
  6.2 Background
  6.3 Analysis of Demand for Health Care
  6.4 Analysis of Recreational Trips
  6.5 Analysis of Fertility Data
  6.6 Model Selection Criteria: A Digression
  6.7 Concluding Remarks
  6.8 Bibliographic Notes
  6.9 Exercises

7 Time Series Data
  7.1 Introduction
  7.2 Models for Time Series Data
  7.3 Static Count Regression
  7.4 Serially Correlated Heterogeneity Models
  7.5 Autoregressive Models
  7.6 Integer-Valued ARMA Models
  7.7 State Space Models
  7.8 Hidden Markov Models
  7.9 Dynamic Ordered Probit Model
  7.10 Discrete ARMA Models
  7.11 Applications
  7.12 Derivations
  7.13 Bibliographic Notes
  7.14 Exercises

8 Multivariate Data
  8.1 Introduction
  8.2 Characterizing and Generating Dependence
  8.3 Sources of Dependence
  8.4 Multivariate Count Models
  8.5 Copula-Based Models
  8.6 Moment-Based Estimation
  8.7 Testing for Dependence
  8.8 Mixed Multivariate Models
  8.9 Empirical Example
  8.10 Derivations
  8.11 Bibliographic Notes

9 Longitudinal Data
  9.1 Introduction
  9.2 Models for Longitudinal Data
  9.3 Population Averaged Models
  9.4 Fixed Effects Models
  9.5 Random Effects Models
  9.6 Discussion
  9.7 Specification Tests
  9.8 Dynamic Longitudinal Models
  9.9 Endogenous Regressors
  9.10 More Flexible Functional Forms for Longitudinal Data
  9.11 Derivations
  9.12 Bibliographic Notes
  9.13 Exercises

10 Endogenous Regressors and Selection
  10.1 Introduction
  10.2 Endogeneity in Recursive Models
  10.3 Selection Models for Counts
  10.4 Moment-Based Methods for Endogenous Regressors
  10.5 Example: Doctor Visits and Health Insurance
  10.6 Selection and Endogeneity in Two-Part Models
  10.7 Alternative Sampling Frames
  10.8 Bibliographic Notes

11 Flexible Methods for Counts
  11.1 Introduction
  11.2 Flexible Distributions Using Series Expansions
  11.3 Flexible Models of the Conditional Mean
  11.4 Flexible Models of the Conditional Variance
  11.5 Quantile Regression for Counts
  11.6 Nonparametric Methods
  11.7 Efficient Moment-Based Estimation
  11.8 Analysis of Patent Counts
  11.9 Derivations
  11.10 Bibliographic Notes

12 Bayesian Methods for Counts
  12.1 Introduction
  12.2 Bayesian Approach
  12.3 Poisson Regression
  12.4 Markov Chain Monte Carlo Methods
  12.5 Count Models
  12.6 Roy Model for Counts
  12.7 Bibliographic Notes

13 Measurement Errors
  13.1 Introduction
  13.2 Measurement Errors in Regressors
  13.3 Measurement Errors in Exposure
  13.4 Measurement Errors in Counts
  13.5 Underreported Counts
  13.6 Underreported and Overreported Counts
  13.7 Simulation Example: Poisson with Mismeasured Regressor
  13.8 Derivations
  13.9 Bibliographic Notes
  13.10 Exercises

A Notation and Acronyms

B Functions, Distributions, and Moments
  B.1 Gamma Function
  B.2 Some Distributions
  B.3 Moments of Truncated Poisson

C Software

References

Author Index

Subject Index

List of Figures

1.1 Frequency distributions of counts for four types of events: Doctor visits, generated Poisson data, recreational trips, and number of children
1.2 Daily data on the number of hospital admissions for asthma
3.1 Doctor visits: Poisson and NB2 predicted probabilities
4.1 Two Crossing Theorem: Negative binomial compared with the Poisson
4.2 Two-component mixture of Poissons
5.1 Takeover bids: Residual plots
6.1 Histograms for the four count variables
6.2 OFP visits: Directional gradients
6.3 Comparison of the fitted means for the two latent classes
6.4 Component densities from the FMNB-2 NB1 model
7.1 Strikes (left axis) and output (right axis) per month
7.2 Actual and predicted strikes from a static regression model
7.3 Strikes: Actual and predicted strikes from a dynamic regression model
7.4 Histogram and time series of the number of trades
8.1 Samples from Clayton and Gumbel copulas with Poisson marginals
11.1 Fitted values from four models plotted against actual value
12.1 The first 100 retained draws (first panel) and the first 2,000 retained draws (second panel) of β2 from the Metropolis algorithm
12.2 Estimate of posterior density of β2 from 10,000 retained draws of β2 from the Metropolis algorithm

List of Tables

1.1 Joint frequency distribution of emergency room visits and hospitalizations
3.1 Doctor visits: Actual frequency distribution
3.2 Doctor visits: Variable definitions and summary statistics
3.3 Doctor visits: Poisson QMLE with different standard error estimates
3.4 Doctor visits: NB2 and NB1 model estimators and standard errors
3.5 Simulation: Estimators under alternative dgps
3.6 Doctor visits: Poisson QMLE mean effects and scaled coefficients
3.7 Doctor visits: Alternative estimates and t ratios
4.1 Selected mixture models for count data
4.2 Patents: Frequencies for 1979 data
4.3 Patents: Poisson, NB2, and FMNB2(2) models with 1979 data
5.1 Takeover bids: Actual frequency distribution
5.2 Takeover bids: Variable definitions and summary statistics
5.3 Takeover bids: Poisson QMLE with robust sandwich standard errors and t ratios
5.4 Takeover bids: Descriptive statistics for various residuals
5.5 Takeover bids: Correlations of various residuals
5.6 Takeover bids: Poisson MLE predicted and actual probabilities
5.7 Takeover bids: Information criteria for alternative parametric models
6.1 OFP visits: Actual frequency distribution (n = 4,406)
6.2 OFP visits: Variable definitions and summary statistics
6.3 OFP visits: Likelihood ratio tests
6.4 OFP visits: Information criteria (AIC and BIC)
6.5 OFP visits: FMNB-2 NB1 model, actual and fitted distributions, and goodness-of-fit test
6.6 OFP visits: FMNB-2 NB1 model fitted means and variances
6.7 OFP visits: FMNB-2 NB1 model estimates and t-ratios
6.8 OFP visits: NB2 hurdle model estimates and t-ratios
6.9 Recreational trips: Actual frequency distribution
6.10 Recreational trips: Variable definitions and summary statistics
6.11 Recreational trips: Poisson and NB2 model estimates and t-ratios
6.12 Recreational trips: Finite mixture model estimates and t-ratios
6.13 Recreational trips: With zeros and hurdle model estimates and t-ratios
6.14 Recreational trips: Actual and fitted cumulative frequencies
6.15 Number of children: Actual and fitted frequency distribution for Swiss data (n = 1,878)
6.16 Number of children: Actual and fitted frequency distribution for British data (n = 6,782)
6.17 Swiss fertility data: Fit of various models
6.18 Rejection frequencies at nominal 10% significance level
7.1 Strikes: Variable definitions and summary statistics
7.2 Strikes: Poisson QMLE with heteroskedastic and autocorrelation robust standard errors
7.3 Strikes: Residuals autocorrelations and serial correlation tests
7.4 Strikes: Actual and fitted frequency distributions of Poisson and NB2 regressions
7.5 Strikes: Zeger-Qaqish autoregressive model estimates and diagnostics
7.6 Stock trades: Static and dynamic model estimates
8.1 Copulas: Five leading examples
8.2 Orthogonal polynomials: First and second order
8.3 ML estimates of bivariate negative binomial model and NLSUR estimates
9.1 Patents awarded: Actual frequency distribution
9.2 Patents: Pooled and fixed effects estimates
9.3 Patents: Random effects estimates
9.4 Moment conditions for selected panel count models
9.5 Patents: Dynamic model pooled and random effects estimates
9.6 Patents: Dynamic model correlated effects estimates
10.1 Doctor visits: One endogenous regressor
10.2 Doctor visits: Two endogenous regressors
11.1 Patents: Poisson, NB2, FMNB2(2), and NB2P1 models with 1979 data
11.2 Patents: Actual versus nonparametric prediction
11.3 Patents: Summary of various fitted means
11.4 Patents: Correlations of various fitted means
12.1 Metropolis Poisson example: Posterior summary and ML estimates
13.1 Simulation exercise: Poisson regression with one mismeasured regressor

Preface

Since Regression Analysis of Count Data was published in 1998, significant new research has contributed to the range and scope of count data models. This growth is reflected in many new journal articles, fuller coverage in textbooks, and wide interest in and availability of software for handling count data models. These developments (to which we have also contributed) have motivated us to revise and expand the first edition. Like the first edition, this volume reflects an orientation toward practical data analysis.

The revisions in this edition have affected all chapters. First, we have corrected the typographical and other errors in the first edition, improved the graphics throughout, and where appropriate we have provided a cleaner and simpler exposition. Second, we have revised and relocated material that seemed better placed in a different location, mostly within the same chapter though occasionally in a different chapter. For example, material in Chapter 4 (generalized count models), Chapter 8 (multivariate counts), and Chapter 13 (measurement errors) has been pruned and rearranged so the more mainstream topics appear earlier and the more marginal topics have disappeared altogether. For similar reasons bootstrap inference has moved to Chapter 2 from Chapter 5. Our goal here has been to improve quality of synthesis and accessibility of material to the reader. Third, the final few chapters have been reordered. Chapter 10 (endogeneity and selection) has moved up from Chapter 11. It replaces the measurement error chapter that now appears as Chapter 13. Chapter 11 now covers flexible parametric models (previously Chapter 12). And the current Chapter 12, which covers Bayesian methods, is a new addition. Fourth, we have removed material that was of marginal interest and replaced it with material of potentially greater interest, especially to practitioners. For example, as barriers to implementation of more computer-intensive methods have come down, we have liberally sprinkled illustrations of simulation-based methods throughout the book. Fifth, bibliographic notes at the end of every chapter have been refreshed to include newer references and topics. Sixth, we have developed an almost complete set of computer code for the examples in this book.

The first edition has been expanded by about 35%. This expansion reflects the addition of a new Chapter 12 on Bayesian methods as well as significant additions to most other chapters. Chapter 2 has new sections on robust inference and empirical likelihood and includes material on the bootstrap and generalized estimating equations. In Chapter 3 and throughout the book, the term “pseudo-ML” has been changed to “quasi-ML” and robust standard errors are computed using the robust sandwich form. Chapter 4 improves the coverage and discussion of how many alternative count models relate to each other. Censored, truncated, hurdle, zero-inflated, and especially finite mixture models are now covered in greater depth, with a more uniform notation, and hierarchical count models and models with cross-sectional and spatial dependence have been newly added. Chapter 5 moves up presentation of methods for discrimination among nonnested models. Chapter 6 adds a new empirical example of fertility data that poses a fresh challenge to count data modelers. The time series coverage in Chapter 7 has been expanded to include more recently developed models, and there is some rearrangement so that the most often used models appear first. The coverage of multivariate count models in Chapter 8 uses a broader and more modern range of dependence concepts and provides a lengthy treatment of parametric copula-based models. The survey of count data panel models in Chapter 9 gives greater emphasis to moment-based approaches and has a more comprehensive coverage of dynamic panels, the role of initial conditions, conditionally correlated random effects, flexible functional forms, and specification tests. Chapter 10 provides an improved exposition of models with endogeneity and selection, including consideration of latent factor and two-part models as well as simulation-based inference and control function estimators. A major new topic in Chapter 11 is quantile regression models for count data, and the coverage of semiparametric and nonparametric methods has been expanded and updated. As previously mentioned, the new Chapter 12 covers Bayesian analysis of count models, providing an entry to the world of Markov chain Monte Carlo analysis of count models. Finally, Chapter 13 provides a comprehensive survey of measurement error models for count data. As a result of the expanded coverage of old topics and appearance of new ones, the bibliography is now significantly larger and includes more than two hundred additional new references.

To emphasize its empirical orientation, the book has added many new examples based on real data. These examples are scattered throughout the book, especially in Chapters 6–12. In addition, we have a number of examples based on simulated data. Researchers, instructors, and students interested in replicating our results can obtain all the data and computer programs used to produce the results given in this book via Internet from our respective personal websites.

This revised and expanded second edition draws extensively from our jointly authored research undertaken with Partha Deb, Jie Qun Guo, Judex Hyppolite, Tong Li, Doug Miller, Murat Munkin, and David Zimmer. We thank them all. We also thank Joao Santos Silva for detailed comments on Chapter 10 and Jeff Racine for detailed comments on Chapter 11. The series editor Rosa Matzkin and an anonymous reviewer provided helpful guidance and suggestions for improvements for which we are grateful. As for the first edition, it is a pleasure to acknowledge the overall editorial direction and encouragement of Scott Parris of Cambridge University Press throughout the multiyear process of bringing the project to completion.

A. Colin Cameron
Davis, CA

Pravin K. Trivedi
Bloomington, IN

August 2012

Preface to the First Edition

This book describes regression methods for count data, where the response variable is a non-negative integer. The methods are relevant for analysis of counts that arise in both social and natural sciences.

Despite their relatively recent origin, count data regression methods build on an impressive body of statistical research on univariate discrete distributions. Many of these methods have now found their way into major statistical packages, which has encouraged their application in a variety of contexts. Such widespread use has itself thrown up numerous interesting research issues and themes, which we explore in this book.

The objective of the book is threefold. First, we wish to provide a synthesis and integrative survey of the literature on count data regressions, covering both the statistical and econometric strands. The former has emphasized the framework of generalized linear models, exponential families of distributions, and generalized estimating equations, while the latter has emphasized nonlinear regression and generalized method of moment frameworks. Yet between them there are numerous points of contact which can be fruitfully exploited. Our second objective is to make sophisticated methods of data analysis more accessible to practitioners with different interests and backgrounds. To this end we consider models and methods suitable for cross-section, time series, and longitudinal data. Detailed analyses of several data sets as well as shorter illustrations, implemented from a variety of viewpoints, are scattered throughout the book to put empirical flesh on theoretical or methodological discussion. We draw on examples from, and give references to, works in many applied areas. Our third objective is to highlight the potential for further research by discussion of issues and problems that need more analysis. We do so by embedding count data models in a larger body of econometric and statistical work on discrete variables and, more generally, on nonlinear regression.

The book can be divided into four parts. The first two chapters contain introductory material on count data and a comprehensive review of statistical methods for nonlinear regression models. Chapters 3, 4, 5, and 6 present models and applications for cross-section count data. Chapters 7, 8, and 9 present methods for data other than cross-section data, namely time series, multivariate, and longitudinal or panel data. Chapters 10, 11, and 12 present methods for common complications, including measurement error, sample selection and simultaneity, and semiparametric methods. Thus the coverage of the book is qualitatively similar to that in a complete single book on linear regression models.

The book is directed toward researchers, graduate students, and other practitioners in a wide range of fields. Because of our own background in econometrics, the book emphasizes issues arising in econometric applications. Our training and background also influence the organizational structure of the book. But areas outside econometrics are also considered. The essential prerequisite for this book is familiarity with the linear regression model using matrix algebra. The material in the book should be accessible to people with a background in regression and statistical methods up to the level of a standard first-year graduate econometrics text such as Greene’s Econometric Analysis. While basic count data methods are included in major statistical packages, more advanced analysis can require programming in languages such as Splus, Gauss, or Matlab.

Our own entry into the field of count data models dates back to the early 1980s when we embarked on an empirical study of the demand for health insurance and health care services at the Australian National University. Since then we have been involved in many empirical investigations that have influenced our perceptions of this field. We have included numerous data analytic discussions in this volume to reflect our own interest and those of readers interested in real data applications. The data sets, computer programs, and related materials used in this book will be available through Internet access to our individual web sites. These materials supplement and complement this book and will help new entrants to the field, especially graduate students, to make a relatively easy start.

We have learned much on modeling count data through collaborations with coauthors, notably Partha Deb, Shiferaw Gurmu, Per Johansson, Kajal Mukhopadhyay, and Frank Windmeijer. The burden of writing this book has been eased by help from many colleagues, coauthors, and graduate students. In particular, we thank the following for their generous attention, encouragement, help, and comments on earlier drafts of various chapters: Kurt Brannas, David Hendry, Primula Kennedy, Tony Lancaster, Scott Long, Grayham Mizon, Neil Shephard, and Bob Shumway, in addition to the coauthors already mentioned. We especially thank David Hendry and Scott Long for their detailed advice on manuscript preparation using Latex software and Scientific Workplace. The manuscript has also benefited from the comments of a referee and the series editor, Alberto Holly, and from the guidance of Scott Parris of Cambridge University Press.

Work on the book was facilitated by periods spent at various institutions. The first author thanks the Department of Statistics and the Research School of Social Sciences at the Australian National University, the Department of Economics at Indiana University in Bloomington, and the University of California in Davis for support during extended leaves to these institutions in 1995 and 1996. The second author thanks Indiana University and European University Institute, Florence, for support during his tenure as Jean Monnet Fellow in 1996, which permitted a period away from regular duties. For shorter periods of stay that allowed us to work jointly, we thank the Department of Economics at Indiana University, SELAPO at University of Munich, and the European University Institute.

Finally we would both like to thank our families for their patience and forbearance, especially during the periods of intensive work on the book. This work would not have been possible at all without their constant support.

A. Colin Cameron
Davis, CA

Pravin K. Trivedi
Bloomington, IN
