Subtyping of juvenile idiopathic arthritis using latent class analysis



ARTHRITIS & RHEUMATISM Vol. 43, No. 7, July 2000, pp 1496–1503
© 2000, American College of Rheumatology

SUBTYPING OF JUVENILE IDIOPATHIC ARTHRITIS USING LATENT CLASS ANALYSIS

ELAINE THOMAS, JENNIFER H. BARRETT, RACHELLE P. DONN, WENDY THOMSON, TAUNTON R. SOUTHWOOD, and the BRITISH PAEDIATRIC RHEUMATOLOGY GROUP

Objective. To use statistical techniques to identify underlying subtypes of juvenile idiopathic arthritis (JIA) that best explain the observed relationships of clinical and laboratory variables, and to compare the statistically derived subtypes with those defined by the International League of Associations for Rheumatology (ILAR) criteria and examine them for HLA associations.

Methods. Information on 572 patients diagnosed as having JIA was summarized by 10 clinical and laboratory categorical variables (age at onset, large joint involvement, small joint involvement, polyarthritis, symmetric arthritis, spinal pain, fever, psoriasis, antinuclear antibodies [ANA], and rheumatoid factor). Latent class analysis (LCA) was used to identify underlying (“latent”) classes that explained the relationships among the observed variables. Statistical models incorporating 5–8 latent classes were applied to the data.

Results. The 7-class model was the most appropriate. Patterns of joint involvement and the presence of ANA were influential in determining latent classes. There was some correspondence between the latent classes and the ILAR categories, but they did not coincide completely. Significant differences between the latent classes were seen for 3 HLA haplotypes (DRB1*04-DQA1*03-DQB1*03, DRB1*13-DQA1*01-DQB1*06, and DRB1*08-DQA1*0401-DQB1*0402).

Conclusion. LCA provides a novel approach to the task of identifying homogeneous subtypes within the umbrella of JIA. In further work, the identified latent classes will be examined for associations with other candidate genes and for differences in outcome.

Juvenile idiopathic arthritis (JIA), previously termed juvenile chronic arthritis (JCA) in Europe (1) and juvenile rheumatoid arthritis in North America (2), is the most common rheumatic disease experienced in childhood. Based on data from the UK, the estimated annual incidence of JCA is 10 per 100,000 children (3), and the prevalence is 65 per 100,000 children (4). The term JIA indicates diseases of childhood onset characterized primarily by arthritis persisting for at least 6 weeks and currently having no known cause. JIA is complex, with both environmental and genetic risk factors; in particular, associations with certain HLA alleles have been established (5). It encompasses a spectrum of diseases that vary widely in onset characteristics, clinical course, associated manifestations, and ultimate outcomes. There are no diagnostic tests available for JIA; categories are based on clinical features and are highly dependent on the classification system used.

Children with persistent arthritis have previously been subdivided into 3 groups (polyarticular onset, pauciarticular onset, and systemic onset) on the basis of the number of joints affected at disease onset and a variety of extraarticular clinical features. More recently, a variety of classification systems have been proposed, including the European League Against Rheumatism criteria (1) and the American College of Rheumatology criteria (2), with additional criteria for juvenile spondylarthropathies (6) and psoriatic arthritis (7,8). The most recent proposal, the International League of Associations for Rheumatology (ILAR) classification, was formulated by consensus among pediatric rheumatologist representatives of the Leagues Against Rheumatism in 1995 (9) and modified in 1997 (10). The proposed classification was intended to begin a process of validation and modification which would eventually result in a greater understanding of persistent arthritis that begins in childhood. The ILAR classification includes 7 categories and a category termed “other,” for children who fit none of the 7 categories or who fit more than one category (10).

Supported by the Arthritis Research Campaign (ARC), UK.

Elaine Thomas, PhD, Jennifer H. Barrett, PhD, Rachelle P. Donn, PhD, Wendy Thomson, PhD: ARC Epidemiology Unit, University of Manchester, Manchester, UK; Taunton R. Southwood, FRCP: University of Birmingham, Birmingham, UK.

Address reprint requests to Elaine Thomas, PhD, Industrial and Community Health Research Centre, School of Postgraduate Medicine, Keele University, Hartshill Road, Stoke-on-Trent ST4 7NY, UK.

Submitted for publication November 23, 1999; accepted in revised form March 1, 2000.

As with previous classification systems, the ILAR criteria are based on clinicians’ perceptions of clinical disease patterns and may not define biologically unique subgroups. For research in epidemiology, genetics, outcome studies, and trials of therapies in JIA, it is essential to be able to identify homogeneous groups of patients. We have gathered detailed clinical and laboratory information on a large group of JIA patients in the UK. Our aim in the present study was to use these data to determine whether it is possible to statistically derive homogeneous subsets based on the pattern of selected features. To do this, we used the statistical method of latent class analysis (LCA) (11). Our objectives were 1) to identify underlying subtypes of JIA that best explain the observed relationship between variables (clinical and laboratory); 2) to compare the statistically derived subtypes of JIA with those defined by the ILAR criteria; and 3) to examine the statistically derived subtypes of JIA for HLA associations, as a means of assessing the biologic relevance of the resulting classes.

PATIENTS AND METHODS

Clinical data on patients in the British Paediatric Rheumatology Group National Repository for JIA were obtained. The patients were from 17 centers within the UK. For all patients a detailed form was completed, recording information on 24 clinical and 6 laboratory variables, and a 1–2-ml sample of blood was obtained for DNA extraction. The 30 variables were ranked as to their perceived importance in disease development, and the 10 most important variables, listed in Table 1, were used in the modeling procedure. Definitions of variables followed those given in the revised ILAR classification criteria (10). Briefly, large joints were defined as the hip, knee, ankle, wrist, elbow, and glenohumeral joint, with all other joints defined as small. Polyarthritis was defined as the presence of arthritis in at least 5 individual joints, and spinal pain as pain in either the lumbar, cervical, or dorsal region. Fever was classified as present if it occurred daily for at least 2 weeks, with a classic quotidian pattern. A diagnosis of psoriasis was confirmed by a dermatologist. Positivity for rheumatoid factor (RF) and antinuclear antibodies (ANA) was recorded if there were positive results on at least 2 tests 3 months apart. The age at onset of arthritis was categorized into 3 groups: <5 years, 5–9 years, and 10–16 years. All other variables were dichotomized as either present or absent. A total of 572 patients with complete data for each of these 10 factors was used in the analysis. Data on the observed prevalences of the 10 variables used to define the model, as well as on sex and ethnic origin of the patient group, are presented in Table 1.

Caucasian patients were typed for 3 HLA loci: 438 for HLA–DRB1, 421 for DQA1, and 428 for DQB1; 402 patients were typed for all 3 loci. These 3 loci were chosen because there is strong linkage disequilibrium across the major histocompatibility complex class II region, and DRB1/DQA1/DQB1 haplotypes could be inferred for subjects when data for all 3 loci were available (12). The HLA region is highly polymorphic, and the use of haplotypes reduces the problem of multiple testing inherent when examining loci individually. Three hundred sixty-seven Caucasian subjects from the UK were used as controls. The controls were from 3 sources: 118 were recruited from general practice registers for a case–control study of putative environmental risk factors for rheumatoid arthritis (13), 159 were part of a population-based survey conducted to identify possible risk factors for cancer (14), and 90 were blood donors recruited as controls for various disease studies. All 367 controls were typed for HLA–DRB1, 138 were typed for DQA1, and 155 for DQB1. HLA–DRB1 and DQB1 alleles were determined using a commercially available semiautomated polymerase chain reaction–sequence-specific oligonucleotide probe (PCR-SSOP) typing system (Inno-LiPA; Abbotts, Maidenhead, UK). DQA1 alleles were also determined by PCR-SSOP (15). In addition, 398 of the patients were typed for the presence or absence of HLA–B27, using PCR–sequence-specific primers.

Table 1. Characteristics of the study population

Characteristic                              No. (%)
Sex
  Female                                    384 (67)
  Male                                      188 (33)
Ethnicity
  Caucasian                                 531 (93)
  African/Afro-Caribbean                      6 (1)
  Asian                                      24 (4)
  Other                                       4 (1)
  Data missing                                7 (1)
Factors used in latent class analysis*
  Large joint involvement                   477 (83)
  Small joint involvement                   375 (66)
  Polyarthritis                             336 (59)
  ANA positive                              190 (33)
  Symmetric arthritis                       286 (50)
  Spinal pain                               131 (23)
  Fever                                      95 (17)
  RF positive                                41 (7)
  Psoriasis                                  35 (6)
  Age at onset
    Under 5 years                           277 (48)
    5–9 years                               163 (28)
    10–16 years                             132 (23)

* ANA = antinuclear antibody; RF = rheumatoid factor.

Statistical analysis. Latent class model. Each subject’s characteristics can be described by a symptom profile recording the absence (recorded as 0) or presence (recorded as 1) of the 9 symptoms listed in Table 1, followed by a 0, 1, or 2 depending on the age at disease onset (<5 years recorded as 0, 5–9 years as 1, 10–16 years as 2). For example, a subject with large and small joint involvement, symmetric polyarthritis, spinal pain, and psoriasis, who is both ANA negative and RF negative, has no fever, and whose disease began before the age of 5 years would have symptom profile 1110110010.
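The encoding described above can be sketched in a few lines of Python. This is an illustration only: the factor names and the helper functions `age_group` and `symptom_profile` are hypothetical labels introduced here, and the order of the nine binary factors is assumed to follow their order in Table 1.

```python
# Order of the nine binary factors, assumed to follow Table 1; age group comes last.
BINARY_FACTORS = [
    "large_joints", "small_joints", "polyarthritis", "ana_positive",
    "symmetric", "spinal_pain", "fever", "rf_positive", "psoriasis",
]


def age_group(age_at_onset_years):
    """Map age at onset to the 3 categories: 0 (<5), 1 (5-9), 2 (10-16)."""
    if age_at_onset_years < 5:
        return 0
    if age_at_onset_years < 10:
        return 1
    return 2


def symptom_profile(present, age_at_onset_years):
    """Build the 10-character profile string from the set of factors present."""
    bits = ["1" if name in present else "0" for name in BINARY_FACTORS]
    return "".join(bits) + str(age_group(age_at_onset_years))


# The worked example from the text: large and small joints, symmetric
# polyarthritis, spinal pain, psoriasis, ANA and RF negative, no fever, onset <5 years.
example = symptom_profile(
    {"large_joints", "small_joints", "polyarthritis", "symmetric",
     "spinal_pain", "psoriasis"},
    age_at_onset_years=3,
)
print(example)  # 1110110010
```

Under the assumed factor order, the sketch reproduces the profile quoted in the text.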

LCA is based on the assumption that the frequencies with which the different symptom profiles occur can be explained by a small number of mutually exclusive classes, with each class having a distinct set of item probabilities that is constant for all members of that particular class (11). A critical aspect of this assumption is that, within a class, the probabilities of different symptoms are statistically independent. This is equivalent to assuming that the latent classes explain the clustering of symptoms observed in the overall patient group. For a given latent class model, parameter estimates include class membership probabilities, γ, which may be thought of as the prevalence of the different subgroups, and symptom probabilities, ρ, which reflect the likelihood that a symptom is present in an individual, given membership in that class.
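In symbols, the within-class independence assumption can be written as below. This is the standard latent class formulation rather than a formula quoted from the paper: the indices c (class), j (factor), and k (category), and the subscript layout, are notation introduced here, with \gamma denoting the class membership probabilities and \rho the conditional symptom probabilities.

```latex
% Probability of observing symptom profile y = (y_1, ..., y_J) under a C-class model
P(\mathbf{y}) \;=\; \sum_{c=1}^{C} \gamma_c \prod_{j=1}^{J} \rho_{j,\,y_j \mid c},
\qquad
\sum_{c=1}^{C} \gamma_c = 1,
\qquad
\sum_{k} \rho_{j,\,k \mid c} = 1 \ \text{ for every factor } j \text{ and class } c .
```

Here J = 10 (nine binary factors plus the 3-category age at onset), and the product over factors is exactly where the independence within a class enters.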

Latent class models were fitted to the 10 factors listed in Table 1 using the computer program LTA (16), which implements a version of Goodman’s EM (expectation-maximization) procedure (11). Models estimating varying numbers of latent classes were compared by means of the likelihood ratio chi-square statistic, G² (17). If increasing the number of latent classes provides no better explanation of the data, then the difference in G² between 2 models (asymptotically) follows a chi-square distribution, with degrees of freedom given by the difference in the number of fitted parameters.
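The degrees of freedom involved can be illustrated with a short calculation. The parameter count used here is an assumption that is not spelled out in the text: each class is taken to contribute 9 binary symptom probabilities plus 2 free age-category probabilities, and a K-class model has K − 1 free membership probabilities. The function names are hypothetical.

```python
def n_profiles(n_binary=9, n_age_groups=3):
    """Number of distinct symptom profiles (cells of the contingency table)."""
    return 2 ** n_binary * n_age_groups


def n_free_parameters(n_classes, n_binary=9, n_age_groups=3):
    """Assumed count: (K - 1) membership probabilities + K * (9 + 2) item probabilities."""
    membership = n_classes - 1
    per_class = n_binary + (n_age_groups - 1)
    return membership + n_classes * per_class


def degrees_of_freedom(n_classes):
    """Residual degrees of freedom: cells - 1 - fitted parameters."""
    return n_profiles() - 1 - n_free_parameters(n_classes)


for k in range(5, 9):
    print(k, n_free_parameters(k), degrees_of_freedom(k))
```

With this counting, each additional class adds 12 parameters, and the residual degrees of freedom for the 5-class to 8-class models come out as 1,476, 1,464, 1,452, and 1,440, which agrees with the values reported in Table 2.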

Assigning individuals to the latent classes. LCA does not separate patients into distinct categories and so is not directly a classification system. However, for any particular symptom profile, the posterior probability of membership in each of the latent classes can be calculated, based on the estimated parameters from the latent class solution. This is done by a simple application of Bayes’ rule, as detailed in Appendix A.

The probability of membership in each of the latent classes was calculated for each observed symptom profile. Subjects were assigned to a particular class if they belonged to that class with a probability of ≥0.7. This cutoff ensures that the patient is more than twice as likely to be in this class as in any other. Those who could not be assigned to a single class were termed “unclassifiable.”

Comparison of latent class solution and ILAR classification. A cross-tabulation of the latent class grouping and ILAR classification was generated. This comparison allowed us to examine the correspondence between results obtained with the 2 methods of subtyping.

HLA associations. Haplotype frequencies were calculated for the JIA patients and were compared across the derived latent classes. Differences were assessed using Pearson’s chi-square test, with no adjustment for multiple testing. Where significant differences were recorded (P < 0.05), haplotype frequencies in the individual classes were compared with those in the control group. Associations were expressed in terms of odds ratios and 95% confidence intervals, using the Cornfield approximation (18). HLA association analyses were conducted using the Stata statistical software package (19).
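As a rough illustration of how an odds ratio and its interval are obtained from a 2 × 2 table of haplotype carriers, the sketch below uses the simpler Woolf (log-based) interval. It is not the Cornfield approximation used in the study, and the counts are hypothetical values chosen only to show the arithmetic.

```python
import math


def odds_ratio_woolf(a, b, c, d, z=1.96):
    """Odds ratio with a Woolf (log-scale) 95% confidence interval.

    a, b: carriers and non-carriers among patients in one latent class
    c, d: carriers and non-carriers among controls
    Note: this is NOT the Cornfield approximation named in the text.
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from the study.
or_value, ci_lower, ci_upper = odds_ratio_woolf(a=20, b=80, c=10, d=90)
print(round(or_value, 2))  # 2.25
```

The Woolf interval is undefined when any cell is zero, which is one practical reason an alternative such as the Cornfield approximation may be preferred for sparse haplotype counts.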

RESULTS

The majority of the patients were female (67%) and of Caucasian ethnic origin (93%). The 3 most common individual clinical features were large joint involvement, small joint involvement, and arthritis involving >4 joints (Table 1). Of the 1,536 (2⁹ × 3) possible symptom profiles, only 175 were observed.

Statistically significant improvements in fit were obtained for 5-class to 8-class models. Table 2 summarizes the changes in the likelihood ratio statistic. There was less improvement when moving from 7 to 8 classes, and the 8-class solution yielded some classes with very low prevalences (3 classes <5%), suggesting that they were too small to be meaningful. For these reasons, the 7-class solution is presented.

Table 3 presents the probabilities of the different symptoms, conditional on latent class membership (the ρ parameters), and the probabilities of membership in each latent class (the γ parameters). Class prevalences ranged from 6% to 21%. Below, as an example, is a brief description of the characteristics of latent class 1.

Class 1 was composed of patients with symmetric polyarthritis affecting small joints, and probably large joints (with a probability of 0.84). Patients were also quite likely to be RF positive (0.56) and may have reported spinal pain (0.36), but were unlikely to be ANA positive (0.22) or to have a fever (0.08) and did not have psoriasis. They were most likely to be in the oldest age group at disease onset. This class accounted for 11% of the patients.

When assigning subjects to latent classes, in some instances patients could be assigned almost unambiguously to one class; for example, patients with the symptom profile 1110110010 could be assigned to class 6 with a probability of 1. In other cases, more than one class assignment was compatible with the symptom profile; for example, patients with the symptom profile 1011010000 had non-zero probabilities for 3 of the 7 latent classes (P[class 2] = 0.545, P[class 3] = 0.003, P[class 6] = 0.452). In total, 461 patients (81%) could be assigned to a particular class with a probability of at least 0.7.
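The two examples above can be checked with a short Bayes' rule calculation from the parameter estimates in Table 3. This is a sketch, not the Appendix A derivation itself: the function names `posterior` and `assign` are hypothetical, and the profile string is assumed to list the nine binary factors in Table 1 order followed by the age code.

```python
# Conditional symptom probabilities from Table 3, one row per latent class (LC 1..7).
# Columns: large joints, small joints, polyarthritis, ANA, symmetric arthritis,
# spinal pain, fever, RF, psoriasis.
RHO_BINARY = [
    [0.841, 1.000, 1.000, 0.216, 1.000, 0.360, 0.080, 0.556, 0.000],
    [1.000, 0.166, 0.092, 1.000, 0.163, 0.063, 0.010, 0.020, 0.030],
    [0.934, 0.961, 1.000, 0.052, 0.932, 0.407, 0.626, 0.000, 0.000],
    [0.347, 1.000, 0.697, 0.000, 0.000, 0.070, 0.029, 0.000, 0.601],
    [0.110, 0.761, 0.521, 0.466, 0.686, 0.000, 0.000, 0.000, 0.000],
    [1.000, 0.946, 1.000, 0.502, 0.547, 0.300, 0.021, 0.000, 0.132],
    [0.929, 0.251, 0.102, 0.000, 0.165, 0.224, 0.175, 0.033, 0.000],
]
# Age-at-onset probabilities per class: 0-4 years, 5-9 years, 10-16 years.
RHO_AGE = [
    [0.146, 0.226, 0.628],
    [0.612, 0.273, 0.115],
    [0.638, 0.238, 0.124],
    [0.041, 0.350, 0.609],
    [0.512, 0.378, 0.110],
    [0.640, 0.250, 0.110],
    [0.344, 0.399, 0.257],
]
# Class membership probabilities from Table 3.
GAMMA = [0.109, 0.172, 0.192, 0.058, 0.091, 0.166, 0.211]


def posterior(profile):
    """Posterior class probabilities for a 10-character symptom profile."""
    joint = []
    for gamma, rho, rho_age in zip(GAMMA, RHO_BINARY, RHO_AGE):
        p = gamma
        for bit, r in zip(profile[:9], rho):
            p *= r if bit == "1" else 1.0 - r
        p *= rho_age[int(profile[9])]
        joint.append(p)
    total = sum(joint)
    if total == 0:
        raise ValueError("profile has zero probability under every class")
    return [p / total for p in joint]


def assign(profile, cutoff=0.7):
    """Return the class number (1-7) if its posterior meets the cutoff, else None."""
    post = posterior(profile)
    best = max(range(len(post)), key=post.__getitem__)
    return best + 1 if post[best] >= cutoff else None


print([round(p, 3) for p in posterior("1011010000")])
print(assign("1110110010"), assign("1011010000"))
```

For profile 1011010000 the sketch gives posterior probabilities of about 0.545, 0.003, and 0.452 for classes 2, 3, and 6, matching the values quoted above, so that profile falls below the 0.7 cutoff and is left unassigned; profile 1110110010 is assigned to class 6.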

Table 2. Model fitting

No. of     Likelihood ratio   Change in          Degrees of   Change in
classes    statistic          likelihood ratio   freedom      degrees of freedom
5          415.9                                 1,476
6          363.8              52.1               1,464        12
7          310.2              53.6               1,452        12
8          282.2              28.0               1,440        12

Differences in sex distribution were found between the 7 derived latent classes. Approximately similar numbers of males and females were seen in 3 classes (class 3, class 4, and class 7), whereas a female predominance was seen in the other 4 classes. Chronic uveitis was present in at least 20% of subjects in 3 latent classes (class 2, class 5, and class 6), but was present in <5% of subjects in the other 4. However, there was no difference in the distribution of subjects with acute anterior uveitis; this symptom was rare, with an overall prevalence of just 3%. The percentage of patients positive for HLA–B27 varied between latent classes, reaching a frequency of ~20% in classes 4, 5, and 7 and of <10% in the remaining classes.

Of the “unclassifiable” subjects, 65 (59%) had symptom profile 11101x000y (where x and y could be any value), i.e., they had symmetric polyarthritis involving large and small joints, were ANA and RF negative, had no psoriasis or fever, but may or may not have had spinal pain (x) and could have been in any age group at disease onset (y). These subjects were most likely to belong to latent classes 1, 3, or 6. Other unclassifiable subjects all had rare symptom profiles occurring in no more than 1% of patients.

Table 3. Parameter estimates for the 7–latent class (LC) solution

                                     Conditional probabilities of symptoms (ρ)
Factor*               Category       LC 1    LC 2    LC 3    LC 4    LC 5    LC 6    LC 7
Large joints          Present        0.841   1.000   0.934   0.347   0.110   1.000   0.929
Small joints          Present        1.000   0.166   0.961   1.000   0.761   0.946   0.251
Polyarthritis         Present        1.000   0.092   1.000   0.697   0.521   1.000   0.102
ANA                   Present        0.216   1.000   0.052   0.000   0.466   0.502   0.000
Symmetric arthritis   Present        1.000   0.163   0.932   0.000   0.686   0.547   0.165
Spinal pain           Present        0.360   0.063   0.407   0.070   0.000   0.300   0.224
Fever                 Present        0.080   0.010   0.626   0.029   0.000   0.021   0.175
RF                    Present        0.556   0.020   0.000   0.000   0.000   0.000   0.033
Psoriasis             Present        0.000   0.030   0.000   0.601   0.000   0.132   0.000
Age at onset          0–4 years      0.146   0.612   0.638   0.041   0.512   0.640   0.344
                      5–9 years      0.226   0.273   0.238   0.350   0.378   0.250   0.399
                      10–16 years    0.628   0.115   0.124   0.609   0.110   0.110   0.257

Probability of membership
in each class (γ)                    0.109   0.172   0.192   0.058   0.091   0.166   0.211

* ANA = antinuclear antibody; RF = rheumatoid factor.

Table 4. Comparison of International League of Associations for Rheumatology (ILAR) classification and derived latent classes (LC)*

ILAR classification        LC 1   LC 2   LC 3   LC 4   LC 5   LC 6   LC 7   Unclassifiable   Total
Systemic arthritis            1      1     61      1      0      0     19          2            85
Oligoarthritis                0     77      0      2     19      1     55          7           161
Extended oligoarthritis       1      8      0      1      9     35      7         28            89
Polyarthritis, RF−†           2      3      8      0     14     19      7         59           112
Polyarthritis, RF+           30      0      0      0      0      0      0          0            30
Enthesitis related            0      2      0      1      2      2     24          9            40
Psoriatic arthritis           0      5      0     21      0     13      0          4            43
Other                         1      1      0      1      0      0      7          2            12
Total                        35     97     69     27     44     70    119        111           572

* Values are the number of patients.
† RF = rheumatoid factor.

Table 4 presents a cross-tabulation of the ILAR classification with the derived latent classes. There was some correspondence between the latent classes and the ILAR categories, but they did not coincide completely. Latent class 3 consisted mainly of patients classified as having “systemic arthritis” under the ILAR classification. However, one-fourth of those classified as having “systemic arthritis” under the ILAR classification were assigned to other latent classes. These patients differed from the main group of ILAR “systemic arthritis” patients by their pattern of joint involvement. Unlike the majority of the “systemic arthritis” patients, very few of these patients had symmetric polyarthritis with spinal pain, and only a minority had small joint involvement.

The majority of patients with an ILAR classification of “psoriatic arthritis” were in latent class 4, although one-third of those classified as having “psoriatic arthritis” by the ILAR classification were assigned to latent class 6. In comparison with those “psoriatic arthritis” patients in class 4, those in latent class 6 all had large joint involvement and could be ANA positive, whereas those in class 4 were all negative for ANA and none had arthritis with a symmetric distribution.

Latent class 2 consisted mostly of patients classified as having “oligoarthritis” by the ILAR classification. However, the remaining 50% of those classified as having “oligoarthritis” by the ILAR classification were split mainly between 2 other latent classes (12% in class 5 and 33% in class 7). The main difference between the “oligoarthritis” patients in latent classes 2 and 7 was their ANA status; to be in class 2, patients had to be ANA positive, whereas this would have excluded them from class 7. Patients in latent class 5 differed from patients in classes 2 and 7 by the characterization of their joint involvement (symmetric arthritis with only small joint involvement).

More than half of the “unclassifiable” patients were classified as “polyarthritis, RF negative” by the ILAR system, while most of the remainder were in the ILAR “extended oligoarthritis” category.
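The proportions discussed in this comparison can be recomputed directly from the counts in Table 4. The sketch below simply stores the table and derives row shares; the names `TABLE4`, `COLUMNS`, and `row_shares` are hypothetical.

```python
COLUMNS = ["LC1", "LC2", "LC3", "LC4", "LC5", "LC6", "LC7", "Unclassifiable"]

# Patient counts from Table 4: ILAR category -> counts per latent class column.
TABLE4 = {
    "Systemic arthritis":      [1, 1, 61, 1, 0, 0, 19, 2],
    "Oligoarthritis":          [0, 77, 0, 2, 19, 1, 55, 7],
    "Extended oligoarthritis": [1, 8, 0, 1, 9, 35, 7, 28],
    "Polyarthritis, RF-":      [2, 3, 8, 0, 14, 19, 7, 59],
    "Polyarthritis, RF+":      [30, 0, 0, 0, 0, 0, 0, 0],
    "Enthesitis related":      [0, 2, 0, 1, 2, 2, 24, 9],
    "Psoriatic arthritis":     [0, 5, 0, 21, 0, 13, 0, 4],
    "Other":                   [1, 1, 0, 1, 0, 0, 7, 2],
}


def row_shares(category):
    """Share of an ILAR category's patients falling in each latent class column."""
    counts = TABLE4[category]
    total = sum(counts)
    return {col: n / total for col, n in zip(COLUMNS, counts)}


total_patients = sum(sum(counts) for counts in TABLE4.values())
unclassifiable = sum(counts[-1] for counts in TABLE4.values())
print(total_patients, unclassifiable)

for category in ("Systemic arthritis", "Oligoarthritis", "Psoriatic arthritis"):
    shares = row_shares(category)
    print(category, {col: round(s, 2) for col, s in shares.items() if s > 0})
```

Reading the rows this way shows, for instance, what fraction of ILAR “oligoarthritis” patients fall in latent classes 2, 5, and 7, and which ILAR categories contribute most of the unclassifiable patients.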

Of the 402 Caucasian patients with information available for all 3 HLA loci, haplotypes could be assigned on the basis of DRB1, DQA1, and DQB1 genotypes for 359 patients and, of those, 296 patients could be assigned to a particular latent class, with a probability of at least 0.7. Haplotypes could also be assigned to 131 control subjects. When examining the prevalences of haplotypes between the latent classes, significant differences were seen for 3 haplotypes (DRB1*04-DQA1*03-DQB1*03, DRB1*13-DQA1*01-DQB1*06, and DRB1*08-DQA1*0401-DQB1*0402). To illustrate these differences, the prevalence of the haplotypes within each latent class was compared with that within the control group, and results are presented in Table 5 as odds ratios with corresponding 95% confidence intervals.

DISCUSSION

We have applied the statistical method of LCA to investigate underlying disease subtypes that may explain the distinct patterns of symptoms observed in patients with JIA. Results are presented for a model involving 7 underlying classes. When subjects were assigned to latent classes, some correspondence between the latent classes and ILAR categories was found, but there were also substantial differences. The frequencies of several HLA haplotypes were found to differ markedly between the latent classes.

The diagnosis of JIA is given to children with idiopathic arthritis and a heterogeneous set of related clinical features. Identification of subtypes, representing homogeneous disease populations within the umbrella of JIA, will be important for a clearer understanding of the disease. The ILAR classification of JIA (9,10), based on previous classification systems (1,2,6–8), is an attempt to do this, but it is recognized that validation and modification of this classification system is ongoing (20).

Since JIA is thought to encompass several distinct disease subtypes, LCA is an appropriate statistical method to use. The method fits the model of underlying subtypes that best explain the observed patterns of concomitant occurrence of symptoms. The LCA approach differs from a conventional classification system such as the ILAR system, which is based on easily applied rules and is useful to the clinician in categorizing patients. LCA does not provide a set of rules by which any JIA patient can be classified, although for any patient the most likely latent subtype(s) can be determined. (In this application most subjects [81%] could be assigned with some certainty [probability ≥0.7] to a particular latent class.) This is in some ways more realistic, since it is likely that some patterns of symptoms at presentation may be consistent with more than one disease subtype, but it may be of less use in a clinical setting.

The LCA approach provides a means for objective analysis of the pattern of symptom profiles, and comparison of the results with those obtained with currently used subtyping systems may indicate weaknesses in the existing classification system. It is of note that, although symptoms such as fever, spinal pain, psoriasis, and RF status were included in our analysis, none of these were defining features of any of the latent classes, in contrast with the ILAR system. Instead, the latent class solution focused more on patterns of joint involvement and ANA positivity, suggesting that these may be additional features that could be included in future sets of classification criteria.

Table 5. Odds ratios associated with possession of HLA haplotypes (DRB1/DQA1/DQB1) in Caucasian juvenile idiopathic arthritis (JIA) patients versus Caucasian controls, by statistically derived latent class (LC)

                               Odds ratio (95% confidence interval)
HLA haplotype                  LC 1 (n = 23)     LC 2 (n = 57)     LC 3 (n = 43)     LC 4 (n = 22)     LC 5 (n = 30)     LC 6 (n = 45)     LC 7 (n = 76)     P*
DRB1*04-DQA1*03-DQB1*03        3.24 (1.3–8.0)†   0.13 (0.1–0.4)†   0.40 (0.2–0.9)†   0.65 (0.2–1.7)    0.19 (0.1–0.6)†   0.27 (0.1–0.7)†   0.70 (0.4–1.3)    <0.0001
DRB1*13-DQA1*01-DQB1*06        0.16 (0.0–1.0)†   2.21 (1.1–4.3)†   0.17 (0–0.7)†     0.78 (0.3–2.4)    0.70 (0.3–1.9)    1.28 (0.6–2.8)    0.41 (0.2–0.9)†   <0.0001
DRB1*08-DQA1*0401-DQB1*0402    1.98 (0.0–9.3)    9.62 (3.7–25)†    3.38 (1.1–11)†    2.08 (0.0–9.8)    7.58 (2.5–23)†    7.58 (2.7–21)†    1.47 (0.5–4.7)    0.002

* Corresponding P value for Pearson’s chi-square test of the difference in haplotype frequencies across latent classes. Results are only shown for haplotypes where there was evidence of a difference between statistically derived JIA classes.
† P < 0.05.

We examined further the very strong association between HLA haplotype DRB1*08-DQA1*0401-DQB1*0402 and latent class 2 (odds ratio 9.6 [Table 5]), which suggests that the LCA may have successfully identified a genetically homogeneous group. From Table 4 it can be seen that latent class 2 consists mainly of subjects classified by the ILAR system as having “oligoarthritis,” but more than half of the “oligoarthritis” patients were in different latent classes, mainly in class 7. The haplotype was almost 3 times as frequent in patients in class 2 as in the “oligoarthritis” patients in class 7 (32% versus 12%; P = 0.04 by Fisher’s exact test). From Table 3 it can be seen that the main difference between latent classes 2 and 7 lies in the ANA status, which must be positive in class 2 and negative in class 7.

A number of methodologic issues should beconsidered in interpreting these results. First, LCA is anobjective analytic method, but the results may be sensi-tive to the choice of which factors are used in theanalysis. Other clinical features, such as presence of arash or family history data, could have been included asalternatives. The number of possible symptom profilesincreases exponentially as additional features are in-cluded in the analysis, so without a much larger samplesize of patients available for study, the number offeatures included in the analysis should not be increased.A further option would have been to include HLA dataas a feature in the LCA model. However, this would bevery difficult to do practically, since it would be neces-sary to reduce the HLA genotype to a small number ofcategories. Furthermore, it was one of our aims toinvestigate whether latent classes showed genetic hetero-geneity between groups at the HLA locus.

Second, the choice of the number of classes to include in the model can, in theory, be decided on purely statistical grounds using the likelihood ratio test. However, the validity of this test rests on the chi-square distribution of the likelihood ratio statistic in large samples. Although we had a sample size of 572 patients, because of the large number of symptom profiles, P values obtained from the chi-square distribution are unlikely to be correct. This leads to difficulties when choosing between models. Our choice of a solution with 7 latent classes was therefore not based on purely statistical grounds, and should not be regarded as evidence that there are precisely 7 categories in JIA.
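A rough sketch of why the chi-square approximation is doubtful here: the Python code below counts the free parameters of a latent class model and the residual degrees of freedom of the goodness-of-fit test, then compares the sample size with the number of cells. It again assumes 10 binary variables, which is a simplification, and assumes an identifiable, unconstrained model; the function names are hypothetical.

```python
from math import prod


def lca_free_parameters(levels, n_classes):
    """Free parameters of an unconstrained latent class model.

    (n_classes - 1) class prevalences, plus, within each class,
    (categories - 1) response probabilities for each variable.
    """
    return (n_classes - 1) + n_classes * sum(k - 1 for k in levels)


def lr_degrees_of_freedom(levels, n_classes):
    """Residual df of the likelihood ratio goodness-of-fit statistic."""
    return prod(levels) - 1 - lca_free_parameters(levels, n_classes)


n_patients = 572      # sample size reported in the paper
levels = [2] * 10     # assumption: 10 variables, each treated as binary

for m in range(5, 9):  # the 5- to 8-class models considered in the paper
    print(f"{m} classes: {lca_free_parameters(levels, m)} parameters, "
          f"{lr_degrees_of_freedom(levels, m)} df")

print(f"mean expected count per cell: {n_patients / prod(levels):.2f}")
```

With these assumptions the 7-class model has 76 free parameters and 947 residual degrees of freedom, while the average expected count per cell is well below 1, far too sparse for the large-sample chi-square distribution to be relied on.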

This exploratory LCA provides a novel approach to the task of identifying homogeneous subtypes within the umbrella of JIA. To investigate the general applicability of the solution, we intend to apply the method to a set of JIA patients from other European countries. Our expectation is that latent classes with similar characteristics, but possibly different frequencies, will be found. The biologic significance of the subtypes identified will be demonstrated only by 1) finding genetic homogeneity within subgroups and heterogeneity between subgroups, or 2) showing differences between the subgroups with respect to disease course. Genetic differences between the latent classes at the HLA locus have already been demonstrated. In further work, other candidate genes for involvement in JIA will be examined, and, in the long term, outcome across the latent classes will be compared.


APPENDIX A: CALCULATION OF THE POSTERIOR PROBABILITY OF MEMBERSHIP IN A LATENT CLASS

Let $X = [x(1), x(2), \ldots, x(n)]$ be the vector of $n$ symptom variables on which the analysis is based, where $x(i)$ can take values $0, 1, \ldots, r_i$, e.g., 0/1 for absent/present. Let $C_1, C_2, \ldots, C_m$ be the $m$ latent classes, and $\gamma_j$ be the prevalence of class $C_j$, $j = 1, \ldots, m$. Denote by $\rho_{ijx}$ the probability that the symptom variable $x(i)$ takes the value $x$, for patients in class $C_j$, $i = 1, \ldots, n$, $j = 1, \ldots, m$, and $x = 0, 1, \ldots, r_i$.

The posterior probability of membership in class $C_s$, given a particular symptom profile $x(1), x(2), \ldots, x(n)$, is given by

\[
P\bigl(C_s \mid X = [x(1), \ldots, x(n)]\bigr)
= \frac{P\bigl(X = [x(1), \ldots, x(n)] \mid C_s\bigr)\, P(C_s)}{P\bigl(X = [x(1), \ldots, x(n)]\bigr)}
= \frac{\gamma_s \prod_{i=1}^{n} \rho_{i s x(i)}}{\sum_{j=1}^{m} \gamma_j \prod_{i=1}^{n} \rho_{i j x(i)}}
\]

A program was written in C++ to calculate the posterior probabilities of membership in each of the latent classes for all observed symptom profiles.
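The authors' program is not reproduced here. As a minimal sketch of the calculation described in this appendix, the Python function below applies Bayes' theorem to the class prevalences and the class-specific response probabilities. The function name, the argument layout (class first, then variable, then value), and the parameter values in the example are all hypothetical and are not estimates from the paper.

```python
def posterior_class_probabilities(profile, prevalence, response_prob):
    """Posterior probability of each latent class given one symptom profile.

    profile[i]             observed value of symptom variable i
    prevalence[j]          prevalence of latent class j
    response_prob[j][i][x] probability that variable i takes value x in class j
    """
    joint = []
    for class_prevalence, class_probs in zip(prevalence, response_prob):
        # Prevalence times the product of response probabilities: symptoms are
        # treated as independent within a class, as the appendix formula implies.
        value = class_prevalence
        for i, x in enumerate(profile):
            value *= class_probs[i][x]
        joint.append(value)
    total = sum(joint)  # marginal probability of the profile (the denominator)
    return [value / total for value in joint]


# Hypothetical 2-class model with 3 binary symptoms (0 = absent, 1 = present).
prevalence = [0.6, 0.4]
response_prob = [
    [[0.8, 0.2], [0.7, 0.3], [0.9, 0.1]],  # class 1
    [[0.3, 0.7], [0.4, 0.6], [0.5, 0.5]],  # class 2
]
print(posterior_class_probabilities([1, 1, 0], prevalence, response_prob))
```

Each patient can then be assigned to the class with the highest posterior probability, or the full vector of probabilities can be kept to reflect uncertainty in the assignment.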
