Measurement properties of clinical assessment methods for ... · Measurement Properties of Clinical...
Transcript of Measurement properties of clinical assessment methods for ... · Measurement Properties of Clinical...
American Journal of Medical Genetics Part C (Seminars in Medical Genetics) 175C:116–147 (2017)
R E S E A R C H R E V I E W
Measurement Properties of Clinical AssessmentMethods for Classifying Generalized JointHypermobility—A Systematic ReviewBIRGIT JUUL-KRISTENSEN,* KAROLINE SCHMEDLING, LIES ROMBAUT, HANS LUND,AND RAOUL H. H. ENGELBERT
Dr. Birgit JuFunction and P
Karoline SchUniversity Coll
Dr. Lies RomDr. Hans Lun
Physiotherapy,Clinical BiomeNorway.
Dr. Raoul EnFaculty of Hea
Conflicts ofappearance of
*CorresponBiomechanics,
DOI 10.100Article first
� 2017 Wil
The purpose was to perform a systematic review of clinical assessment methods for classifying Generalized JointHypermobility (GJH), evaluate their clinimetric properties, and perform the best evidence synthesis of thesemethods. Four test assessment methods (Beighton Score [BS], Carter and Wilkinson, Hospital del Mar, Rotes-Querol) and two questionnaire assessmentmethods (Five-part questionnaire [5PQ], Beighton Score-self reported[BS-self]) were identified on children or adults. Using the Consensus-based Standards for selection of healthMeasurement Instrument (COSMIN) checklist for evaluating themethodological quality of the identified studies,all included studies were rated “fair” or “poor.”Most studies were using BS, and for BS the reliability most of thestudies showed limited positive to conflicting evidence, with some shortcomings on studies for the validity. Thethree other test assessment methods lack satisfactory information on both reliability and validity. For thequestionnaire assessment methods, 5PQ was the most frequently used, and reliability showed conflictingevidence, while the validity had limited positive to conflicting evidence comparedwith test assessmentmethods.For BS-self, the validity showed unknown evidence compared with test assessment methods. In conclusion,following recommended uniformity of testing procedures, the recommendation for clinical use in adults is BSwith cut-point of 5 of 9 including historical information, while in children it is BS with cut-point of at least 6 of 9.However, more studies are needed to conclude on the validity properties of these assessment methods, andbefore evidence-based recommendations can be made for clinical use on the “best” assessment method forclassifying GJH. © 2017 Wiley Periodicals, Inc.
KEYWORDS: Beighton tests; Carter andWilkinson; Rotes-Querol; Hospital delMar; five-part questionnaire; self-reported; clinimetrics; qualityassessment; COSMIN; best evidence synthesis
How to cite this article: Juul-Kristensen B, Schmedling K, Rombaut L, Lund H, Engelbert RHH. 2017.Measurement properties of clinical assessment methods for classifying generalized joint hypermo-
bility—A systematic review. Am J Med Genet Part C Semin Med Genet 175C:116–147.
INTRODUCTION
Generalized joint hypermobility (GJH)is relatively common, occurring inabout 2–57% of different populations[Remvig et al., 2007b]. Important
ul-Kristensen, Associate professor,hysiotherapy, University of Southermedling, M.Sc., PT, Department oege, Bergen, Norway.baut, Postdoctoral researcher, PT,d, Associate professor, PT, DepartmUniversity of Southern Denmark, Ochanics, University of Southern Den
gelbert, Department of Rehabilitatilth, Center for Applied Research, Uinterest: There are no other financiala conflict of interest with regards tdence to: Birgit Juul-Kristensen, ResUniversity of Southern Denmark, C2/ajmg.c.31540published online in Wiley Online Lib
ey Periodicals, Inc.
reasons for this may be the use ofmany different clinical assessment meth-ods and criteria for classification andinterpretation of GJH by these clinicalassessment methods [Remvig et al.,2007a,b]. GJH is characterized by an
PT, Department of Sports Sciences and Clinical Bion Denmark, Odense, Denmark.f Health Sciences, Institute of Occupational Thera
Center for Medical Genetics, Ghent University Hospent of Sports Sciences and Clinical Biomechanics, R
dense, Denmark; SEARCH (Synthesis of Evidence andmark, Odense, Denmark; Center for Evidence-Base
on, Academic Medical Center, University of Amsterniversity of Applied Sciences, Amsterdam, the Nethinterests that any of the authors may have, which coo the work.earch Unit for Musculoskeletal Function and Physiotampusvej 55, DK-5230 Odense M, Denmark. E-ma
rary (wileyonlinelibrary.com).
ability to exceed the joints beyond thenormal range of motion in multiplejoints, either congenital or acquired[Remvig et al., 2011]. Many individualswith GJH are asymptomatic, which alsomakes it difficult to accurately estimate
mechanics, Research Unit of Musculoskeletal
py, Physiotherapy and Radiography, Bergen
ital, Ghent, Belgium.esearch Unit of Musculoskeletal Function andResearch), Department of Sports Science and
d Practice, Bergen University College, Bergen,
dam, Amsterdam, the Netherlands; ACHIEVE,erlands.uld create a potential conflict of interest or the
herapy, Institute of Sports Science and Clinicalil: [email protected]
RESEARCH REVIEW AMERICAN JOURNAL OF MEDICAL GENETICS PART C (SEMINARS IN MEDICAL GENETICS) 117
the number of people with this condi-tion, as they are not recorded in thehealth system.
When GJH is accompanied withsymptoms, it is defined as a health-relateddisorder, for example, Joint Hypermo-bility Syndrome (JHS) or the Ehlers–Danlos Syndrome—Hypermobile Type(hEDS) with several complications asdescribed below. The two conditions(JHS and hEDS) have very close overlapto the point of being clinically indistin-guishable [Tinkle et al., 2009; Remviget al., 2011], and in the present study it isreferred to as JHS/hEDS. The conditionof JHS/hEDScan bedefined as anunder-and often misdiagnosed heritable con-nective tissue disorder, characterizedgenerally by GJH, complications of jointinstability, musculoskeletal pain, skininvolvement, and reduced quality of life[Rombaut et al., 2010; Castori et al.,2014; Scheper et al., 2016]. Until now,JHS is diagnosed by the Brighton testsand criteria [Grahame et al., 2000], andhEDS by the Villefranche criteria[Beighton et al., 1998], both includingthe Beighton scoring (BS) system of ninetests for assessment of GJH [Beightonet al., 1973].
BS consists of four bilateral testsand one test including low back andlower extremities (first finger opposi-tion, fifth finger extension, elbowextension, knee extension, and backforward bending), with scores rangingfrom 0 to 9. Influencing factors on BSare age, gender, ethnicity, and physicalfitness [Remvig et al., 2007b; Tinkleet al., 2009]. For adults, a cut-point of4/9 for GJH is included in theBrighton criteria for JHS [Grahameet al., 2000], while 5/9 for GJH is thecriteria for hEDS in the Villefranchecriteria [Beighton et al., 1998]. Forchildren, there is no consensus on aspecific cut-point for GJH, but cut-points of 5/9, 6/9, and 7/9 have beensuggested [Jansson et al., 2004]. Aprevious review has listed differenttest assessment methods of which theBeighton score [Beighton et al.,1973] was most frequently used. Thereview concluded that reproducibilityof Beighton or similar tests is good, butthere is lack of evidence for the validity
of this test assessment method [Remviget al., 2007a].
For adults, a cut-point of 4/9for GJH is included in theBrighton criteria for JHS,while 5/9 for GJH is thecriteria for hEDS in theVillefranche criteria. For
children, there is no consensuson a specific cut-point for
GJH, but cut-points of 5/9,6/9, and 7/9 have been
suggested.
Further, also questionnaire assess-ment methods are used for classifyingGJH, among which the five-part ques-tionnaire (5PQ) [Hakim and Grahame,2003; Mulvey et al., 2013]. The 5PQ, sofar used only for adults, consists of fivequestions, including actual and historicalinformation about joint hypermobility(forward bending of the back, first fingeropposition, the ability to amuse friendswith strange body shapes, dislocation ofshoulder/knee, perception of beingdouble-jointed). The 5PQ is claimedto have good reproducibility, in additionto satisfactory sensitivity and specificity[Hakim and Grahame, 2003]. However,clinimetric properties (reliability, differ-ent aspects of validity, and responsive-ness) have not been described fully forBS, 5PQ, or other potential clinicalassessment methods for classifying GJH.
Clear and valid diagnostic clinicalassessment methods and criteria forclassifying GJH with or without symp-toms are essential, both for diagnosingJHS/hEDS and measuring treatmenteffects of JHS/hEDS, in children[Scheper et al., 2013] as well in adults[Palmer et al., 2014; Smith et al., 2014;Scheper et al., 2016]. In summary, there islack of knowledge of clinimetric proper-ties on clinical assessment methods forclassifying GJH. Therefore, the purposeof this study was to perform a systematic
review for identifying the clinical assess-ment methods for classifying GJH, toevaluate their clinimetric properties(reliability and validity), and finally tosummarize the best evidence synthesis ofthese clinical assessment methods.
MATERIALS ANDMETHODS
This systematic review followed thePreferred Reporting Items for System-atic Reviews and Meta-Analysis state-ment guidelines, PRISMA, [Moheret al., 2009] and used the PICOSmethodto present the chosen research questions:Participants (humans with GJH andhealthy controls, ranging from childhoodto adults), Intervention (assessmentmethods for evaluation and classificationof GJH), Comparison (e.g., healthycontrol groups), Outcomes (reliability/validity), and Study design (e.g., reliabil-ity/case control/longitudinal studies).
The overall method used in thisreview can be divided into four steps: (1)Compile an exhaustive list of assessmentmethods for GJH on the basis of aninitial search (Search 1); (2) Additionalsearch for studies including clinimetricsof the identified assessment methodsfrom Search 1 (Search 2); (3) Criticallyappraisal of the methodological qualityof the identified measurement proper-ties in each study; and (4) synthesizing ofthe evidence as “a best evidencesynthesis.”
Selection Criteria
Search 1With restrictions on the date of publi-cation (January 01, 1965 to Decem-ber 31, 2015), humans and English, theincluded articles had to meet thefollowing criteria: (1) be originallypublished in peer-reviewed journalsinvolving human participants; (2) in-clude a clinical assessment method (testor questionnaire) to classify GJH; and (3)be reported in English. Studies wereexcluded if they: (1) contained otheradvanced assessment methods used asprimary assessment method and not as areference assessment; (2) were reviews,abstracts, theses, unpublished studies
118 AMERICAN JOURNAL OF MEDICAL GENETICS PART C (SEMINARS IN MEDICAL GENETICS) RESEARCH REVIEW
(“gray literature”); or (3) were animalstudies.
Search 2By using the names of the differentassessment methods found in Search 1,Search 2 was initiated, and the articleswere included if they: (1) explicitlyoutlined a purpose for evaluating clini-metric properties of an assessmentmethod (test or questionnaire) forclassifying GJH; and (2) included at leastone of the clinimetric properties ofreliability, validity, and responsiveness.To avoid confusion in relation to theterminology of clinimetric properties,this study relates to the COSMINterminology, including reliability (reli-ability and measurement error), validity(criterion validity and hypotheses test-ing), and responsiveness [Schellingerh-out et al., 2008; Mokkink et al., 2010].
Search Strategy and Data Sources
Search 1 (production of a list of clinicalassessment methods)The systematic review was performed byelectronic and manual searches in CI-NAHL (n¼ 153), Embase (n¼ 1,027),SportDiscus (n¼ 272), and MEDLINE(n¼ 833). Furthermore, reference lists ofrelevant articles were hand-searched foradditional literature, and the authorsconferred with experts within the fieldof GJH, in order to make sure no relevantarticles would be missing. In each of thefour databases, the following search termswere used for the Electronic Search 1 forproducing a method list: (joint�; hyper-mobility; instability; laxity; general�)AND(evaluation�; rating�; rate�; questionnaire�;test�; scale�; assess�; examin�; observ�;diagnos�; measure�) NOT (fracture�;surgical). The search terms were adjustedto the different databases where necessary.In all databases, the search fields includedtitle, abstract, and keywords.
Search 2 (identifying clinimetric properties)For the Electronic Search 2, using thesame databases as described in Search 1,and with a total of six identified assess-ment methods at this point, a total of 24searches were conducted; one in eachdatabase, on each assessment method.
The following search terms were used,for retrieving studies with clinimetricproperties on each of the six identifiedassessment methods: (psychometric�;clinimetric�; reproducibility; reliability;repeatability; responsiveness; sensitivity;specificity; validity; diagnos�; feasibility).When including the questionnaire as-sessment methods, the terms test� andtool� were left out. See PRISMA flowchart (Fig. 1) for the selection process.
Data Extraction and QualityAssessment
Two authors (KS/BJK) independentlyscreened the titles and abstracts, and agreedupon a final list of assessmentmethods tobeincluded in the current review. If therewere any disagreements, the full paper wasretrieved for detailed assessment, andconsensus was achieved. A third reviewer(RHE) was included if disagreement stillexisted. The handling of data wereperformed with the use of EndNoteWeb(https://www.myendnoteweb.com/), foreasy access and organizing of data. Screen-ing for additional references were per-formed based on the retrieved articles.
Eligible studies for each of theretrieved assessment methods were eval-uated by the Consensus-based Standardsfor the selection of health MeasurementInstruments (COSMIN) checklist forevaluating the methodological qualityon clinimetric properties—reliability,validity, and responsiveness [Mokkinket al., 2010]. The COSMIN checklist iscurrently the only recommended stan-dardized method [Terwee et al., 2012],and has been used in several differentstudies of clinical test assessmentmethods[Larsen et al., 2014; Kroman et al., 2014].The complete COSMIN checklist in-cludes 12 boxes, covering internal con-sistency, reliability, measurement error,validity, and responsiveness. The currentstudy used the reliability and validitydomains, in particular box B—reliability(14 items), box F—hypothesis testing (10items), and box H—criterion validity (7items). After inclusion of studies, thesewere grouped based on the clinimetricproperty assessed, in reliability (intra-,inter-tester and test-retest) and validity(hypothesis testing or criterion validity).
No studies on the responsiveness domainwere obtained.
A compiled list for the assessment ofthe item “minor methodological flaws”as used previously [Larsen et al.,2014] was included (no inclusion of thetarget population, only for reliabilitydomain; only one trial per measure-ment/lack of information on repetition;no random order of investigators/meas-urements; no description of any trainingphase; inadequate description on demo-graphic details). For the item “otherimportant methodological flaws” anadditional list was included (inadequatelydescribed/lacking of information aboutsubject eligibility criteria; doubt regard-ing the site of measurement). Finalscoring of the methodological qualityof each items, evaluated on a four-pointscoring system, (excellent, good, fair, andpoor methodological quality) was basedon the “worse score counts method” inthe checklist [Terwee et al., 2012].
Finally, a best evidence synthesis wasperformed, by compiling the assessmentof the methodological quality, the actualresults of the included studies, thenumber of studies, and the total samplesize, as outlined in connection to theCOSMIN evaluation [Terwee et al.,2007], and as also performed in a previousstudy [Kroman et al., 2014]. The rating ofthe best evidence synthesis ranged fromstrong, moderate, limited, positive/neg-ative, conflicting, or unknown. A notewas made whenever a study was rated“poor” due to only one single item. Inthe reliability domain this mainly con-cerned the item “only one measure-ment” (in box B), as often used in clinicalexaminations and always in questionnairestudies; in the validity domain studieswere mainly rated “poor” due to onerating based on the item “no informationon the measurement properties of thecomparator instrument(s)”; in addition, anote was made to studies rated “poor”due to “subject eligibility criteria inade-quately described/lacking.” Studies rated“poor” due to only one “poor” itemwere upgraded to “fair,” in line withprevious systematic reviews, describingthe limitations of the COSMIN whenused for clinical test assessment methods[Kroman et al., 2014; Larsen et al., 2014].
Figure 1. Flow diagram showing identification, screening, eligibility, and inclusion procedures.
RESEARCH REVIEW AMERICAN JOURNAL OF MEDICAL GENETICS PART C (SEMINARS IN MEDICAL GENETICS) 119
Studieswithmore thanone “poor”-rateditem were omitted from the final evi-dence synthesis.
RESULTS
Identification of ClinicalAssessment Methods
In total 2,285 references were identified,and after removal of duplicates 1,338references were included in the
screening procedure of titles and ab-stracts, of which 298 full-text articleswere eligible according to the inclusioncriteria. In Search 1, a total of sixprimary clinical assessment methods forclassifying GJH were identified, corre-sponding to four test assessment meth-ods (BS, Carter, and Wilkinson [CW],Hospital del Mar [HdM], Rotes-Querol[RQ]), and two questionnaire assess-ment methods (5PQ, Beighton Scoreself-reported [BS-self]). In Search 2, 163
references were identified, and afterremoval of duplicates 33 studies wereidentified describing the clinimetricproperties of the six clinical assessmentmethods (Fig. 1).
Clinimetric Properties
Methodological quality in relation toreliability of the four test assessmentmethods (BS, CW, HdM, and RQ) wasevaluated from eleven studies, and
120 AMERICAN JOURNAL OF MEDICAL GENETICS PART C (SEMINARS IN MEDICAL GENETICS) RESEARCH REVIEW
reliability of one of the two questionnaireassessment methods (5PQ)was evaluatedfrom two studies, while there were noreliability studies on BS-self (Table I).
Methodological quality in relation tovalidity of the four test assessmentmethods (BS, CW, HdM, RQ), wasevaluated fromtwenty studies, andvalidityof the two questionnaire assessmentmethods (5PQ, BS-self) was evaluatedfrom six studies (Table II).
All four tests were rated as havingpoor quality in all reliability studies, while64% (7 of 11) for BS, and 50% (1 of 2) forRQ could be upgraded to fair quality,when one rating (“only one measure-ment”) was omitted. Both studies of 5PQwere rated as having poor quality in test-retest reliability [Moraes et al., 2011;Bulbena et al., 2014], but both could beupgraded to fair. CW, RQ, and HdMwere all rated as having fair quality (3/3),while for BS 82% (14/17) were upgradedand thus rated as having fair quality. Allstudies on validity for the two question-naire assessment methods (for 5PQ: 5/5;for BS-self: 1/1) were upgraded, andtherefore, rated as fair (Table III).
As seen in Tables I and II, all testassessment methods BS, CW, HdM, andRQ, have been tested for reliability, andfor validity when compared with eachother. BS has further been tested fordifferent validity types, such as range ofmotion (ROM), associations with pain,injuries, and other diseases, whereasHdM has further been tested for validityon associations with shoulder injuries.5PQ is the only one of the questionnaireassessment methods that has been testedfor reliability, and for validity whencompared with test assessment methods,associations with pain, diseases, andanxiety. BS-self has only been testedfor validity compared with BS(Table III).
Best Evidence Synthesis: Levels ofEvidence
Test assessment methodsOf the 11 reliability studies for testassessment methods, 5 studies had poorratings on methodological quality[Bulbena et al., 1992; Mikkelsson et al.,1996; Hansen et al., 2002; Boyle et al.,
2003; Aslan et al., 2006], and since theycould not be upgraded to fair, they werenot included in the best evidence synthe-sis. For the only two studies includingintra-rater reliability on BS there waslimited positive evidence [Erkula et al.,2005; Hirsch et al., 2007] (Table IV).
For inter-rater reliability four stud-ies had limited positive evidence [Hickset al., 2003; Erkula et al., 2005; Hirschet al., 2007; Juul-Kristensen et al.,2007], while two studies had negativeevidence [Karim et al., 2011; Jungeet al., 2013], leaving the final evidence aslimited positive to conflicting evidence.A total of four out of the eleven studiesincluded children [Mikkelsson et al.,1996; Hansen et al., 2002; Erkula et al.,2005; Junge et al., 2013].
Validity of BS compared with othertest assessment methods showed limitedpositive to conflicting evidence in threestudies [Bulbena et al., 1992; Ferrariet al., 2005; Junge et al., 2013], whilecompared with ROM (trunk rotation,lower, and upper extremities) the validityin five studies showed limited negative toconflicting evidence (two on children)[Sauers et al., 2001; Erkula et al., 2005;Pearsall et al., 2006; Smits-Engelsmanet al., 2011; Naal et al., 2014]. Thevalidity for BS and the association withpain showed moderate positive to con-flicting evidence in five studies (all onchildren) [El-Metwally et al., 2004,2005, 2007, Tobias et al., 2013; Sohr-beck-Nøhr et al., 2014]. For the validityof BS and the association with injuriesthe validity showed conflicting evidencein three studies (one in children) [Rous-sel et al., 2009; Cameron et al., 2010;Junge et al., 2015]. For the validity of BSand the association with differentdiseases (Temporo-Mandibular Disor-ders, Chronic Fatigue Syndrome, Adhe-sive Capsulitis) there was limited positiveto conflicting evidence in three studies[Nijs et al., 2004; Hirsch et al., 2008;Terzi et al., 2013].
CW (almost similar to the BS), andRQ had unknown evidence for bothinter-rater reliability and validity com-pared with other test assessment methods[Bulbena et al., 1992], while HdMshowedunknown evidence for inter-raterreliability and validity in the association
with injuries, such as anterior shoulderdislocation [Bulbena et al., 1992; Chahalet al., 2010].
Questionnaire assessment methodsFor reliability 5PQ showed conflictingevidence in the two studies [Moraeset al., 2011; Bulbena et al., 2014]. Forthe validity 5PQ showed limited positiveto conflicting evidence compared withtest assessment methods (BS, HdM) inthe same two studies, while in theassociation with pain and tissue diseases(chronic widespread pain, JHS) 5PQshowed limited positive evidence in twostudies [Hakim and Grahame, 2003;Mulvey et al., 2013], and with anxiety itshowed unknown evidence [Sancheset al., 2014]. BS-self showed unknownevidence in the validity compared withBS in one study [Naal et al., 2014].
DISCUSSION
Four test assessment methods (BS, CW,HdM, RQ) and two questionnaireassessment methods (5PQ, BS-self)were identified for classifying GJH, inchildren and adults, with 33 studiesreporting their measurement properties.The four test assessment methods andone of the questionnaire assessmentmethods (5PQ) reported measurementproperties on both reliability and valid-ity. Most studies were on BS, and onlyBS and 5PQ reported aspects of validity.
Four test assessment methods(BS, CW, HdM, RQ) andtwo questionnaire assessmentmethods (5PQ, BS-self) wereidentified for classifying GJH,in children and adults, with33 studies reporting theirmeasurement properties.
The majority of the reliabilitystudies showed limited positive to con-flicting evidence for BS, and thus, mayseem acceptable to be used in clinical
TABLEI.
Inform
ationFrom
theIncluded
StudiesforCOSMIN
Sco
ringin
theReliabilityDomain
Ref/assessm
ent
metho
dStud
ysample/po
pulatio
nRegistratio
n/hand
ling
ofmissingdata
Design
Tim
einterval
Mainresult
Cut-offscore
Metho
dologicalflaws
Beighton(5
items)
Aslanet
al.(2006)
(BHJM
I)(þ
goniom
eter
for5th
finger,elbows,
knees[ly
ing])
n¼72
Asymp.
(meanage20
yrs,
range18–25)
Noinform
ationof
registratio
n/hand
ling
ofmissingdata
Intra-/inter-rater
Intra-rater:
mean12.84
(þ/�
7.41
days)
Inter-rater:
sameday
ICC
(mean)
Intra:0.92
Inter:0.82
%agreem
ent
Intra:86%
(category
scores)
43%
(com
positescores)
Inter:75%(category
scores)
42%(com
positescores)Com
posite
scores
categorized
(0–2:
cat.1)
(3–4:
cat.2)
(5–9:
cat.3)
Other
impo
rtant/minor
metho
dologicalflaws
(inadequately
described/
lackingof
info
abou
tsubject
eligibility
criteria,asym
p.,
norand
omorder)
Boyle
etal.(2003)
(BHJM
I)(þ
goniom
eter
for5th
finger,elbows,
knees[ly
ing])
n¼42
(intra)
n¼36
(inter)
Asymp.
(meanage25
yrs,
range15–45)
Noinform
ationof
registratio
n/hand
ling
ofmissingdata
Intra-
/inter-rater
Intra-rater:
1dayto
2weeks
apart
Inter-rater:sameday
orwith
in6days
%agreem
ent
Intra:81%
Inter:89%
Spearm
ansRho
(con
t.score)
Intra:0.81
Inter:0.87
(P<.0001)
Spearm
ansRho
(cat.scores)
Intra:.86
Inter:0.75
(P<.0001)
Com
posite
scores
categorized
(0–2:
cat.1)
(3–4:
cat.2)
(5–9:
cat.3)
Other
minor
metho
dological
flaws(asymp.,no
rand
omorder,lackingdemog
raph
icdetails)
Bulbena
etal.(1992)
n¼30
Symp.
(n¼20)
(meanage41
yrs)
Con
trol
(n¼10)
(meanage48
yrs)
Info
onmissingdata
provided,no
info
onthehand
lingof
the
data
Inter-rater
Noinform
ation
Kappa
(range)
0.79–0.93
NS
Other
minor
metho
dologicalflaws(lack
ofinfo
onrepetition,
norand
omorder,inadequate
demog
raph
icdetails)
Erkulaet
al.(2005)
n¼50
Asymp.
(children,
meanage10
yrs)
Noinform
ationof
registratio
n/hand
ling
ofmissingdata
Intra-
/inter-rater
Reassessm
ent
after2weeks
Spearm
ansRho
Intra:0.62
Inter:0.86
�7/9
(classified
assig
n.jointlaxity)
Other
impo
rtant/minor
metho
dologicalflaws
(sub
ject
eligibility
criteriainadequately
described,
asym
p.,
norand
omorder,no
training
phase,
noblinding
ofresults)
continued
RESEARCH REVIEW AMERICAN JOURNAL OF MEDICAL GENETICS PART C (SEMINARS IN MEDICAL GENETICS) 121
TABLEI.
(Continued)
Ref/assessm
ent
metho
dStud
ysample/po
pulatio
nRegistratio
n/hand
ling
ofmissingdata
Design
Tim
einterval
Mainresult
Cut-offscore
Metho
dologicalflaws
Hansenet
al.(2002)
(4/5
selected
tests
includ
ed,no
5th
finger)
n¼100
Asymp.
(children,
9–13
yrs)
Info
onmissingdata
provided,the
hand
lingof
thedata
canbe
dedu
ced
Inter-rater
Noinform
ation
Kappa
(range)
4tests:
0.44–0.82
(experts)
�0.40
(inexperienced/
parents)
NS
Other
minor
meth.
flaws
(asymp.,no
training
phase,
inadequate
demog
raph
icdetails,no
blinding)
Hicks
etal.(2003)
n¼63
Symp.
(meanage36
yrs,
range20-66)
Noinform
ationof
registratio
n/hand
ling
ofmissingdata
Inter-rater
15min
time-delay
betweenraters
ICC
(mean)
Pair1:
0.95
Pair2:
0.76
Pair3:
0.66
NS
Other
impo
rtant/minor
meth.
flaws
(inadequately
described/lackingof
info
abou
tsubject
eligibility
criteria,no
rand
omorder,
inadequate
demog
raph
icdetails)
Hirschet
al.(2007)
(+goniom
eter
forelbows,
knees)
n¼50
Asymp.
(meanage38
yrs,
range20–60)
(recruitedfrom
ageneraldental
practicesetting)
Info
onmissing
data
provided
(1subject),no
info
onthehand
lingof
thedata
Intra-/inter-rater
Anaverageof
24.6
days
betw
eenfirst
andfollow-up
exam
ICC
(mean)
Intra:>0.89
Inter:>0.84
Cronb
ach’salph
a(in
tra-
andinter-rater
agreem
ent)
Mean0.75
Median0.77
�4/9
Other
minor
meth.
flaws
(asymp.,no
rand
omorder)
Jungeet
al.(2013)
(2different
metho
ds)
n¼39
Asymp.
(schoo
lchildren,
aged
7–8and
10–12
yrs)
Noinform
ationof
registratio
n/hand
lingof
missing
data
Inter-rater
Approx.
30min
betweentesting
sessions
%agreem
ent
74–97%
(metho
dA)
72–97%
(metho
dB)
�5/9:82%
(A),80%
(B)
Kappa
(range)
0.49–0.94
(metho
dA)
0.30–0.84
(metho
dB)
�5/9:0.64
(A),0.59
(B)
�5/9
Noothermetho
dological
flaws
Juul-K
ristensenet
al.
(2007)
n¼40
Symp.
(meanage34
yrs)
Asymp.
Noinform
ationof
registratio
n/hand
lingof
missing
data
Inter-rater
Noinform
ation
Kappa
(range)
0.34–1.00
(curr)
0.60–1.00
(curr/hist)
�5/9:0
.66(curr),0.74
(curr/hist)
�5/9
Noothermetho
dological
flaws
continued
122 AMERICAN JOURNAL OF MEDICAL GENETICS PART C (SEMINARS IN MEDICAL GENETICS) RESEARCH REVIEW
TABLEI.
(Continued)
Ref/assessm
ent
metho
dStud
ysample/po
pulatio
nRegistratio
n/hand
ling
ofmissingdata
Design
Tim
einterval
Mainresult
Cut-offscore
Metho
dologicalflaws
(meanage46
yrs)
ICC
(mean):0.91(curr/
hist)
Karim
etal.(2011)
n¼30
Con
tempo
rary
Pro
dancers(m
eanage
24yrs,range
18–32)
Noinform
ationof
registratio
n/hand
lingof
missing
data
Inter-rater
Noinform
ation/
guidelineavailable
upon
requ
est
%agreem
ent
54–100%
Kappa
(mean)
0.60
NS
Other
minor
metho
dologicalflaws
(norand
omorder,no
descriptionof
blinding
ofresults)
Mikkelsson
etal.(1996)
n¼29
(inter)
n¼13
(intra)
Asymp.
scho
olchildren(m
eanages9
and11
yrs)
Noinform
ationof
registratio
n/hand
lingof
missing
data
Intra-/inter-rater
Intrarater:with
inthe
samelesson
(atthe
beginn
ingandat
theend)
Inter-
rater:du
ring
the
samelesson
Kappa
(mean)
Inter:0.78
Intra:0.75
ICC
(mean)
Inter:0.80
Intra:0.84
�6/9
Other
minor
metho
dologicalflaws
(asymp.,no
rand
omorder,no
training
phase,
inadequate
demog
raph
icdetails)
Carter&
Wilkinson(5
items,score0–5)
Bulbena
etal.(1992)
n¼30
Symp.
(n¼20)(m
ean
age41
yrs)
Con
trols(n¼10)
(meanage48
yrs)
Info
onmissingdata
provided,no
info
onthehand
lingof
thedata
Inter-rater
Noinform
ation
Kappa
(range)
0.68–0.92
NS
Other
minor
metho
dologicalflaws
(lack
ofinfo
onrepetition,
norand
omorder,inadequate
demog
raph
icdetails)
HospitaldelMar
(9/10items,score0–9)
Bulbena
etal.(1992)
(item
testing)
n¼30
Symp.
(n¼20)
(meanage41
yrs)
Con
trols(n¼10)
(meanage48
yrs)
Info
onmissingdata
provided,no
info
onthehand
lingof
thedata
Inter-rater
Noinform
ation
Kappa
(range)
0.61–1.00
NS
Other
minor
metho
dologicalflaws
(lack
ofinfo
onrepetition,
norand
omorder,inadequate
demog
raph
icdetails)
Rot�es-Q
u�erol(11items,score0–11)
Bulbena
etal.(1992)
n¼30
Symp.
(n¼20)
(meanage41
yrs)
Con
trols(n¼10)
(meanage48
yrs)
Info
onmissingdata
provided,no
info
onthehand
lingof
thedata
Inter-rater
Noinform
ation
Kappa
(range)
0.44–0.93
NS
Other
minor
metho
dologicalflaws
(lack
ofinfo
onrepetition,
norand
omorder,inadequate
demog
raph
icdetails)
Juul-K
ristensenet
al.
(2007)
n¼40
Symp.
Noinform
ationof
registratio
n/hand
lingof
missing
Inter-rater
Noinform
ation
Kappa
(range)
0.32–0.79
(curr)
0.31–0.80
(curr/hist)
Noothermetho
dological
flaws
continued
RESEARCH REVIEW AMERICAN JOURNAL OF MEDICAL GENETICS PART C (SEMINARS IN MEDICAL GENETICS) 123
TABLEI.
(Continued)
Ref/assessm
ent
metho
dStud
ysample/po
pulatio
nRegistratio
n/hand
ling
ofmissingdata
Design
Tim
einterval
Mainresult
Cut-offscore
Metho
dologicalflaws
(3items)
(meanage34
yrs)
Asymp.
(meanage46
yrs)
data
ICC
(mean):0.83
(curr/hist)
5-partqu
estio
nnaire
(5items,score0–5)
Bulbena
etal.(2014)
5-partqu
estio
nnaire
(þ2qu
estio
ns)
n¼33
Symp.
(anx
iety)
(meanage35
yrs)
Noinform
ationof
registratio
n/hand
lingof
missing
data
Test-retest
1week
Tau-kendallindex:
0.91
ICC
(mean):0.96
�3/7
Other
minor
metho
dologicalflaws
(norand
omorder,
inadequate
description
ondemog
raph
icdetails)
Moraeset
al.(2011)
5-partqu
estio
nnaire
n¼211
Asymp.
(ages17–24
yrs)
Info
onmissingdata
implicitprovided,
thehand
lingof
the
data
canbe
dedu
ced
Intra-grou
pagreem
ent
6mon
ths
Kappa
(mean)
Q1:
0.63
Q2:
0.70
Q3:
0.65
Q4:
0.57
Q5:
0.48
�2/5
Other
minor
metho
dologicalflaws
(asymp.,no
rand
omorder,inadequate
demog
raph
icdetails,no
blinding)
BHJM
I,BeightonandHoran
JointMob
ility
Index;
NS,
notstated;approx.,approximately,Intra,intrarater;Inter,interrater;yrs,years;asym
p.,asym
ptom
atic;symp.,sym
ptom
atic;
Q,q
uestion;
cont.,continuo
us;cat.,
category;curr,currently;h
ist,h
istorically;ICC,intraclasscorrelation;
sign.,sig
nificant;rho,
rank
correlationco-efficient.
124 AMERICAN JOURNAL OF MEDICAL GENETICS PART C (SEMINARS IN MEDICAL GENETICS) RESEARCH REVIEW
practice, provided that uniformity oftesting procedures is included in testingprocedures, in addition to historicalinformation, especially in adults. How-ever, there are shortcomings on studiesfor the validity of BS, while the threeother test assessment methods lackinformation on both reliability andvalidity. For the questionnaire assess-ment methods, 5PQ was the mostfrequently used, however, only in adultpopulation studies, and the reliabilityshowed conflicting evidence. Concern-ing the validity there were shortcomingson studies for 5PQ, while for BS-self thevalidity showed unknown evidence incomparison with BS. More studies areneeded to conclude on the measure-ment properties for BS-self.
Inter-rater reliability studies on testassessment methods were most fre-quently reported on BS, with themajority showing limited positive toconflicting evidence, and thus, may beacceptable for this assessment method.This may provide useful information forclinicians and researchers, in order toestablish uniformity in carrying out theprocedures. On the other hand, it alsoshows the need for more future com-prehensive studies of this test assessmentmethod, since unclear/vague or differ-ent descriptions of the procedures forperforming the Beighton tests were used(e.g., thumbs apposition with straight orflexed elbow, knee extension in standingor supine lying). The procedures ini-tially illustrated by photos for perform-ing the Beighton tests [Beighton et al.,1973] are recommended for futureclinical use, as described in detail inthe appendix of one of the reliabilitystudies [Juul-Kristensen et al., 2007], asthey have satisfactory reliability.
Some of the studies did not includeall nine tests as recommended, whichespecially is important when definingcut-points for classifying GJH, as dis-cussed further below. Other test assess-ment methods, such as CW, RQ, andHdM showed unknown evidence onreliability in one single study [Bulbenaet al., 1992], which is too limited toconclude on.
For the questionnaire assessmentmethods, most of the studies were
TABLEII.Inform
ationFrom
theIncluded
StudiesforCOSMIN
Sco
ringin
theValidityDomain
Ref/assessm
ent
metho
dYear
Stud
ysample/po
pulatio
nRegistratio
n/hand
ling
ofmissingdata
Con
struct
validity
hypo
theses
form
ulated/directio
n/criterion
validity
(adequ
ate“gold
standard”)
Mainresults
Cut-off
points
Metho
dologicalflaws
(explicitlystated
orlack
ofinform
ation)
BSandothertests/different
BSprocedures
Bulbena
etal.
•BS
•CW
(5items)
•RQ
(11items)
1992
n¼114
(Sym
p.)
n¼59
(con
trols)
Info
onmissingdata
provided,the
hand
lingof
data
can
bededu
ced
Agreementbetw
eenBS,
CW,
andRQ
criteriato
defin
ethe
HMS
Multip
lehypo
theses
form
ulated
apriori
Pearsoncorrelation
BSvs.RQ:0.89
BSvs.CW:0.97
CW
vs.RQ:0.86%
agreem
ent
CW
�3:96–98%
BS�5
:96–98%
RQ
�5:95–96%
Different
cutpoints
aretested
Other
impo
rtant/minor
meth.
flaws(sub
ject
eligibility
criteria
inadeq.described/
lackingforthecontrol
grou
p,1trialper
measure.,no
rand
omorder,no
training
phase,
inadequate
demog
raph
icdetails,
noblinding)
Ferrariet
al.
•BS
•Low
erLimb
Assessm
ent
Score(LLA
S)(12items)
2005
1)n¼21
(hyp.)
2)n¼88
(possib
lehyp.)
3)n¼116
(con
trols)
(range:5-16
yrs)
Noinfo
onmissing
data/handling
provided
Agreementbetw
eenBS
andLL
AS
Hypothesesvague/no
tform
ulated,bu
tpo
ssible
todedu
cewhatwas
expected
%agreem
ent
BSvs.L
LAS69%
(1),80%
(3)
Pearsoncorrelation0.43
(1),
0.79
(2),0.79
(3)
�5/9
(BS)
�7/12
(LLA
S)
Other
minor
metho
dologicalflaws
(1trialpermeasure.
session,
norand
omorder,no
training
phase,
inadequate
demog
raph
icdetails,
noblinding)
Jungeet
al.
•BSmetho
dA
•BSmetho
dB
2013
n¼103
Asymp.
(schoo
lchildren,
aged
7–8and
10–12
yrs)
Info
onmissingdata
provided,the
hand
lingof
thedata
canbe
dedu
ced
Agreementbetw
eenpreval.of
GJH
ofMetho
dsA
andB
Hypothesesvague/no
tform
ulated,bu
tpo
ssible
todedu
cewhatwas
expected
McN
emar
sign
prob
ability
test
Prevalence
onMetho
dA
andBNodifference
(P¼0.54)
�5/9
(BS
AþB)
Noothermeth.
flaws
BSandROM
Erkulaet
al.
•BS
2005
n¼1273
Noinfo
onmissing
data/handling
Agreementbetw
eenjointlaxity
andtrun
krotatio
nChi2-test
BSvs.scap.
�7/9
(BS)
Other
impo
rtant/minor
meth.
flaws co
ntinued
RESEARCH REVIEW AMERICAN JOURNAL OF MEDICAL GENETICS PART C (SEMINARS IN MEDICAL GENETICS) 125
TABLEII.(Continued)
Ref/assessm
ent
metho
dYear
Stud
ysample/po
pulatio
nRegistratio
n/hand
ling
ofmissingdata
Con
struct
validity
hypo
theses
form
ulated/directio
n/criterion
validity
(adequ
ate“gold
standard”)
Mainresults
Cut-off
points
Metho
dologicalflaws
(explicitlystated
orlack
ofinform
ation)
•Trunk
rotatio
n/asym
metry
Asymp.
(meanage10.4,
range8–15
yrs)
provided
Minim
alhypo
theses
form
ulated
apriori
asym
metry:
(P¼0.008)
BSvs.left
scap.elevation:
(P¼0.028)
(inadequately
describedsubject
eligibility
criteria,
only
1trialper
measure.session,
norand
omorder,no
training
phase,
inadequate
demograph
icdetails,
noblinding)
Naalet
al.
•BS
•Fem
oroacetabu
lar
impingem
ent
(FAI)
2014
n¼55
Symp.
(meanage28.5
yrs,SD
4.1yrs)
Noinfo
onmissing
data/handling
provided
Agreementbetw
eenBSand
Hip
ROM
Hypothesesvague/no
tform
ulated,bu
tpo
ssible
todedu
cewhatwas
expected
Spearm
anscorrelation
BSvs.
Hipflex:
0.61
Int.rot:0.56
Ext.rot:0.44
(allwith
P<0.01)
�4/9
(BS)
�6/9
(BS)
Other
minor
metho
dologicalflaws
(only1trialper
measurementsession,
norand
omorder,no
training
phase,
noblinding)
Pearsallet
al.
•BS
•Knee
arthrometer
(KT-2000)•A
nkle
arthrometer
2006
n¼57
Asymp.
(meanage20.9
yrs,SD
1.45
yrs)
Noinfo
onmissing
data/handling
provided
Agreementbetw
eenBSand
knee
andanklejointspecific
laxity
Hypothesesvague/no
tform
ulated,bu
tpo
ssible
todedu
cewhatwas
expected
Spearm
anscorrelation
BSvs.
Kneelax:0.37
(P¼0.110)
A/P
anklelax:
0.21
(P¼0.152)
Int-Ext
rot
anklelax:
0.24
(P¼0.101)
NS
Other
minor
meth.
flaws
(norand
omorder,no
training
phase,
noblinding)
Sauerset
al.
•BS(4/5
items,
forw
ardflexion
notinclud
ed)
•Sho
ulder
2001
n¼51/102
Asymp.
(health
yshou
lders)
Noinfo
onmissing
data/handling
provided
Agreementbetw
eenBSand
glenoh
umeraljointlaxity
Hypothesesvague/no
tform
ulated,bu
tpo
ssible
todedu
cewhatwas
expected
Pearsons
correlation
Bsvs.instr.
APlaxscore:
(0.23,
NS)
Clin
passROM:
NS
(BS0–8)
assumed)
Other
minor
metho
dologicalflaws
(only1trialper
measure.session,
norand
omorder,no
continued
126 AMERICAN JOURNAL OF MEDICAL GENETICS PART C (SEMINARS IN MEDICAL GENETICS) RESEARCH REVIEW
TABLEII.(Continued)
Ref/assessm
ent
metho
dYear
Stud
ysample/po
pulatio
nRegistratio
n/hand
ling
ofmissingdata
Con
struct
validity
hypo
theses
form
ulated/directio
n/criterion
validity
(adequ
ate“gold
standard”)
Mainresults
Cut-off
points
Metho
dologicalflaws
(explicitlystated
orlack
ofinform
ation)
arthrometer
•Clin
icalpassive
ROM
(meanage22
yrs,
SD2.8yrs)
(0.01–0.48,NS)
training
phase,
inadequate
demograph
icdetails,
noblinding)
Smits-Engelsm
anet
al.
•BS
•Gon
iometer
2011
n¼551
Asymp.
(meanage8yrs,
range6–12
yrs)
Noinfo
onmissing
data
provided,bu
tthehand
lingof
data
canbe
dedu
ced
Agreementbetw
eenBS
and16
ROM
Hypothesesvague/no
tform
ulated,bu
tpo
ssible
todedu
cewhatwas
expected
Variance
analysis
Sign.diffin
mean
ROM
betw
een
the3BSgrou
ps(P
<0.001,
except
knee
flexP¼0.02;
hipextP¼0.06)
BS: 0–4(not
hyp)
5–6(in
cr.
hyp)
7–9(hyp)
Other
minor
metho
dologicalflaws
(only1trialper
measure.session,
norand
omorder,no
blinding)
BSandpain
El-Metwally
etal.
•BS
•Musculoskeletal
pain
2004
n¼430
Asymp.
(meanage9.8
and11.8
yrs)
Info
onmissingdata,
andhand
lingof
data
provided
Predictivebaselin
efactorsfor
persistence/recurrenceof
MSK
pain,from
childho
odtilladolescence
Hypothesesv
ague
anddirection
notform
ulated,bu
tpo
ssible
todedu
cewhatw
asexpected
Pain
recurrence
4-yr
follow-up
(GLM
odelsþ
Risk
Ratio):
(RR¼1.35
[1.08–1.68])
�6/9
Other
minor
metho
dologicalflaws
(only1trialper
measure,no
rand
omorder,no
training
phase,
inadequate
demograph
icdetails)
El-Metwally
etal.
•BS
•Low
erlim
bpain
(LLP
)
2005
n¼1284
Asymp.
(meanage9.8
and11.8
yrs)
Noinfo
onmissing
data
provided,bu
tthehand
lingof
data
canbe
dedu
ced
Predictivebaselin
efactorsfor
persistence/recurrenceof
LLP,from
childho
odtilladolescence
Hypothesesv
ague
anddirection
notform
ulated,bu
tpo
ssible
todedu
cewhatw
asexpected
LLPrecurrence
4-yr
follow-up
(GLM
odelsþ
OddsR
atio)
(OR¼2.93
[1.13–7.70])
�6/9
Other
minor
metho
dologicalflaws
(only1trialper
measure,no
rand
omorder,inadequate
demograph
icdetails)
El-Metwally
etal.
•BS
•Musculoskeletal
pain
2007
n¼1113
Asymp.
(meanage9.8
and11.8
yrs)
Info
onmissingdata
provided,the
hand
lingof
data
can
bededu
ced
Predictivebaselin
efactorsfor
incidenceofmusculoskeletal
pain,from
childho
odtill
adolescence
Hypothesesv
ague
anddirection
Incidence4-yr
follow-up
(GLM
odelsþ
OddsR
atio)
BSvs.Non
-traum
.pain
(OR:0.83[0.44–1.56])
BSvs.Traum
atic
pain
�4/9
�6/9
Other
minor
metho
dologicalflaws
(only1trialper
measure,no
rand
omorder,inadequate
continued
RESEARCH REVIEW AMERICAN JOURNAL OF MEDICAL GENETICS PART C (SEMINARS IN MEDICAL GENETICS) 127
TABLEII.(Continued)
Ref/assessm
ent
metho
dYear
Stud
ysample/po
pulatio
nRegistratio
n/hand
ling
ofmissingdata
Con
struct
validity
hypo
theses
form
ulated/directio
n/criterion
validity
(adequ
ate“gold
standard”)
Mainresults
Cut-off
points
Metho
dologicalflaws
(explicitlystated
orlack
ofinform
ation)
notform
ulated,bu
tpo
ssible
todedu
cewhatw
asexpected
(OR:0.70
[0.16–3.04])NS
demog
raph
icdetails)
Sohrbeck-N
øhret
al.
•BS
•Pain(arthralgia)
2014
n¼301
Asymp.
(median
age14
yrs,range
13-15yrs)
Info
onmissingdata
provided,described
hand
ling
Predictivebaselin
efactorsfor
incidenceof
arthralgia,from
childho
odtilladolescence
Hypothesesv
ague
anddirection
notform
ulated,bu
tpo
ssible
todedu
cewhatw
asexpected
Arthralgiaincidence4and
6-yr
follow-up(Logistic
RegressMod
elsþ
Odds
Ratio)
BS�5
/9(O
R:3.00,[0.94–9.60])
�4/9
�5/9
�6/9
Other
minor
metho
dologicalflaws
(norand
omorder)
Tobias
etal.
•BS
•Musculoskeletal
pain
2013
n¼2901
Asymp.
(meanage
13.8
yrs)
Info
onmissingdata
provided,the
hand
lingof
data
can
bededu
ced
Predictivebaselin
efactorsfor
incidenceof
arthralgia,from
childho
odtilladolescence
Hypothesesv
ague
anddirection
notform
ulated,bu
tpo
ssible
todedu
cewhatw
asexpected
Pain
association4-yr
follow-up(Logistic
Regress
Mod
elsþ
OddsRatio)
Shou
lder
(OR1.68
[1.04–2.72])Knee
(OR
1.83
[1.10–3.02])
Ank
le/foo
t(O
R1.82
[1.05 –3.16])
�6/9
Other
minor
metho
dological
flaws(on
ly1trialper
measure,no
rand
omorder,no
blinding)
BSandinjuries
Cam
eron
etal.
•BS
•Gleno
humeral
instability
(GI,
historic
info)
2010
n¼714(soldiers,
meanage18.8
yrs,SD
1yr)
Info
onmissingdata
provided,described
hand
ling
Relationshipbetw
eenBSand
GIHypothesesvagueand
directionno
tform
ulated,b
utpo
ssible
todedu
cewhatwas
expected
GIassociation(Log
istic
RegressMod
elsþ
Odds
Ratio)
(OR
2.48
[1.19–5.20])
�2/9
Other
minor
metho
dologicalflaws:
(1trialpermeasure,
norand
omorder)
Chahalet
al.
•HdM
tests
(10items)þext.
rot>85°
(GLL
þER)
•Primaryanterior
shou
lder
dislo
catio
n(ASD
)
2010
n¼57(cases)/
92(con
trols)
(meanage24
yrs,
SD:0.32)
Noinfo
onmissing
data
provided,the
hand
lingof
data
can
bededu
ced
Predictiverisk
factor
ofGLL
þER
forAID
Minim
alnu
mberof
hypo
theses
form
ulated,expected
directionof
correlationstated
AID
association
(OddsRatio)
All(O
R:2.79
[1.27–6.09])
GLL
(OR:3.6
[1.49–8.68])
GLL
þER
Men:(O
R:7.43
[2.13–25.57])
Men:
�4/10
Wom
en:
�5/10
Other
minor
metho
dologicalflaws
(1trialpermeasure,
norand
omorder,
inadequate
demog
raph
icdetails,
noblinding) continued
128 AMERICAN JOURNAL OF MEDICAL GENETICS PART C (SEMINARS IN MEDICAL GENETICS) RESEARCH REVIEW
TABLEII.(Continued)
Ref/assessm
ent
metho
dYear
Stud
ysample/po
pulatio
nRegistratio
n/hand
ling
ofmissingdata
Con
struct
validity
hypo
theses
form
ulated/directio
n/criterion
validity
(adequ
ate“gold
standard”)
Mainresults
Cut-off
points
Metho
dologicalflaws
(explicitlystated
orlack
ofinform
ation)
GLL
(OR:6.75
[1.92–23.36])
Jungeet
al.
•BS;
Knee
hyperm
obility
(KH)
•Kneeinjuries
bySM
S-track(KI)
2015
n¼999
(schoo
lchildren,
9–14
yrs)
Info
onmissingdata
provided,described
hand
ling
Predictiverisk
factor
ofBSþKH
forKIMinim
alnu
mberof
hypo
theses
form
ulated,expected
directionof
correlationstated
KIassociation
(Logistic
Regress
mod
elsþ
OddsRatio)
Traum
atic
Inj(O
R:1.56
[0.43–5.61])BS
Traum
aticInj(O
R:2.22
[0.60–8.19])BSþKH
�5/9
Other
minor
metho
dologicalflaws
(1trialpermeasure,
norand
omorder,no
training
phase)
Rou
ssel
etal.
•BSþgoniom
eter
•Pelvicinjuries
(PI),
low
back
pain
(LBP)
2009
n¼32
(dancers)
n¼26
(students)
(mean20
yrs,
SD:2yrs)
Info
onmissingdata
provided,described
hand
ling
Predictiverisk
factor
ofBSfor
PIandLB
PHypothesesv
ague
anddirection
notform
ulated,bu
tpo
ssible
todedu
cewhatw
asexpected
PI/L
BP(Spearman
correlation:
(Rho
¼�0.03;
P¼0.89)PI,LB
P
BS: 0–3(tight)
4–6(hyp)
7–9(extr.
hyp)
Other
impo
rtant/minor
meth.
flaws(inadequately
describedsubject
eligibility
criteria,1
trialpermeasure,no
rand
omorder,no
training
phase,
inadequate
demog
raph
icdetails,
noblinding)
BSanddiseases
Hirschet
al.
•BS
•Tem
poro-
mandibu
lar
sympt
(TMs)
2008
n¼895
(meanage
40.6
yrs,SD
:11.6
yrs)
Noinfo
onmissing
data
provided,bu
tthehand
lingof
data
canbe
dedu
ced
Agreementbetw
eenBS
andTMs
Hypothesesvagueand
directionno
tform
ulated,b
utpo
ssible
todedu
cewhatwas
expected
Associatio
n(M
ultip
lelog.
Regress.
Mod
elsþ
OddsRatio)
Jointclick,
redu
ced
ROM
(OR:1.56
[1.01–2.39])
Category
0(BS¼0)
1(BS¼
1–3)
2(BS¼4–9)
Other
minor
metho
dologicalflaws
(only1trialper
measure.session,
norand
omorder,no
training
phase,
inadequate
continued
RESEARCH REVIEW AMERICAN JOURNAL OF MEDICAL GENETICS PART C (SEMINARS IN MEDICAL GENETICS) 129
TABLEII.(Continued)
Ref/assessm
ent
metho
dYear
Stud
ysample/po
pulatio
nRegistratio
n/hand
ling
ofmissingdata
Con
struct
validity
hypo
theses
form
ulated/directio
n/criterion
validity
(adequ
ate“gold
standard”)
Mainresults
Cut-off
points
Metho
dologicalflaws
(explicitlystated
orlack
ofinform
ation)
demograph
icdetails,
noblinding)
Nijs
etal.
•BS
•Chron
icfatig
uesynd
rome(C
FS)
2004
n¼44
Symp.
(CFS
)(m
eanage40.2
yrs,SD
:9.11)
Info
onmissingdata
provided,bu
tthe
hand
lingof
data
can
bededu
ced
Associatio
nbetween
widespreadpain
inpatients
with
CFS
andBS
Multip
lehypo
theses
form
ulated
apriori,d
irectio
nandbu
tmagnitude
notstated
Associatio
nbetw
eenBSand
CFS CFS
:63.6%
Non
-CFS
:36.5%
(Fischersexacttest:
P¼0.73)
CFS
-APQ
1:P¼0.75
CFS
-APQ
2:P¼0.69
(NS)
�4/9
Other
minor
metho
dologicalflaws
(only1trialper
measure,no
rand
omorder,no
training
phase,
inadequate
demograph
icdetails,
noblinding)
Terziet
al.
•BS
•Adh
esive
capsulitis(AC)
oftheshou
lder
•Sub
acromial
impingem
ent
synd
rome(SIS)
2013
n¼240
(120
with
AC,
120controls/SIS)
(meanage
53.9–54.08yrs,
SD:8
.21–10.68)
Noinfo
onmissing
data
provided,bu
tthehand
lingof
data
canbe
dedu
ced
Difference
inGJH
(BS)
inAC
vs.SIS
Minim
alnu
mberof
hypo
theses
form
ulated,expected
direction,
butno
tmagnitude
ofcorrelationstated
Chi-squ
are
GJH
inAC
vs.SIS:
(0.8%
vs.7.5%
,P¼0.010)
NS
Other
minor
metho
dologicalflaws
(only1trialper
measure,no
rand
omorder,no
training
phase,
noblinding)
Questionn
aires(5-part)andothertests
Bulbena
etal.
•5PQ
(þ2
questio
ns)
•HdM
(10items)
2014
n¼191pt’s
(potential
anxiety,mean
age35.5
yrs,
Noinfo
onmissing
data/handling
provided
Agreementbetw
een5P
Qþ2Q
andHDM
Hypothesesvagueanddirection
notform
ulated,bu
tpo
ssible
todedu
cewhatw
asexpected
Spearm
anscorrelation
5PQþ2
QandHdM
:(rho
¼0.75;P<0.001)
5PQþ2Q
and5P
Q:
(rho
¼0.86;P<0.001)
�3/7
Other
impo
rtant/minor
meth.
flaws(in
adeq.
describedsubject
eligibility
criteria,1
trialpermeasure,no
continued
130 AMERICAN JOURNAL OF MEDICAL GENETICS PART C (SEMINARS IN MEDICAL GENETICS) RESEARCH REVIEW
TABLEII.(Continued)
Ref/assessm
ent
metho
dYear
Stud
ysample/po
pulatio
nRegistratio
n/hand
ling
ofmissingdata
Con
struct
validity
hypo
theses
form
ulated/directio
n/criterion
validity
(adequ
ate“gold
standard”)
Mainresults
Cut-off
points
Metho
dologicalflaws
(explicitlystated
orlack
ofinform
ation)
SD:10.18yrs)
rand
omorder,no
training
phase,
inadeq.
demog
raph
icdetails,
noblinding)
Moraeset
al.
•5PQ
(transl.)
•BS
2011
n¼394
(university
stud
ents,age
17–24
yrs)
Info
onmissingdata
provided,the
hand
lingcanbe
dedu
ced
Agreementbetw
een5P
QandBS
Hypothesesv
ague
anddirection
notform
ulated,bu
tpo
ssible
todedu
cewhatw
asexpected
Assum
able
that
thecriterion
used
(BS)
canbe
considered
areason
able
“gold
standard”
%agreem
ent73.6%
Sensitivity
70.9
(62.1–78.6)
Specificity
77.4
(71.4–82.6)
�2/5
(5PQ
)
�4/9
(BS)
Other
minor
metho
dologicalflaws
(norand
omorder,
inadequate
demog
raph
icdetails,
noblinding)
5-partQ
andpain
Mulveyet
al.
•5-part
questio
nnaire
•Chron
icwidespreadpain
(CWP)
2013
n¼12,354
(pop
ulation,
medianage55
yrs,range
25–107yrs)
Info
onmissingdata
provided,
registratio
n/hand
lingdescribed
Associatio
nbetw
eenJH
and
musculoskeletalpain
Minim
alnu
mberof
hypo
theses
form
ulated,expected
directionof
correlationstated
Relativerisk
ratio
JHandCWP
gradeI:
(RRR
1.2;
P¼0.03
–mod
estass.)
gradeII:
(RRR
1.2,
P¼0.1–mod
estass.,
notstat.sig
n.)
gradeIII/IV:
(RRR
1.4;
P¼<0.001)
�2/5
Other
minor
metho
dologicalflaws
(1trialpermeasure.,
norand
omorder,no
training
phase,
inadequate
demog
raph
icdetails,
noblinding)
5-partQ
anddiseases
Hakim
andGrahame
•5PQ
2003
n¼269
(212
BJH
S,57
controls)
n¼220
(170
BJH
S,50
controls)
(range
Noinfo
onregistratio
n/hand
lingof
missing
data
Agreementbetw
een5P
QandBJH
SHypothesesvagueanddirection
notform
ulated,bu
tpo
ssible
todedu
cewhat
Sens.(mean)
77-85%
(1stand
2ndcoho
rt)Spec.80–89%
(1stand2n
dcoho
rt)
�2/5
Other
minor
metho
dologicalflaws
(1trialpermeasure.,
norand
omorder,no
training
phase, continued
RESEARCH REVIEW AMERICAN JOURNAL OF MEDICAL GENETICS PART C (SEMINARS IN MEDICAL GENETICS) 131
TABLEII.(Continued)
Ref/assessm
ent
metho
dYear
Stud
ysample/po
pulatio
nRegistratio
n/hand
ling
ofmissingdata
Con
struct
validity
hypo
theses
form
ulated/directio
n/criterion
validity
(adequ
ate“gold
standard”)
Mainresults
Cut-off
points
Metho
dologicalflaws
(explicitlystated
orlack
ofinform
ation)
16–80
yrs)
was
expected
Assum
able
that
thecriterion
used
(BS)
canbe
considered
areason
able
“gold
standard”
inadequate
demograph
icdetails,
noblinding)
Sancheset
al.
•5PQ
•Anx
iety
2014
n¼2300
(university
stud
ents,mean
age21
yrs,
SD:3.25)
Info
onmissingdata
provided,
registratio
n/hand
lingcanbe
dedu
ced
Associatio
nbetw
eenBeck
Anx
iety
Inventory(BAI)
and5P
QHypothesesvagueanddirection
notform
ulated,bu
tpo
ssible
todedu
cewhatw
asexpected
Pearsons
correlation
5PQ
andBAI
totalscore
(Wom
en:r¼
0.11,
P¼0.007,
weakass.)
(Men:r¼
0.04,
(P¼0.450),
notsig
n.ass.)
�2/5
Other
minor
metho
dologicalflaws
(1trialper
measurement,no
rand
omorder,no
training
phase,
inadequate
demograph
icdetails,
noblinding)
BSself-repo
rted
Naalet
al.
•BS
•BS-self(from
paperdraw
ings)
•Fem
oroacetabu
lar
impingem
ent
(FAI)
2014
n¼55
Symp.
(diagnosed
with
FAI)
(meanage
28.5
yrs,
SD:4.1yrs)
Info
onmissingdata
provided,
registratio
n/hand
lingcanbe
dedu
ced
Agreementbetw
eenBS-self
andBS/agreem
entbetw
een
BSandHip
ROM
Hypothesesvagueanddirection
notform
ulated,bu
tpo
ssible
todedu
cewhatw
asexpected
Kappa-values(L
w)
Total0.82
(0.72–0.91)
Single
testsrange:
0.61–0.96
Spearm
anscorrelationBSvs.
Hipflex:
0.61
Int.rot:0.56Ext.rot:0.44
(allwith
P<0.01)
�4/9
(BS)
�6/9
(BS)
Other
minor
metho
dologicalflaws
(1trialpermeasure,
norand
omorder,no
training
phase,
noblinding)
BS,BeightonScale;CW,C
arter&Wilkinson;
HDM,H
ospitaldelMar;R
Q,R
ot�es-Q
u�erol;LL
AS,Lo
wer
LimbAssessm
entS
core;5PQ
,5-partq
uestionn
aire;m
easure,m
easurement;
meth,
metho
dological;hyp,hyperm
obile;G
JH,generalized
jointh
ypermob
ility;H
MS,hyperm
obility
synd
rome;GLL
,generalized
ligam
entous
laxity;JH,joint
hyperm
obility;B
JHS,
benign
jointhypermob
ilitysynd
rome;MSK
,musculoskeletal;scap,scapular;lax,laxity
;ROM,range
ofmotion;clin,clin
ical;pass,passive;incr,increased;diff,difference;sign,sig
nificant;
ass,association;traum,traum
atic;transl.,translatio
n;inadeq,inadequ
ate;ER,externalrotation;ASD
,Primaryanterior
shou
lderdislo
catio
n;KH,kneehyperm
obility;K
I,kn
eeinjuries;P
I,pelvic
injuries;LB
P,low
back
pain;TMs,Tempo
romandibu
larsymptom
s;CFS,chronicfatig
uesynd
rome;
AC,adhesiv
ecapsulitis;SIS,
subacrom
ialim
pingem
entsynd
rome;
CWP,
chronicwidespreadpain;BAI,BeckAnx
iety
Inventory;
FAI,femoroacetabu
larim
pingem
ent;A/P,anterior/posterior;SD
,standard
deviation;
OR,od
dsratio
;RR,risk
ratio
;sens,
sensitivity;spec,specificity;NS,
non-sig
nificant;GLm
odel,g
enerallin
earregressio
nmod
el.
132 AMERICAN JOURNAL OF MEDICAL GENETICS PART C (SEMINARS IN MEDICAL GENETICS) RESEARCH REVIEW
TABLEIII.
COSMIN
Sco
ringoftheMethodologicalQualityofEachStudyonMeasuremen
tProperties
(Reliabilityan
dValidity)
Reliability
Validity
Assessm
ent
metho
ds/
reference
Stud
y(n)/
age
Design/result
Cut-offscore
COSM
INscore
Reference
Stud
y(n)/age
Design/result
Cut-off
score
COSM
INscore
Clin
icaltests
Carter&
Wilkinson(0–5)
(4bilateralBStestsþ
1ankletest)
Bulbena
etal.
(1992)
n¼30
(meanage
41yrs.
[sym
p.],
48yrs.
[con
trol])
Kappa:
substantial/
almostperfect
–Po
or (onlyon
emeasurement,
lackingof
info
abou
tsubject
eligibility
criteria)
Hyp.test.
(CW
vs.RQ/B
S)Pearsoncorr.
High(C
Wvs.RQ)
Very
high
(CW
vs.BS)
%agreem
ent
CW
�3:96–98%
–Po
or��
�
(other
imp.meth.
flaws:
subjecteligibility
criteriainadeq.
described/lackingfor
thecontrolgrou
p)
Beighton(0–9)
Erkulaet
al.
(2005)
n¼50
(meanage
10yrs)
Spearm
ansrho:
Mod
erate
(intra)
High(in
ter)
�7/9
(classif.
assig
nificant
joint
laxity)
Poor
�
(onlyon
emeasurement)
Hyp.test.
(BSvs.ROM)
Chi
2 -test:(P
¼0.008)
BSvs.scap.
asym
metry
(P¼0.028)
BSvs
leftscap.
elevation
�7/9
Poor (noinfo
onthe
measurementprop.of
thecomparatorinstr.,
otherim
portantmeth.
flaws-lackingof
info
abou
tsubjecteligibility
criteria)
Hirschet
al.
(2007)
n¼50
(meanage
38yrs,
range
20–60)
ICC:Excellent
(intra/inter)
–Po
or�
(onlyon
emeasurement)
Hirschet
al.
(2008)
n¼895
(meanage
40.6
yrs,
SD:11.6
yrs)
Hyp.test
(BSvs.TMDs)
OddsRatio:
Jointclick
(OR:1.56
(1.01–2.39)
Com
posite
scores
categoris.:
(0:cat.1)
(1–3:
cat.2)
(4–9:
cat.3)Fa
ir
Jungeet
al.
n¼39
Kappa:
�5/9
Poor
�Jungeet
al.
n¼109
Hyp.test
�5/9
Fair
continued
RESEARCH REVIEW AMERICAN JOURNAL OF MEDICAL GENETICS PART C (SEMINARS IN MEDICAL GENETICS) 133
TABLEIII.
(Continued)
Reliability
Validity
Assessm
ent
metho
ds/
reference
Stud
y(n)/
age
Design/result
Cut-offscore
COSM
INscore
Reference
Stud
y(n)/age
Design/result
Cut-off
score
COSM
INscore
(2013)
(2different
BS
metho
ds)
(aged
7–8
and
10–12
yrs)
mod
erate/
substantial,
substantial/
almost
perfect
(metho
dA)
Fair/m
oderate,
substantial/
almostperfect
(metho
dB)
�5/9:sub
stantial
(A),mod
erate
(B)
(onlyon
emeasurement)
(2013)
Jungeet
al.
(2015)
n¼999
(range
9–14
yrs)
(BSvs.BSAþB)
McN
emar
sign
prob
ability
test
Prevalence
onMetho
dA
andB:
Nodifference
(P¼0.54)
Hyp.test.
(BSvs.injuries)
OddsRatio:
Traum
atic
inj
(OR:1.56
[0.43–5.61])
�5/9
Poor
��
(inadequate
info
onthe
measurement
prop
ertiesof
the
comparator
instrument)
Juul- Kristensen
etal.(2007)
n¼40
(meanage
34yrs
[sym
p.],
46yrs
[asymp.])
Kappa:
substantial
(curr),
substantial
(curr/hist)
ICC:Excellent
(curr/hist)
�5/9
Poor
�
(onlyon
emeasurement)
Sohrbeck-
Nøh
retal.
(2014)
n¼301
(meanage
14yrs,range
13–15
yrs)
Hyp.test.
(BSandarthralgia):
OddsRatio:
BS�5
/9(O
R:3.00,
[0.94–9.60])
–Fair
Mikkelsson
etal.(1996)
n¼29 (inter)
n¼13
(intra)
(mean
Kappa:
Substantial
(intra/inter)
�6/9
Poor
(only1
measure.,
smallsample
size,
time
intervalno
tapprop
riate)
El-Metwally
etal.(2004)
n¼430
(meanage
9.8and
11.8
yrs)
Hyp.test.
(BSandpain)
Risk
Ratio:
(RR
¼1.35
[1.08–1.68])
�6/9
Fair
continued
134 AMERICAN JOURNAL OF MEDICAL GENETICS PART C (SEMINARS IN MEDICAL GENETICS) RESEARCH REVIEW
TABLEIII.
(Continued)
Reliability
Validity
Assessm
ent
metho
ds/
reference
Stud
y(n)/
age
Design/result
Cut-offscore
COSM
INscore
Reference
Stud
y(n)/age
Design/result
Cut-off
score
COSM
INscore
ages
9and11
yrs)
El-Metwally
etal.(2005)
El-Metwally
etal.(2007)
n¼1,284
(meanage
9.8and
11.8
yrs)
n¼1,113
(meanage
14yrs,
range
13–15
yrs)
OddsRatio:
(OR
¼2.93
[1.13–7.70])
OddsRatio:
BSvs.no
n-traum.
pain
(OR:0.83
[0.44–1.56])NS
Traum
atic
pain
(OR:
0.70
[0.16–3.04])NS
�6/9
�6/9
Fair
Fair
Aslanet
al.
(2006)
(BHJM
I)
n¼72
(meanage
20yrs,
range
18–25)
ICC:Excellent
(intra/inter)
Com
posite
scores
categorized
(0–2:
cat.1)
(3–4:
cat.2)
(5–9:
cat.3)
Poor
(other
impo
rtant
meth.
flaws,
inadequate
statistics
applied)
––
––
Boyle
etal.
(2003)
(BHJM
I)
n¼36/42
(meanage
25yrs,
range
15–45)
Spearm
ansrho:
Excellent
(intra/inter)
Com
posite
scores
categorized
(0–2:
cat.1)
(3–4:
cat.2)
(5–9:
cat.3)
Poor
(tim
eintervalno
tapprop
riate,
inadequate
statistics
applied)
––
––
Bulbena
etal.
(1992)
n¼30
(meanage
41yrs.
[sym
p.],
48yrs.
Kappa:
Substantial/
almostperfect
–Po
or (onlyon
emeasurement,
lackingof
info
abou
tsubject
eligibility
Hyp.test.
(BSvs.CW/R
Q)
Pearsons
corr.
Very
high
(BSvs.CW)
High(BSvs.RQ)
–Po
or��
�
(other
imp.
meth.
flaws:subjecteligibility
criteriainadeq.
described/lackingfor
thecontrolgrou
p)continued
RESEARCH REVIEW AMERICAN JOURNAL OF MEDICAL GENETICS PART C (SEMINARS IN MEDICAL GENETICS) 135
TABLEIII.
(Continued)
Reliability
Validity
Assessm
ent
metho
ds/
reference
Stud
y(n)/
age
Design/result
Cut-offscore
COSM
INscore
Reference
Stud
y(n)/age
Design/result
Cut-off
score
COSM
INscore
[con
trol])
criteria)
Hansenet
al.
(2002)
(4/5
tests,no
5th
finger)
n¼100
(range
9–13
yrs)
Kappa:
Mod
erate/
substantial,
substantial/
almostperfect
(experts)Fair
(inexp.)
–Po
or (only1
measurem.,
testcond
ition
sno
tsim
ilar)
––
––
–
Hicks
etal.
(2003)
n¼63
(meanage
36yrs,
range
20–66)
ICC:Fair/goo
d,good
/excellent
–Po
or�
(onlyon
emeasurement)
––
––
–
Karim
etal.
(2011)
n¼30
(meanage
24yrs,
range
18–32)
Kappa:
Mod
erate
–Po
or�
(onlyon
emeasurement)
––
––
–
––
––
–Cam
eron
etal.
(2010)
n¼714
(meanage
18.8
yrs,
SD:1yr)
Hyp.test.
(BSvs.injuries)
OddsRatio
GJI(O
R2.48
[1.19–
5.20])
(P¼0.16)
�2/9
(95th
percentile)
Fair
––
––
–Ferrariet
al.
(2005)
n¼21(1)/88
(2)/116
(3)
Hyp.test.
(BSvs.othertests)
%agreem
ent
BSvs.LL
AS:
�5/9
Fair
continued
136 AMERICAN JOURNAL OF MEDICAL GENETICS PART C (SEMINARS IN MEDICAL GENETICS) RESEARCH REVIEW
TABLEIII.
(Continued)
Reliability
Validity
Assessm
ent
metho
ds/
reference
Stud
y(n)/
age
Design/result
Cut-offscore
COSM
INscore
Reference
Stud
y(n)/age
Design/result
Cut-off
score
COSM
INscore
(range:
5–16
yrs)
69%
(1),80%
(3)
Pearsoncorrelation:
Low
(1),High
(2),High(3)
––
––
–Naalet
al.
(2014)
n¼55
(meanage
28.5
yrs,
SD 4.1yrs.)
Hyp.test.
(BSvs.Hip
ROM)
Spearm
anscorrelationBS
vs.Hipflex:
0.61
Int.
rot:0.56
Ext.rot:0.44
(allwith
P<0.01)
�4/9
(BS)
�6/9
(BS)
Poor
� (inadequate
info
onthemeasurementprop.
ofcomparatorinstr.)
––
––
–Nijs
etal.
(2004)
n¼44
(meanage
40.2
yrs,
SD:9.11)
Hyp.test.
(BSvs.CFS
)Associatio
nbetw
eenBS
andCFS
CFS
:63.6%
Non
-CFS
:36.5%
(Fischersexacttest:
P¼0.73)
CFS
-APQ
1:P¼0.75
CFS
-APQ
2:P¼0.69
(non
-sign)
�4/9
Fair
––
––
–Pearsallet
al.
(2006)
n¼57
(meanage
20.9
yrs,
SD 1.45
yrs)
Hyp.test
(BSvs.ROM)
Spearm
anscorrelation
BSvs.Kneelax:0.37
(P¼0.110)
AP
anklelax:
0.21,
P¼0.152IE
anklelax:
0.24,
P¼0.101
–Fair
continued
RESEARCH REVIEW AMERICAN JOURNAL OF MEDICAL GENETICS PART C (SEMINARS IN MEDICAL GENETICS) 137
TABLEIII.
(Continued)
Reliability
Validity
Assessm
ent
metho
ds/
reference
Stud
y(n)/
age
Design/result
Cut-offscore
COSM
INscore
Reference
Stud
y(n)/age
Design/result
Cut-off
score
COSM
INscore
––
––
–Rou
sseletal.
(2009)
n¼32/26
(mean20
yrs,SD
:2
yrs)
Hyp.test.
(BSvs.injuries)
PI/L
BP(Spearman
correlation:
(Rho
¼�0
.03;
P¼0.89)PI,LB
P
Com
posite
scores
subgr.:
(0–3:
gr.1,
tight)(4–6:
gr.2,h
yp.)
(7–9:gr.3,
extr.h
yp.)
Poor
(noinfo
onthe
measurementprop.of
comparatorinstr.,
otherim
portantmeth.
flaws-lackingof
info
abou
tsubjecteligibility
criteria)
––
––
–Sauerset
al.
(2001)
n¼51/102
(meanage
22yrs,SD
2.8yrs)
Hyp.test.
(BSvs.ROM)
Pearsons
correlation
BSvs.instrumented
APlaxscore:
(0.23,
Non
-S)
Clin
passROM:(0.01
–0.48,Non
-S)
–Fair
––
––
–Sm
its-
Engelsm
anet
al.
(2011)
n¼551
(meanage8
yrs,range
6–12
yrs)
Hyp.test.
(BSvs.ROM)
Variance
analysis:
Sign.diffin
mean
ROM
betw
eenthe
3BSgrou
ps(P
<0.001,
except
knee
flexP¼0.02;
hipextP¼0.06)
�7/9
Poor (nodescriptionof
the
constructsmeasuredby
thecomparatorinstr.,
noinfo
onthe
measurementprop.)
––
––
–Terziet
al.
(2013)
n¼240
Hyp.test.
(BSvs.AC)
–Po
or��
(noinfo
onthe
continued
138 AMERICAN JOURNAL OF MEDICAL GENETICS PART C (SEMINARS IN MEDICAL GENETICS) RESEARCH REVIEW
TABLEIII.
(Continued)
Reliability
Validity
Assessm
ent
metho
ds/
reference
Stud
y(n)/
age
Design/result
Cut-offscore
COSM
INscore
Reference
Stud
y(n)/age
Design/result
Cut-off
score
COSM
INscore
(meanage
53.9–
54.08yrs,
SD:8.21–
10.68yrs)
Chi-squ
areGJH
inAC
vs.SIS:
(0.8%
vs.7.5%
,P¼0.010)
measurementprop.of
thecomparatorinstr.)
––
–To
bias
etal.
(2013)
n¼2901
(meanage
13.8
yrs)
Hyp.test.(BSandpain)
OddsRatio:Sh
oulder
(OR
1.68
[1.04–
2.72])Knee(O
R1.83
[1.10–3.02])Ank
le/
foot
(OR
1.82
[1.05–
3.16])
�6/9
Poor
��
(noinfo
onthe
measurementprop.of
thecomparatorinstr.)
Rot�es-Q
u�erol(0–11)
Bulbena
etal.
(1992)
n¼30
(meanage
41yrs.
[sym
p.],
48yrs.
[con
trol])
Kappa:
Mod
erate/
substantial,
substantial/
almostperfect
–Po
or(onlyon
emeasurement,
lackingof
info
abou
tsubject
eligibility
criteria)
Hyp.test.
(RQ
vs.
CW/B
S)
PearsoncorrelationHigh
(RQ
vs.BS)
High
(RQ
vs.CW)
–Po
or��
�
(other
imp.
meth.flaws:
subjecteligibility
criteriainadeq.
described/lackingfor
thecontrolgrou
p)
Juul- Kristensen
etal.(2007)
(3tests)
n¼40
(meanage
34yrs
[sym
p.],
46yrs
[asymp.])
Kappa:Fair/
mod
erate,
mod
erate/
substantial
(curr)fair/
mod
erate,
substantial/
almostperfect
(curr/hist)
ICC:
Excellent
–Po
or�
(onlyon
emeasurement)
––
––
–
continued
RESEARCH REVIEW AMERICAN JOURNAL OF MEDICAL GENETICS PART C (SEMINARS IN MEDICAL GENETICS) 139
TABLEIII.
(Continued)
Reliability
Validity
Assessm
ent
metho
ds/
reference
Stud
y(n)/
age
Design/result
Cut-offscore
COSM
INscore
Reference
Stud
y(n)/age
Design/result
Cut-off
score
COSM
INscore
(curr/hist)
HospitaldelMar
(0–10)
Design
Bulbena
etal.
(1992)
n¼30
(meanage
41yrs.
[sym
p.],
48yrs.
[con
trol])
Kappa:
Substantial/
almostperfect
–Po
or(onlyon
emeasurement,
lackingof
info
abou
tsubject
eligibility
criteria)
––
––
–
Chahalet
al.
(2010)
n¼57/92
(Meanage
24yrs,
SD:0.32)
––
–Hyp.test.
(HdM
and
injuries)
OddsRatio:
AllGLL
�(O
R:2.79
[1.27–6.09])
GLL
þER�
(OR:3.6
[1.49–8.68])
Men G
LL�(O
R:7.43
[2.13–25.57])
GLL
þER�[O
R:
6.75
(1.92–23.36])
�4/10
(men)
�5/10
(wom
en)
Fair
Questionn
aires
5-partqu
estio
nnaire
(0–5)
Bulbena
etal.
(2014)
(þ2Q
)
n¼33
(meanage
35yrs)
ICC:Excellent
Tau-kendall
index:
Very
high
corr.
� 3/7
Poor
� (onlyon
emeasurement)
Hyp.test.
(5PQ
vs.
HdM
)
Spearm
anscorrelation
5PQþ2
QandHDM:
(rho
¼0.75;
P<0.001)
5PQþ2Q
and5P
Q:
(rho
¼0.86;
P<0.001)
�3/7
Poor
��
(noinfo
onthe
measurementprop.of
thecomparatorinstr.)
continued
140 AMERICAN JOURNAL OF MEDICAL GENETICS PART C (SEMINARS IN MEDICAL GENETICS) RESEARCH REVIEW
TABLEIII.
(Continued)
Reliability
Validity
Assessm
ent
metho
ds/
reference
Stud
y(n)/
age
Design/result
Cut-offscore
COSM
INscore
Reference
Stud
y(n)/age
Design/result
Cut-off
score
COSM
INscore
Moraeset
al.
(2011)
n¼211
(range
17–24
yrs)
Kappa:
Substantial
(Q1,
Q2,
Q3)
Mod
erate
(Q4,
Q5)
�2/5
Poor
�
(onlyon
emeasurement)
n¼394
Hyp.test.
(5PQ
vs.
BS)
Criterion
validity
%agreem
ent73.6%
Sens.(m
ean)
0.69
Spec.(m
ean)
0.75
�2/5
Fair (other
minor
meth.
flaws:inadequate
descriptionon
demog
raph
icdetails,
noblinding)
Fair (o
ther
minor)
––
––
Hakim
and
Grahame
(2003)
n¼212
(range
15–80
yrs)
Hyp.test.
(5PQ
vs.
BJH
S)
Criterion
validity
Sens.(m
ean)
77–85%
(1stand2n
dcoho
rt)
Spec. 80–89%
(1stand2n
dcoho
rt)
�2/5
Poor
��
(noinfo
onthe
measurementprop.of
thecomparatorinstr.)
Fair (o
ther
minor
meth.
flaws:inadequate
descriptionon
demog
raph
icdetails,
nodescriptionof
blinding)
––
––
Mulveyet
al.
(2013)
n¼2,354
(medianage
55yrs,
range
25–107yrs)
Hyp.test.
(5PQ
vs.
CWP)
Relativerisk
ratio
:JH
andCWP
gradeI:(R
RR
1.2;
P¼0.03
–mod
est
ass.)
gradeII:(R
RR
1.2,
P¼0.1–mod
est
ass.,
notstat.sig
n.)
gradeIII/IV:(R
RR
1.4;
P¼<0.001)
�2/5
Fair
continued
RESEARCH REVIEW AMERICAN JOURNAL OF MEDICAL GENETICS PART C (SEMINARS IN MEDICAL GENETICS) 141
TABLEIII.
(Continued)
Reliability
Validity
Assessm
ent
metho
ds/
reference
Stud
y(n)/
age
Design/result
Cut-offscore
COSM
INscore
Reference
Stud
y(n)/age
Design/result
Cut-off
score
COSM
INscore
––
––
Sancheset
al.
(2014)
n¼2300
(meanage21
yrs,SD
:3.25)
Hyp.test.
(5PQ
vs.
anxiety
disease)
Pearsons
correlation
5PQ
andBAItotal
score(W
omen:
r¼0.11,P¼0.007,
weakass)(M
en:
r¼0.04,P¼0.450,
notsig
nass)
�2/5
Fair
Beightonself-repo
rted
(0–9)
––
––
Naaletal.(2014)
n¼55
(meanage
28.5,SD
:4.1yrs)
Hyp.test.
(BSs-rvs
BS)
Kappa-values(L
w)To
tal:
0.82
(0.72–0.91)
Single
tests(range):
0.61–0.96
�4/9
Poor
��
(noinfo
onthe
measurementprop.of
thecomparatorinstr.)
BS,BeightonScore;CW,C
arter&Wilkinson;
HDM,H
ospitaldelMar;R
Q,R
ot�es-Q
u�erol;LL
AS,Lo
wer
LimbAssessm
entS
core;5PQ
,5-partq
uestionn
aire;m
easure,m
easurement;
meth,
metho
dological;prop,prop
erties;GJH
,generalized
jointhyperm
obility,HMS,
hyperm
obility
synd
rome;
GLL
,generalized
ligam
entous
laxity;JH
,jointhyperm
obility;BJH
S,benign
jointhypermob
ilitysynd
rome;hyp,hypo
thesis;
instr,instruments;scap,scapular;lax,laxity
;ROM,range
ofmotion;incr,increased;s-r,self-repo
rted;p,persistent;r,recurrent;inc,
incidence;classif,classified;cat,categorized;diff,difference;sign,sig
nificant;traum,traum
atic;G
JI,gleno
humeraljointinstability;ER,externalrotation;ASD
,Primaryanterior
shou
lder
142 AMERICAN JOURNAL OF MEDICAL GENETICS PART C (SEMINARS IN MEDICAL GENETICS) RESEARCH REVIEW
reported on 5PQ, which shows to be apromising assessment method for futurepopulation studies, where differentdomains of validity on GJH thus, canbe studied more carefully. The addi-tional questionnaire, BS-self, may alsoseem promising, as it contains illustra-tions of the test procedures for each ofthe BS tests. However, the questionnaireassessment methods need more evalua-tion before they can be used clinically,since very few studies have reportedmeasurement properties on their reli-ability and validity.
The current review covers a widerange of populations, children, andadults (for adults comprising 64%[7/11] on reliability, and 62% [16/26]on validity), men and women (threestudies on separately men or women),and different ethnic groups (mostlyCaucasian, few on American/Cana-dian/Brazilian). For children, only BShas been used, while for adults, differ-ent test and questionnaire assessmentmethods have been used, though, stillmostly BS, and 5PQ in populationstudies.
The current review demonstratescut-points varying for the differentclinical assessment methods. In theadult population, when using nine testsfor BS, mostly one cut-point was usedfor classifying GJH varying between 4and 6, but one study used 2/9[Cameron et al., 2010]. However,also two cut-points were used, with alower cut-point for “tight/not hyper-mobile” individuals varying between 1and 4, and an upper cut-point for“hypermobile/extremely hypermo-bile” individuals varying between 4and 7 [Boyle et al., 2003; Aslan et al.,2006; Hirsch et al., 2008; Rousselet al., 2009]. For the questionnaireassessment methods only one cut-pointwas used for classifying GJH, varyingbetween 2 and 3 and with a differenttotal score varying between 5 and 7[Hakim and Grahame, 2003; Bulbenaet al., 2014].
Generally, for adults, one cut-point, varying from 4 to 5 was usedin BS (4/9 and 5/9), and 2/5 in 5PQhave been used. For children, one cut-point varying from 5 to 7 was used in
TABLE IV. Levels of Evidence of Included Studies
Reliability Validity
Intra/inter Quality/(n)/pop Hypothesis testing Quality/(n)/pop
Clinical assessment toolsBeighton Aslan06: þþ P (72) vs. other tests:
Boyle03: þþ P (42/36) (CW/RQ) Bulbena92: þ P��� (173)Bulbena92: þ P (30) (LLAS) Ferrari05: þ F (225) CErkula05: þþ P� (50) C (A/B) Junge13: � F (103) CHansen02: � P (100) C (þ) to (þ/�) LimitedHicks03: þ P� (50) pos. to conflictingHirsch07: þþ P� (50)Junge13: � P� (39) C vs. ROM:Juul-Kr07: þ P� (40) (Trunkrot.) Erkula05: � P (1273) CKarim11: � P� (30) (Hip) Naal14: � P�� (55)Mikkelss.96:þþ P (29/13) C (k/a lax) Pearsall06: � F (57)
(sh lax) Sauers01: � F (51)Ia (gon) Smits-Eng11: þ P (551) C(þ) Limited pos. (�) to (þ) Limited neg. to
conflictingIe(þ) to (þ/�) Limited pos. toConflicting
vs. pain:F (430) C(p/r) El-Metwally04: þ
(LLP) El-Metwally05: þ F (1284) C(inc) El-Metwally07: � F (1113) C(inc) Sohrb.-Nøhr14: þ F (301) C(s/k/a)(inc/p)Tobias13:þ P�� (2901) C(þþ) to (þ/�)Moderate pos. to Conflicting
BS vs. injuries:(GI) Cameron10: þ(KI) Junge15: �
F (714)C
(PI) Roussel09: �P�� (999)
(þ/�) ConflictP (58)
vs. diseases:(TMs) Hirsch08: þ(CFS) Nijs04: �
F (895)
(ACprev) Terzi13: þF (44)
(þ)to (þ/�) LimitedP�� (240)
pos. to conflicting
Carter & Wilkinson Bulbena92: þ P (30) vs. other tests:P��� (173)(BS/RQ) Bulbena92: þInter: Unknown
UnknownRot�es-Qu�erol Bulbena92: þ P (30) vs. other tests:
Juul-Kr07: þ (3) P�(40) (CW/BS) Bulbena92: þ P��� (173)
Inter: Unknown Unknown
Hospital del Mar Bulbena92: þ P (30) (ASD) Chahal10: þ F (149)
Inter: Unknown Unknown
continued
RESEARCH REVIEW AMERICAN JOURNAL OF MEDICAL GENETICS PART C (SEMINARS IN MEDICAL GENETICS) 143
TABLE IV. (Continued)
Reliability Validity
Intra/inter Quality/(n)/pop Hypothesis testing Quality/(n)/pop
Questionnaires5-part Q. Bulbena14: þ
Moraes11: �Retest: (þ/�)Conflicting
P� (33)P� (211)
vs. other tests:(HdM) Bulbena14: þ(BS) deMoraes11: �/þ(þ) to (þ/�) Limitedpos. to Conflicting
P�� (191)F (394)
Criterion val.(BS) deMoraes11: �(BS) Hakim03:?Unknown
F (394)F (489)
vs. pain/tissue diseases:(CWP) Mulvey13: þ (BJHS)Hakim03: þ(þ) Limited pos.
F (2354)P�� (489)
vs. anxiety/psych. disease:(anx) Sanches14: �Unknown
F (2300)
BS-self-reported vs. tests:(BS) Naal14: þUnknown
P�� (55)
Ia, Intra-rater; Ie, Inter-rater; P, poor; F, fair; pop, population; BS, Beighton Score; CW, Carter & Wilkinson; HdM, Hospital del Mar; RQ,Rot�es-Qu�erol; 5-part Q, 5-part questionnaire; LLAS, Lower Limb Assessment Score; k/a lax, Knee/ankle laxity; sh lax, Shoulder laxity; gon,goniometry; p, persistent; r, recurrent; inc, incidence; s/k/a, shoulder/knee/ankle; GI, Glenohumeral joint instability; Y, youth; C, children,AID, Primary traumatic anterior shoulder dislocation; KI, knee injuries; PI, pelvic injuries; TMs, Temporomandibular symptoms; CFS, chronicfatigue syndrome; ACprev, Adhesive capsulitis prevalence; ASD, Anterior Shoulder Dislocation; CWP, chronic widespread pain; BJHS, benignjoint hypermobility syndrome; Anx, anxiety; 92, Superscripts shows year of publication.�Reliability studies rated poor on the basis of one single rating, based on ”only one measurement.”��Validity studies rated poor on the basis of one single rating “no information on the measurement properties of the comparator instrument(s).”
144 AMERICAN JOURNAL OF MEDICAL GENETICS PART C (SEMINARS IN MEDICAL GENETICS) RESEARCH REVIEW
BS (5/9, 6/9, and 7/9). In one testassessment method the total score was11 with a cut-point of 7/11 forclassifying GJH based on only tests inthe lower extremities [Ferrari et al.,2005], and another study used two cut-points the lower being 5/9 and theupper being 7/9 [Smits-Engelsmanet al., 2011]. Since JHS and hEDS inthe current review are recognized asone and the same condition, a specificcut-point needs to be decided, and 5/9may be suggested for future use inadults. However, since joint mobility,and therefore, BS is known to decreaseby age [Remvig et al., 2007b], there isa need for adults also to includeadditional historical information, as
described in the appendix of thereliability study using the BS, withphrasing “can you now or have youpreviously been able to . . . ” [Juul-Kristensen et al., 2007], and in thestudy describing 5PQ [Hakim andGrahame, 2003].
Generally, for adults, one cut-point, varying from 4 to 5 wasused in BS (4/9 and 5/9), and2/5 in 5PQ have been used.For children, one cut-point
varying from 5 to 7 was usedin BS (5/9, 6/9, and 7/9).
Since children have individualgrowth periods, this may be the reasonfor using two cut-points (a lower and anupper) as recently suggested [Smits-Engelsman et al., 2011], and therefore,the upper cut-point is suggested to be atleast 6/9 as used in previous populationstudies [El-Metwally et al., 2004, 2005,2007; Tobias et al., 2013].
Warming-up before performingflexibility tests may influence theoutcome of a test assessment method.However, almost no studies reportedwhether participants did warm-up, and
RESEARCH REVIEW AMERICAN JOURNAL OF MEDICAL GENETICS PART C (SEMINARS IN MEDICAL GENETICS) 145
the influence of such performance istherefore, unknown.
This review highlights a numberof areas warranting future research.Because of the limited studies on theclinical assessment methods for classi-fying GJH, more high quality studies,and especially those evaluating aspectsof validity are required (concurrent,predictive, measurement error, respon-siveness, and interpretability). Addi-tional clinical test assessment methodsmay further be considered in order tosupport and endorse the presence ofGJH in the diagnostic procedure ofheritable connective tissue disorders.Also of importance is that consensus iswarranted regarding selection of spe-cific test and questionnaire assessmentmethods for classifying GJH, the testperformance, and the cut-points bywhich age, gender, and ethnicity maybe taken into account.
Limitations of the study are thesmall amount of studies, for whichreason it was decided only to ratereliability (intra- and inter-rater) andvalidity (hypothesis testing or criterionvalidity). Use of COSMIN is recom-mended to be the best evaluationmethod until now, as has also beenused previously to evaluate clinical testassessment methods [Kroman et al.,2014; Larsen et al., 2014]. However,since the COSMIN originally wasdesigned for the evaluation of pa-tient-reported outcomes, there aresome adjustments that need to beconsidered when using COSMIN forclinical test assessment methods. Forexample, although rating of the num-ber of measurements taken may beuseful in settings with continuous scalesas in performance-based methods,rating scales in many clinical assessmentmethods (test or questionnaire) aredichotomous (positive/negative). Toadjust for this shortcoming, the presentreview adjusted the evaluation ofmethodological quality, correspondingto when “only one measurement” wasrated poor in reliability, the study wasupgraded from poor to fair, meaningthat the study thereby could beincluded in the best evidence synthesis.Furthermore, the sample size in clinical
studies is often much smaller than inquestionnaire studies, and therefore, itmay be suggested that minor samplesizes should not be rated that strictly asin questionnaire assessment methods,when studying clinical test assessmentmethods. For validity studies, one poorrating including “no information onthe measurement properties of thecomparator instrument” or “subjecteligibility criteria inadequately de-scribed/lacking,” allowed upgradingto fair, and the study could therebybe included in the best evidencesynthesis.
Strengths of this review are thesystematic and rigid use of recom-mended strategies for systematic re-views of clinical assessment methods,the evaluation of their clinimetricproperties and rating of the bestevidence synthesis.
CONCLUSION
In the current review, four test and twoquestionnaire assessment methods forclassifying GJH were found with mea-surement properties of varying method-ological strength and results of varyingweight. Most of the studies used the BS.The inter-rater reliability of this methodseems acceptable to be used in clinicalpractice, provided that uniformity oftesting procedures are included in thetesting procedures, in addition to his-torical information, especially in adults.However, shortcomings were found instudies on the validity of BS, while thethree other test assessment methods(CW, RQ, HdM) lack satisfactoryinformation on both reliability andvalidity. Regarding questionnaire assess-ment methods, 5PQ is the most fre-quently method used, however, only inadult population studies. In conclusionprovided uniformity of testing proce-dures, the recommendation for clinicaluse in adults is BS with cut-point of 5 of9 including historical information,while in children it is BS with cut-pointof at least 6 of 9. However, more studiesare needed to conclude, especially onthe validity properties of these assess-ment methods, and before evidence-based recommendations can bemade for
clinical use on the “best” assessmentmethod for classifying GJH.
In the current review, four testand two questionnaireassessment methods for
classifying GJH were foundwith measurement propertiesof varying methodological
strength and results of varyingweight. Most of the studies
used the BS.
AUTHORS’CONTRIBUTIONS
BJK contributed to conception anddesign of the study, including analysisand interpretation of data, writing of thearticle, critical revision of the article forimportant intellectual content, and finalapproval of the article. KS contributedto the collection and assembly of data,analysis and interpretation of data, andcritical revision of the article forimportant intellectual content and finalapproval of the article. RE and HLcontributed to conception and design ofthe study including analysis and inter-pretation of data, critical revision of thearticle for important intellectual con-tent, and final approval of the article. LRcontributed to interpretation of data,critically revision of the article forimportant intellectual content, and finalapproval of the article. First and lastauthors take responsibility for the integ-rity of the work as a whole, frominception to finished article.
ACKNOWLEDGMENT
Dr. JaimeBravo,Chile, andHylkeBosma,previous patient representative of theEhlers–Danlos Syndrome organization,the Netherlands, are acknowledged fortheir initial participation in this work andthe Beighton group.
146 AMERICAN JOURNAL OF MEDICAL GENETICS PART C (SEMINARS IN MEDICAL GENETICS) RESEARCH REVIEW
REFERENCES
Aslan BA, Celik E, Cavlak U, Akdag B. 2006.Evaluation of inter-rater and intra-raterreliability of BEighton and Horan JointMobility Index. Fizyoterapi Rehabil17:113–119.
Beighton P, De Paepe A, Steinmann B, TsipourasP, Wenstrup RJ. 1998. Ehlers–Danlos syn-dromes: Revised nosology, villefranche,1997. Ehlers–Danlos national foundation(USA) and Ehlers–Danlos support group(UK). Am J Med Genet 77:31–37.
Beighton P, Solomon L, Soskolne CL. 1973.Articular mobility in an African population.Annl Rheum Dis 32:413–418.
Boyle KL, Witt P, Riegger-Krugh C. 2003. Intra-rater and inter-rater reliability of theBeighton and Horan Joint Mobility Index.J Athl Train 38:281–285.
Bulbena A, Duro JC, Porta M, Faus S, Vallescar R,Martin-Santos R. 1992. Clinical assessmentof hypermobility of joints: Assemblingcriteria. J Rheumatol 19:115–122.
Bulbena A, Mallorqui-Bague N, Pailhez G,Rosado S, Gonzalez I, Blanch-Rubio J,Carbonell J. 2014. Self-reported screeningquestionnaire for the assessment of JointHypermobility Syndrome (SQ-CH), a col-lagen condition, in Spanish population. Eur JPsychiat 28:17–26.
Cameron KL, Duffey ML, DeBerardino TM,Stoneman PD, Jones CJ, Owens BD. 2010.Association of generalized joint hypermo-bility with a history of glenohumeral jointinstability. J Athl Train 45:253–258.
Castori M, Dordoni C, Valiante M, Sperduti I,Ritelli M, Morlino S, Chiarelli N, CellettiC, Venturini M, Camerota F, Calzavara-Pinton P, Grammatico P, Colombi M. 2014.Nosology and inheritance pattern(s) of jointhypermobility syndrome and Ehlers–Danlossyndrome, hypermobility type: A study ofintrafamilial and interfamilial variability in23 Italian pedigrees. Am JMed Genet Part A164A:3010–3020.
Chahal J, Leiter J, McKeeMD,Whelan DB. 2010.Generalized ligamentous laxity as a predis-posing factor for primary traumatic anteriorshoulder dislocation. J Shoulder Elbow Surg19:1238–1242.
El-Metwally A, Salminen JJ, Auvinen A,Kautiainen H, Mikkelsson M. 2004. Prog-nosis of non-specific musculoskeletal painin preadolescents: A prospective 4-yearfollow-up study till adolescence. Pain110:550–559.
El-Metwally A, Salminen JJ, Auvinen A, Kautiai-nen H, Mikkelsson M. 2005. Lower limbpain in a preadolescent population: Progno-sis and risk factors for chronicity—Aprospective 1- and 4-year follow-up study.Pediatrics 116:673–681.
El-Metwally A, Salminen JJ, Auvinen A, Macfar-lane G,MikkelssonM. 2007. Risk factors fordevelopment of non-specific musculoskele-tal pain in preteens and early adolescents: Aprospective 1-year follow-up study. BMCMusculoskelet Disord 8:46.
Erkula G, Kiter AE, Kilic BA, Er E, Demirkan F,Sponseller PD. 2005. The relation of jointlaxity and trunk rotation. J Pediatr OrthopPart B 14:38–41.
Ferrari J, Parslow C, Lim E, Hayward A. 2005.Joint hypermobility: The use of a newassessment tool to measure lower limbhypermobility. Clin Exp Rheumatol 23:413–420.
Fleiss JL. 1986. Reliability of measurement. Thedesign and analysis of clinical experiments.New York: John Wiley and Sons. pp. 1–32.
Grahame R, Bird HA, Child A. 2000. The revised(Brighton 1998) criteria for the diagnosis ofbenign joint hypermobility syndrome(BJHS). J Rheumatol 27:1777–1779.
Hakim AJ, Grahame R. 2003. A simple question-naire to detect hypermobility: An adjunct tothe assessment of patients with diffusemusculoskeletal pain. Int J Clin Pract 57:163–166.
Hansen A, Damsgaard R, Kristensen JH, BaggersJ, Remvig L. 2002. Interexaminer reliabilityof selected tests for hypermobility. J OrthopMed 25:48–51.
Hicks GE, Fritz JM, Delitto A, Mishock J. 2003.Interrater reliability of clinical examinationmeasures for identification of lumbar seg-mental instability. Arch Phys Med Rehab84:1858–1864.
Hirsch C, Hirsch M, John MT, Bock JJ. 2007.Reliability of the Beighton HypermobilityIndex to determinate the general joint laxityperformed by dentists. J Orof Orthop68:342–352.
Hirsch C, John MT, Stang A. 2008. Associationbetween generalized joint hypermobilityand signs and diagnoses of temporomandib-ular disorders. Eur J Oral Sci 116:525–530.
Jansson A, Saartok T, Werner S, Renstrom P. 2004.General joint laxity in 1845 Swedish schoolchildren of different ages: Age- and gender-specific distributions. Acta Paediatr 93:1202–1206.
Junge T, Jespersen E, Wedderkopp N, Juul-Kristensen B. 2013. Inter-tester reproduc-ibility and inter-method agreement of twovariations of the Beighton test for determin-ing Generalised Joint Hypermobility inprimary school children. BMC Pediatr13:214.
Junge T, Runge L, Juul-Kristensen B, Wedder-kopp N. 2015. The extent and risk of kneeinjuries in children aged 9–14 with Gener-alised Joint Hypermobility and kneejoint hypermobility—The CHAMPS-studyDenmark. BMC Musculoskelet Disord15:1–11.
Juul-Kristensen B, Rogind H, Jensen DV, RemvigL. 2007. Inter-examiner reproducibility oftests and criteria for generalized joint hyper-mobility and benign joint hypermobilitysyndrome. Rheumatology 46:1835–1841.
Karim A, Millet V, Massie K, Olson S, Mor-ganthaler A. 2011. Inter-rater reliability of amusculoskeletal screen as administered tofemale professional contemporary dancers.Work 40:281–288.
Kroman SL, Roos EM, Bennell KL, Hinman RS,Dobson F. 2014. Measurement properties ofperformance-based outcome measures toassess physical function in young and mid-dle-aged people known to be at high risk ofhip and/or knee osteoarthritis: A systematicreview. Osteoarthr Cartil 22:26–39.
LarsenCM, Juul-KristensenB, LundH, SogaardK.2014. Measurement properties of existingclinical assessment methods evaluating
scapular positioning and function. A system-atic review. Physiother Theory Pract30:453–482.
Landis JR, Koch GG. 1977. The measurement ofobserver agreement for categorical data.Biometrics 33:159–174.
Mikkelsson M, Salminen JJ, Kautiainen H. 1996.Joint hypermobility is not a contributingfactor to musculoskeletal pain in pre-adolescents. J Rheumatol 23:1963–1967.
Moher D, Liberati A, Tetzlaff J, Altman DG. 2009.Preferred reporting items for systematicreviews and meta-analyses: The PRISMAstatement. J Clin Epidemiol 62:1006–1012.
Mokkink LB, Terwee CB, Knol DL, Stratford PW,Alonso J, Patrick DL, Bouter LM, de VetHC. 2010. The COSMIN checklist forevaluating the methodological quality ofstudies on measurement properties: Aclarification of its content. BMC Med ResMethodol 10:22.
Moraes DA, Baptista CA, Crippa JA, Louzada-Junior P. 2011. Translation into BrazilianPortuguese and validation of the five-partquestionnaire for identifying hypermobility.Rev Bras Reumatol 51:53–69.
Mulvey MR, Macfarlane GJ, Beasley M, Sym-mons DP, Lovell K, Keeley P, Woby S,McBeth J. 2013. Modest association of jointhypermobility with disabling and limitingmusculoskeletal pain: Results from a large-scale general population-based survey. Ar-thritis Care Res (Hoboken) 65:1325–1333.
Naal FD, Hatzung G, Muller A, Impellizzeri F,LeunigM. 2014. Validation of a self-reportedBeighton score to assess hypermobility inpatients with femoroacetabular impinge-ment. Int Orthop 38:2245–2250.
Nijs J,Meirleir KD,Truyen S. 2004.Hypermobilityin patients with chronic fatigue syndrome:Preliminary observations. J MusculoskeletPain 12:9–17.
Palmer S, Bailey S, Barker L, Barney L, Elliott A.2014. The effectiveness of therapeutic exercisefor joint hypermobility syndrome: A system-atic review. Physiotherapy 100:220–227.
Pearsall AW, Kovaleski JE, Heitman RJ, GurchiekLR, Hollis JM. 2006. The relationshipsbetween instrumented measurements ofankle and knee ligamentous laxity andgeneralized joint laxity. J Sports Med PhysFitness 46:104–110.
Remvig L, Engelbert RH, Berglund B, BulbenaA, Byers PH, Grahame R, Juul-KristensenB, Lindgren KA, Uitto J, Wekre LL. 2011.Need for a consensus on the methods bywhich to measure joint mobility and thedefinition of norms for hypermobility thatreflect age, gender and ethnic-dependentvariation: Is revision of criteria for jointhypermobility syndrome and Ehlers–Danlossyndrome hypermobility type indicated?Rheumatology 50:1169–1171.
Remvig L, Jensen DV, Ward RC. 2007a. Arediagnostic criteria for general joint hyper-mobility and benign joint hypermobilitysyndrome based on reproducible and validtests? A reviewof the literature. J Rheumatol34:798–803.
Remvig L, Jensen DV,Ward RC. 2007b. Epidemi-ologyof general joint hypermobility and basisfor the proposed criteria for benign jointhypermobility syndrome: Review of theliterature. J Rheumatol 34:804–809.
RESEARCH REVIEW AMERICAN JOURNAL OF MEDICAL GENETICS PART C (SEMINARS IN MEDICAL GENETICS) 147
Rombaut L, Malfait F, Cools A, De Paepe A,Calders P. 2010. Musculoskeletal com-plaints, physical activity and health-relatedquality of life among patients with theEhlers–Danlos syndrome hypermobilitytype. Disabil Rehabil 32:1339–1345.
Roussel NA, Nijs J, Mottram S, Van Moorsel A,Truijen S, Stassijns G. 2009. Altered lumbo-pelvic movement control but not general-ized joint hypermobility is associated withincreased injury in dancers. A prospectivestudy. Man Ther 14:630–635.
Sanches SB, Osorio FL, Louzada-Junior P, MoraesD, Crippa JA, Martin-Santos R. 2014.Association between joint hypermobilityand anxiety in Brazilian university students:Gender-related differences. J PsychosomRes 77:558–561.
Sauers EL, Borsa PA, Herling DE, Stanley RD.2001. Instrumented measurement of gleno-humeral joint laxity and its relationship topassive range of motion and generalized jointlaxity. Am J Sports Med 29:143–150.
Schellingerhout JM, Verhagen AP, Thomas S, KoesBW. 2008. Lack of uniformity in diagnosticlabeling of shoulder pain: Time for a differentapproach. Man Ther 13:478–483.
Scheper MC, Engelbert RH, Rameckers EA,Verbunt J, Remvig L, Juul-Kristensen B.
2013. Children with generalised jointhypermobility and musculoskeletal com-plaints: State of the art on diagnostics,clinical characteristics, and treatment. Bi-omed Res Int 2013:121054.
Scheper MC, Juul-Kristensen B, Rombaut L,Rameckers EA, Verbunt J, Engelbert RH.2016. Disability in adolescents and adultsdiagnosed with hypermobility related disor-ders: Ameta-analysis. Arch PhysMedRehabil97:2174–2187.
Smith TO, BaconH, Jerman E, Easton V, ArmonK,Poland F, Macgregor AJ. 2014. Physiotherapyand occupational therapy interventions forpeople with benign joint hypermobilitysyndrome: A systematic review of clinicaltrials. Disabil Rehabil 36:797–803.
Smits-Engelsman B, Klerks M, Kirby A. 2011.Beighton score: A valid measure for gener-alized hypermobility in children. J Pediatr158:119–123,23 e1–4.
Sohrbeck-Nøhr O, Kristensen J, Boyle E, RemvigL, Juul-Kristensen B. 2014. Generalizedjoint hypermobility in childhood is apossible risk for the development of jointpain in adolescence: A cohort study. BMCPediatr 14:1–9.
Terwee CB, Bot SD, de Boer MR, van derWindt DA, Knol DL, Dekker J, Bouter
LM, de Vet HC. 2007. Quality criteriawere proposed for measurement propertiesof health status questionnaires. J ClinEpidemiol 60:34–42.
Terwee CB, Mokkink LB, Knol DL, OsteloRW, Bouter LM, de Vet HC. 2012.Rating the methodological quality insystematic reviews of studies on measure-ment properties: A scoring system for theCOSMIN checklist. Qual Life Res21:651–657.
Terzi Y, Akgun K, Aktas I, Palamar D, Can G.2013. The relationship between generalisedjoint hypermobility and adhesive capsulitisof the shoulder. Turk J Rheumatol28:234–241.
Tinkle BT, Bird HA, Grahame R, Lavallee M,Levy HP, Sillence D. 2009. The lack ofclinical distinction between the hyper-mobility type of Ehlers–Danlos syn-drome and the joint hypermobilitysyndrome (a.k.a. hypermobility syn-drome). Am J Med Genet Part A 149A:2368–2370.
Tobias JH, Deere K, Palmer S, Clark EM, Clinch J.2013. Hypermobility is a risk factor formusculoskeletal pain in adolescence: Find-ings from a prospective cohort study.Arthritis Rheuma 65:1107–1115.