BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University...

100
For peer review only Accuracy of routinely recorded ethnic group information compared with self-reported ethnicity: Evidence from the English Cancer Patient Experience survey Journal: BMJ Open Manuscript ID: bmjopen-2013-002882 Article Type: Research Date Submitted by the Author: 14-Mar-2013 Complete List of Authors: Saunders, Catherine; University of Cambridge, Abel, Gary; University of Cambridge, El Turabi, Anas; University of Cambridge, Ahmed, Faraz; University of Cambridge, Lyratzopoulos, Georgios; University of Cambridge, <b>Primary Subject Heading</b>: Health services research Secondary Subject Heading: Health informatics Keywords: Ethnicity, Hospital Records, Quality in health care < HEALTH SERVICES ADMINISTRATION & MANAGEMENT, Equality For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml BMJ Open on August 25, 2021 by guest. Protected by copyright. http://bmjopen.bmj.com/ BMJ Open: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. Downloaded from

Transcript of BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University...

Page 1: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

Accuracy of routinely recorded ethnic group information compared with self-reported ethnicity: Evidence from the

English Cancer Patient Experience survey

Journal: BMJ Open

Manuscript ID: bmjopen-2013-002882

Article Type: Research

Date Submitted by the Author: 14-Mar-2013

Complete List of Authors: Saunders, Catherine; University of Cambridge, Abel, Gary; University of Cambridge, El Turabi, Anas; University of Cambridge,

Ahmed, Faraz; University of Cambridge, Lyratzopoulos, Georgios; University of Cambridge,

<b>Primary Subject Heading</b>:

Health services research

Secondary Subject Heading: Health informatics

Keywords: Ethnicity, Hospital Records, Quality in health care < HEALTH SERVICES ADMINISTRATION & MANAGEMENT, Equality

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open on A

ugust 25, 2021 by guest. Protected by copyright.

http://bmjopen.bm

j.com/

BM

J Open: first published as 10.1136/bm

jopen-2013-002882 on 28 June 2013. Dow

nloaded from

Page 2: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

1

STROBE Statement—checklist of items that should be included in reports of observational studies

Item

No Recommendation

Title and abstract 1 (a) Indicate the study’s design with a commonly used term in the title or the abstract

Study design is stated in title and in the design section of abstract

(b) Provide in the abstract an informative and balanced summary of what was done

and what was found

Please see Methods/Results section of abstract

Introduction

Background/rationale 2 Explain the scientific background and rationale for the investigation being reported

Given in the background section of the paper - specifically – previous studies of the

quality of coding in routine English NHS records have focused on completeness

rather than accuracy and so this work attempts to fill this research gap

Objectives 3 State specific objectives, including any prespecified hypotheses

Specifically stated as the last line of the introduction

Methods

Study design 4 Present key elements of study design early in the paper

Stated in the abstract and further presented in the methods

Setting 5 Describe the setting, locations, and relevant dates, including periods of recruitment,

exposure, follow-up, and data collection

Stated in the abstract and further presented in Methods (Data)

Participants 6 Cross-sectional study—Give the eligibility criteria, and the sources and methods of

selection of participants

Stated in the abstract and further presented in the methods (data) flowchart (figure 1)

Variables 7 Clearly define all outcomes, exposures, predictors, potential confounders, and effect

modifiers. Give diagnostic criteria, if applicable

Please see Methods, analysis (p9, line 4 onwards)

Data sources/

measurement

8* For each variable of interest, give sources of data and details of methods of

assessment (measurement). Describe comparability of assessment methods if there is

more than one group

Please see Methods, analysis (p9 line 4 onwards)

Bias 9 Describe any efforts to address potential sources of bias

Describing measurement bias in hospital record ethnicity coding is an outcome for

this study

In Methods, analysis, we describe our approach to potential clustering by hospital

Study size 10 Explain how the study size was arrived at

Given in Methods, Data and the flowchart in figure 1

Quantitative variables 11 Explain how quantitative variables were handled in the analyses. If applicable,

describe which groupings were chosen and why

All variable groups are explained and described in methods, analysis

Statistical methods 12 (a) Describe all statistical methods, including those used to control for confounding

Described in methods, analysis

(b) Describe any methods used to examine subgroups and interactions

No subgroup analyses considered

(c) Explain how missing data were addressed

Page 1 of 30

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 3: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

2

We performed these analyses (considering the small proportion of cases with missing

ethnicity coding) and mention these results in our discussion. These analyses were

extensive but not the main focus of this paper and so we did not include them as a

substantial research output in this paper

Cross-sectional study—If applicable, describe analytical methods taking account of

sampling strategy

Discussion of the generalizability of the sample was included in the discussion.

Clustering at the hospital level was considered in the analysis, looking at results with

and without a random effect for hospital.

(e) Describe any sensitivity analyses

We repeated the analyses on data from 2011/2, but as information on deprivation for

these years was not recorded and the results were similar across both years we only

present the results for 2010 (including the data from 2011/2 in the probability lookup

table calculation only)

Discordance of ethnicity using ONS 16 and ONS 6 categorisations is also presented

(the latter appendix).

Continued on next page

Page 2 of 30

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 4: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

3

Results

Participants 13* (a) Report numbers of individuals at each stage of study—eg numbers potentially eligible,

examined for eligibility, confirmed eligible, included in the study, completing follow-up, and

analysed

Figure 1

(b) Give reasons for non-participation at each stage

Figure 1

(c) Consider use of a flow diagram

Figure 1

Descriptive

data

14* (a) Give characteristics of study participants (eg demographic, clinical, social) and information

on exposures and potential confounders

Given as part of table 1

(b) Indicate number of participants with missing data for each variable of interest

Methods (data) section and figure 1

Outcome data 15*

Cross-sectional study—Report numbers of outcome events or summary measures

Table 1, given in the text of the results

Main results 16 (a) Give unadjusted estimates and, if applicable, confounder-adjusted estimates and their

precision (eg, 95% confidence interval). Make clear which confounders were adjusted for and

why they were included

Table 1; crude and adjusted results presented with 95% confidence intervals.

(b) Report category boundaries when continuous variables were categorized

Done so for age in table 1.

(c) If relevant, consider translating estimates of relative risk into absolute risk for a meaningful

time period

Not applicable

Other analyses 17 Report other analyses done—eg analyses of subgroups and interactions, and sensitivity

analyses

Results from sensitivity analyses (considering different categorisations of ethnicity) presented

in the appendix.

Discussion

Key results 18 Summarise key results with reference to study objectives

In first paragraph of discussion

Limitations 19 Discuss limitations of the study, taking into account sources of potential bias or imprecision.

Discuss both direction and magnitude of any potential bias

In discussion

Interpretation 20 Give a cautious overall interpretation of results considering objectives, limitations, multiplicity

of analyses, results from similar studies, and other relevant evidence

In discussion

Generalisability 21 Discuss the generalisability (external validity) of the study results

Discussed as in the discussion section of this paper, and also as part of the last bullet point in

‘Article Summary’.

Other information

Funding 22 Give the source of funding and the role of the funders for the present study and, if applicable,

Page 3 of 30

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 5: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

4

for the original study on which the present article is based

In acknowledgements section

*Give information separately for cases and controls in case-control studies and, if applicable, for exposed and

unexposed groups in cohort and cross-sectional studies.

Note: An Explanation and Elaboration article discusses each checklist item and gives methodological background and

published examples of transparent reporting. The STROBE checklist is best used in conjunction with this article (freely

available on the Web sites of PLoS Medicine at http://www.plosmedicine.org/, Annals of Internal Medicine at

http://www.annals.org/, and Epidemiology at http://www.epidem.com/). Information on the STROBE Initiative is

available at www.strobe-statement.org.

Page 4 of 30

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 6: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

1

Accuracy of routinely recorded ethnic group information

compared with self-reported ethnicity: Evidence from the

English Cancer Patient Experience survey

Saunders CL (1)

Abel GA (1)

El Turabi A (1)

Ahmed F (1)

Lyratzopoulos G (1)

Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public

Health, Forvie Site, Robinson Way, Cambridge, CB2 0SR

Author for correspondence:

Catherine Saunders

Mail to: [email protected]

Word count: 295

Abstract: 335 words

Total main text: 2,621

Page 5 of 30

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 7: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

2

ABSTRACT

Objective: To describe the accuracy of ethnicity coding in contemporary NHS hospital

records compared to the ‘gold standard’ of self-reported ethnicity.

Design: Secondary analysis of data from a cross-sectional survey (2011)

Setting: All NHS hospitals in England providing cancer treatment

Participants: 58721 patients with cancer for whom ethnicity information [Office for

National Statistics (ONS) 2001 16-group classification] was available from self-reports

(considered to represent the ‘gold standard’) and their hospital record.

Methods: We calculated the sensitivity and positive predictive value (PPV) of hospital record

ethnicity. Further we used a logistic regression model to explore independent predictors of

discordance between recorded and self-reported ethnicity.

Results: Overall, 4.9% (4.7-5.1%) of people had their self-reported ethnic group incorrectly

recorded in their hospital records. Recorded White British ethnicity had high sensitivity

(97.8% (97.7-98.0%)) and PPV (98.1% (98.0-98.2%)) for self-reported White British ethnicity.

Recorded ethnicity information for the 15 other ethnic groups was substantially less

accurate with 41.2% (39.7-42.7%) incorrect. Recorded ‘Mixed’ ethnicity had low sensitivity

(12%-31%) and PPVs (12%-42%). Recorded ‘Indian’, ‘Chinese’, ‘Black-Caribbean’ and ‘Black

African’ ethnic groups had intermediate levels of sensitivity (65%-80%) and PPV (80%-89%

respectively). In multivariable analysis, belonging to an ethnic minority group was the only

independent predictor of discordant ethnicity information. There was strong evidence that

the degree of discordance of ethnicity information varied substantially between different

hospitals (p<0.0001).

Discussion: Current levels of accuracy of ethnicity information in NHS hospital records

support valid profiling of White / non-White ethnic differences. However profiling of ethnic

Page 6 of 30

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 8: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

3

differences in process or outcome measures for specific minority groups may contain a

substantial and variable degree of misclassification error. These considerations should be

taken into account when interpreting ethnic variation audits based on routine data and

inform initiatives aimed at improving the accuracy of ethnicity information in hospital

records.

Page 7 of 30

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 9: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

4

ARTICLE SUMMARY

Article focus

• Accurate recording of ethnicity in administrative health data is a prerequisite for efforts to

ensure equality or reduce inequalities in health care.

• This paper describes the accuracy of ethnicity coding in English NHS hospital records

compared to the ‘gold standard’ of self-reported ethnicity, and identifies areas where

improvement is needed.

Key messages

• Hospital records will usually code the ethnicity of patients with self-reported White British

ethnic group correctly, but levels of incorrect coding of the ethnicity of all ethnic minority

groups are high

• Belonging to an ethnic minority group is the only independent predictor of having an

incorrectly coded hospital record ethnicity; there is substantial variation in the quality of

coding between hospitals

• Probability look-up tables provided in this paper can be used for weighting of incidence or

prevalence estimates where hospital record ethnicity is being used, or in regression

analysis to improve estimation of ethnic variation in processes or outcomes of care

Strengths and limitations

• This was a unique opportunity to carry out an audit of the accuracy of hospital record

coding of ethnicity in England, using data from a large national survey of recently treated

patients.

Page 8 of 30

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 10: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

5

• We were not able to account for different processes by which ethnicity was

ascertained in hospital records and the study population was skewed towards older

ages which may limit the generalisability of the findings.

Page 9 of 30

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 11: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

6

BACKGROUND

Modern health care systems aspire to equal access and quality of care for patients of any

ethnic group 1,2

. In England, there are nevertheless well-documented ethnic inequalities in

processes and experiences of care 3,4

. Good practice in measuring these inequalities is

needed 5. Therefore, the availability of complete and valid information on patients’ ethnicity

in routine NHS data is a fundamental first step to enable equality audits to inform

improvement actions. Further, there are variations in the incidence and prevalence of some

conditions between different patient ethnic groups. Understanding such variations can be

vital for service planning and required capacity estimates. National Health Service hospitals

began to routinely record ethnicity information in their Patient Administration System (PAS)

records in the mid-1990s 6. (Patient Administration System records are a pre-cursor to

Hospital Episode Statistics (HES) data – hereafter we refer to hospital records as ‘HES

records’ for simplicity, as this is the more commonly used convention to denote patient

administrative data in the NHS). Although implementation of this measure has been slow 7,

HES data in recent years have high completeness of ethnic group information (typically

exceeding 90%) 6,8

. There is however little evidence about the accuracy of routinely recorded

information on patients’ ethnicity. In the past, audits of the quality of ethnicity information

in NHS records have principally focused on data completeness, rather than accuracy 6 and

completeness of ethnicity coding information currently forms part of the Commissioning

Outcomes Data Set 6, a prescribed standard for data quality that is linked to hospital

reimbursement.

Page 10 of 30

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 12: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

7

Ethnicity can be classified by referring to a community of people who share the same

culture, and/or by referring to an ancestral population which comprises their self-identity 9.

Self-reported ethnicity captures both the shared experiences/culture of an individual and

their self-identity. Methods such as surname recognition algorithms or geo-coding (or

combinations of both these approaches) have been developed to indirectly infer the

ethnicity of individuals10–15

, but both have limitations that do not apply to self-reported

ethnicity. For example, geo-coding methods rely on high levels of geographical (residential)

segregation between different ethnic groups. Surname recognition methods rely on low

levels of inter-communal marriages and the existence of distinctive surname nomenclatures

(which do not always exist for some ethnic groups, e.g. Black populations in the United

States). Further, by definition indirect methods will misclassify the ethnicity of some

patients, and disproportionately do so for the ethnicity of people from ethnic minorities. For

all the above reasons self-report is currently considered the gold standard measure of

ethnicity 16,17

.

Patient surveys, when linked to routine ethnicity data, provide opportunities to determine

the accuracy of ethnic group information contained in routine health system records. As this

linkage is rarely done, few studies have cross-examined the accuracy of ethnic group

information included in routine healthcare records against self-reported ethnicity

information. Those that have examined this have been conducted in the USA 18–20

.

Against this background and using data from an English national patient survey we aimed to

examine the overall accuracy of recorded versus self-reported ethnicity; and whether

discordance between recorded and self-reported ethnicity information varied between

patients of different ethnic groups.

Page 11 of 30

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 13: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

8

METHODS

Data

We used publicly available anonymous data from the 2010 Cancer Patient Experience Survey

(CPES) in England 21

. An unusual feature of this survey is that hospital recorded ethnicity was

collected when compiling the sample and linked to patient responses by the survey provider.

All patients treated for cancer in an English NHS hospital during the first quarter of 2010

were invited to participate in the survey, with a response rate of 67% 21

. Of 67713

respondents, 64418 (94.7%) provided valid self-reported ethnicity information. Because self-

reported ethnicity was used as the gold-standard against which the accuracy of HES-

recorded ethnicity was compared, data on respondents with missing self-reported ethnicity

were excluded from further analysis (see figure 1). Data on an additional 348 respondents

with missing information on deprivation status were also excluded from further analysis,

leaving 63,770 respondents. An additional 5049 records with missing HES-recorded ethnicity

were excluded, leaving 58721 records for analysis. For all respondents included in our

analyses completely observed data were available for patients’ HES-recorded age, gender

and socio-economic status (using Index of Multiple Deprivation (IMD) score of lower super

output area of residence). In both HES-records and patient reports, ethnicity was classified

using the same 16-group categorisation [Office of National Statistics (ONS16) 2001]

(Appendix 1).

Analysis

We first described the overall degree of discordance between HES-recorded and self-

reported information on ethnicity using the Cohen’s Kappa statistic. We further calculated

the sensitivity and the positive predictive value (PPV) of HES-recorded ethnicity in respect of

self-reported ethnicity. In this context, sensitivity denotes the proportion of patients with a

Page 12 of 30

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 14: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

9

given self-reported ethnicity with concordant ethnicity information in their HES record; and

PPV denotes the probability that a patient with a particular HES-recorded ethnic group will

self-report the same ethnic group.

Subsequently, we explored independent predictors of discordant ethnicity information by

constructing a multivariable logistic regression model with concordant/discordant ethnicity

status as the binary outcome variable and self-reported ethnicity as a covariate. Adjustment

was also made for other patient socio-demographics characteristics (age in 10 year age

groups, gender and socio-economic status). For age and gender we used HES-recorded

information because the completeness of these variables among survey respondents was

higher than self-reported age and gender, and as the degree of concordance between HES-

recorded and self-reported age (based on year of birth) and gender were very high (>99.5%

for both).

To explore whether clustering of patients in some groups in hospitals with higher or lower

discordance levels could in part explain the findings, we subsequently repeated the

regression model described above including a random effect for hospital – in addition, this

model allows us to explore the degree of variation in the level of ethnicity information

discordance between different hospitals. Because less detailed classifications of ethnicity are

often used in health research, we also carried out supplementary analysis looking at

discordance when using 6 (as opposed to 16) ethnic groups (i.e. White, Mixed, Asian or Asian

British, Black or Black British, Chinese and Other).

Lastly, we constructed a probability ‘look-up’ table indicating the probability that HES-

recorded ethnicity represents true (self-reported) ethnicity for each of the 16 different

ethnic groups. These probabilities can be used for weighting of incidence or prevalence

estimates or used in regression analysis to improve estimation of ethnic variation in

processes or outcomes of care 14,22

. For the calculation of the probability `look-up’ table only,

Page 13 of 30

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 15: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

10

we combined data from two surveys (2010 and 2011/12) to improve the precision of the

probabilities presented (increasing our sample size to 133,204). STATA 11 was used for all

analyses.

Page 14 of 30

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 16: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

11

RESULTS

Overall the frequency of discordance of HES-recorded and self-reported ethnicity

information was 4.9% (4.7-5.1%). Patients who identified themselves as White British had

the lowest frequency of discordant ethnicity information in their HES records (2.2% (2.0-

2.3%)). In contrast, patients who identified themselves as belonging to any other ethnic

group had substantially higher frequency of discordant HES-recorded ethnicity (41.2% (39.7-

42.7%)). Cohen’s Kappa for HES-recorded and self-reported ethnicity was 0.64 overall and

0.54 if White British patients were excluded. The frequency of discordance was particularly

high for patients who self-reported that they belonged to the Any Other Black Background

(90% (97.4-100%)) and the Any Other Mixed Background (87.8% (78.2-97.3%)) groups (Table

1).

HES-recorded ‘White British’ ethnicity had high sensitivity 97.8 (97.7-98) and PPV 98.1 (98-

98.2) for self-reported White British ethnicity (Figure 2, estimates and confidence intervals in

appendix table 2). In contrast, HES-recorded ‘Mixed’ ethnicity had very low sensitivities

(12%-31%) and PPVs (12%-42%). HES-recorded ‘Indian’, ‘Pakistani’, ‘Bangladeshi’, ‘Chinese’,

‘Black-Caribbean’ and ‘Black African’ ethnicity had intermediate levels of sensitivity (65% -

80%) and PPV (80%- 89% respectively). HES-recorded ‘White Irish’ ethnicity had low

sensitivity ( 47.8% (44.5-51.0%)) but high PPV (81.5% (77.9-84.6%)). This means that of all

individuals who self-identify themselves as White Irish only 48% would have their ethnic

group recorded as such in their HES records; however, among patients whose HES records

indicate that they are ‘White Irish’, 82% would identify themselves to belong to this group

too.

Whilst crudely there was some evidence that age, gender and deprivation are associated

with discordance of ethnicity information, in multivariable logistic regression analysis

Page 15 of 30

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 17: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

12

adjusting for other patient characteristics the sole independent predictor of discordance was

self-reported ethnicity (Table 1). Repeating this model with a random effect for hospital

produced only trivial differences to associations of discordance with patient characteristics.

This means that the association between discordance and self-reported ethnic minority

group cannot be explained by ethnic minority patients attending hospitals that have overall

poor levels of accuracy of ethnicity information. There was however strong evidence

(p<0.0001) of variation in discordance between different hospitals. Specifically, if all

hospitals were to be arranged in order of frequency of discordance of ethnicity information

among their HES records, the odds ratio of discordance between the hospital in the 97.5th

and 2.5th

centiles (the 95% reference range) will be about 13, indicating a 13-fold difference

in the odds of discordance across hospitals.

Similar findings to those observed in the main analysis were observed when using a 6-group

classification (Appendix 2) indicating that a large degree of discordance is between rather

than within the 6-groups used in the cruder classification.

Finally, we provide probability tables (Table 2) that can be used in estimating ethnic group

variations in incidence, prevalence or measures care quality when using HES data. Examples

of such applications in the context of US health research have been reported previously 14,22

.

Page 16 of 30

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 18: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

13

DISCUSSION

Using data from a recent national survey of hospital patients with cancer we explored the

accuracy of ethnicity information in HES data. We found that overall the level of discordance

of ethnicity information between HES records and patient self-reports is low, particularly for

the majority White British patients. There is however a substantial degree of inaccuracy in

the recorded ethnicity of patients who self-report themselves as belonging to ethnic

minority groups. For many major ethnic groups (‘Indian’, ‘Pakistani’, ‘Bangladeshi’, ‘Chinese’,

‘Black-Caribbean’ and ‘Black African’) routine hospital data will mis-code between 20 and

35% of all patients who self-report that they belong to these ethnic groups (sensitivity 65% -

80%). Further, up to 20% of patients recorded as belonging to some major ethnic groups will

self-report that they actually belong to other ethnic groups (PPV 80%- 89% respectively). For

patients who self-report being of mixed ethnic groups HES records are usually discordant.

We provide probability tables that can be used in re-estimating ethnic group variations when

investigating ethnic variation in processes or quality of care from UK hospital records in

order to improve precision of such estimates.

The study explored a unique opportunity provided by patient survey data to explore the

accuracy of ethnicity information in UK routine health care data. Previous evidence on the

accuracy of ethnicity information in administrative datasets relates to US settings 18–20

. Our

findings concord with previous US literature which also indicates that the accuracy of

ethnicity information tends to be highest for the majority white population, lower for major

ethnic minority groups (like African American or Hispanic) and lowest for smaller ethnic

groups (such as American Indian/Alaskan Natives) and Mixed or Other racial/ethnic groups

18,19. Other strengths of the study include its large sample size, enabling the profiling of

discordance for small ethnic group; and the use of regression analyses to explore

Page 17 of 30

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 19: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

14

independent predictors of discordance and to examine variation between different

hospitals.

A limitation is that the study population included patients who attended hospital for cancer

treatment (most of whom are aged 65 years of age or older). Therefore in principle the

generalisability of the findings (particularly regarding younger patients) might be limited. We

were also unable to explore potential inaccuracies in ethnicity categorisation resulting from

longitudinal person-level discordance among patients with more than one hospital care

episodes. Previous research indicates that up to 3% of patients with more than one episode

of care have longitudinally discordant ethnic group information 23

. Evolving socio-cultural

trends, or changes in Census methodology (such as the introduction of Mixed ethnic groups

to the UK census in 2001) could contribute to changes in a person’s self-identified ethnic

group over their lifetime. Lastly, we were not able to account for the process by which

ethnicity was ascertained in hospital records. It is likely that this process involves a

combination of self-reports, reports by relatives or carers (e.g. in the case of infirm patients,

or patients with language or other communication difficulties) or even guess work by

hospital staff (e.g. when there are language barriers, or in clinical emergencies) 16

.

The 2001 Census has expanded the previous ONS classification of ethnicity (originally

containing 10-groups) to the 16-group classification also used in this survey 2, a change

which was subsequently reflected into HES coding. The impact of this secular change may

explain some of the discordance observed. For example, the discordance in self-reported

Irish White ethnicity may reflect the fact that this ethnic group was only included in the 2001

classification. The 2011 Census included two new ethnic groups; ‘Gypsy or Irish Traveller’

and ‘Arab’24

. It remains to be seen if this change will be reflected into the routinely collected

health data.

Page 18 of 30

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 20: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

15

Whilst this study considers the accuracy of recorded ethnicity, there are issues with defining

ethnicity using any simple classification, for example the potential for ‘concealed

heterogeneity’ within each of the ethnic groups 16

. It is possible for example that within the

Indian ethnic group certain social, linguistic or religious subgroups which are more or less

likely to have a discordant ethnicity. Indeed evidence indicates that ethnic minority patients

with discordant ethnicity information may be systematically different to other patients from

the same ethnic group with concordant ethnicity 20

. Within-ethnic group heterogeneity is

complex issue inherent in any type of ethnicity research, and the nature of our study did not

allow us to address this.

Another issue we have not addressed in this paper is the completeness of ethnicity

information, which has the potential to bias estimates of ethnic variation in addition to the

misclassification detailed here. We performed similar analysis to those discussed here for

predictors of missing ethnicity (not shown). We found only small variations by self-reported

ethnic group but a much larger degree of variation between hospitals than that seen for

discordance implying the issues with missing ethnicity is primarily driven by hospital level

processes.

Our findings have implications for policy and future research. First, although currently HES

data have a high level of completeness of ethnic information, a degree of caution is required

when interpreting evidence on ethnic inequalities in care quality or disease incidence and

prevalence that is based solely on HES data, particularly for minority ethnic groups found to

have higher discordance rates. Misclassification of ethnicity in HES data could result in either

under-estimation of ethnic variation, or inability to detect such variation when it exists (‘type

1’ error).

Second, we provide a list of probabilities that can be used to improve estimates of ethnic

variation in health care in a UK setting 14,22

. These probabilities can be used both to improve

Page 19 of 30

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 21: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

16

the precision of prevalence estimates and in statistical models where hospital record ethnic

group is a predictor 14

.

Third, as the completeness of ethnicity information in hospital records is currently high,

more attention needs to be given to the accuracy of recorded information. The substantial

variation between hospitals in the accuracy of ethnicity information indicates that there is

great potential for improving the quality of ethnicity information in poorly performing

hospitals. Improvement in the quality of HES data (which is generally desirable 25

) should

also encompass improvements in the quality of ethnicity coding. Future research should

explore optimal ways for efficiently obtaining current self-reported information on ethnicity

in patient records.

Page 20 of 30

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 22: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

17

Contributors: All authors had substantial input into the study concept, design, analysis and

access to the data. GL had the original idea for the study which was further conceptualised in

discussions between all authors; methods were principally developed by CLS and further

amplified by GAA. All authors communicated frequently during the course of the project,

including through face-to-face meetings. CLS and GL wrote the first draft which was edited by

all authors over multiple versions. All authors saw and approved the final manuscript.

Competing interests: None.

Acknowledgement: We thank the UK Data Archive for access to the anonymous survey data

(UKDA study number: 69488), the Department of Health as the depositor and principal

investigator of the Cancer Patient Experience Survey 2010, Quality Health as data collector;

and all NHS Acute Trusts in England, for provision of data samples.

Ethics approval: Not required.

Funding statement: The paper is independent research arising from a Post-Doctoral

Fellowship award to GL supported by the National Institute for Health Research (PDF-2011-04-

047). The views expressed in this publication are those of the authors and not necessarily

those of the NHS, the National Institute for Health Research or the Department of Health.

Data sharing: Not applicable (the data are available for academic research from the UK Data

Archive – see acknowledgement).

STROBE: Checklist for an observational study: Attached separately.

Page 21 of 30

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 23: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

18

REFERENCES

1. US Department of Health and Human Services Agency for Healthcare Research and Quality 2009 National

Healthcare Disparities and Quality Reports. (2009).at <http://www.ahrq.gov/qual/qrdr09.htm>

2. Department of Health Equity and excellence: liberating the NHS (White Paper). (2010).at

<http://www.dh.gov.uk/en/Publicationsandstatistics/Publications/PublicationsPolicyAndGuidance/DH_11735

3>

3. Cancer Incidence and Survival By Major Ethnic Group, England, 2002 - 2006. (2009).at

<http://www.ncin.org.uk/publications/reports/default.aspx>

4. Mindell, J., Klodawski, E. & Fitzpatrick, J. Using routine data to measure ethnic differentials in access to

coronary revascularization. Journal of public health (Oxford, England) 30, 45–53 (2008).

5. Mir, G. et al. Principles for research on ethnicity and health: the Leeds Consensus Statement. European

journal of public health cks028– (2012).doi:10.1093/eurpub/cks028

6. The NHS Information Centre How good is HES ethnic coding and where do the problems lie? (2011).at

<http://www.hesonline.nhs.uk/Ease/servlet/ContentServer?siteID=1937&categoryID=171>

7. Szczepura, A. Access to health care for ethnic minority populations. Postgraduate medical journal 81, 141–7

(2005).

8. Jack, R. H., Linklater, K. M., Hofman, D., Fitzpatrick, J. & Møller, H. Ethnicity coding in a regional cancer

registry and in Hospital Episode Statistics. BMC public health 6, 281 (2006).

9. Isajiw, W. W. Definition and Dimensions of Ethnicity: a theoretical framework. Challenges of Measuring an

Ethnic World: Science, politics and reality: Proceedings of the Joint Canada-United States Conference on the

Measurement of Ethnicity April 1-3, 1992, 407–27at

<https://tspace.library.utoronto.ca/bitstream/1807/68/2/Def_DimofEthnicity.pdf>

10. Lyratzopoulos, G., McElduff, P., Heller, R. F., Hanily, M. & Lewis, P. S. Comparative levels and time trends in

blood pressure, total cholesterol, body mass index and smoking among Caucasian and South-Asian

participants of a UK primary-care based cardiovascular risk factor screening programme. BMC public health 5,

125 (2005).

11. Shah, B. R. et al. Surname lists to identify South Asian and Chinese ethnicity from secondary data in Ontario,

Canada: a validation study. BMC medical research methodology 10, 42 (2010).

12. Maringe, C. et al. Cancer incidence in South Asian migrants to England, 1986-2004: Unraveling ethnic from

socioeconomic differentials. International journal of cancer. Journal international du cancer 132, 1886–94

(2013).

13. Quan, H. et al. Development and validation of a surname list to define Chinese ethnicity. Medical care 44,

328–33 (2006).

14. Elliott, M. N., Fremont, A., Morrison, P. A., Pantoja, P. & Lurie, N. A new method for estimating race/ethnicity

and associated disparities where administrative records lack self-reported race/ethnicity. Health services

research 43, 1722–36 (2008).

Page 22 of 30

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 24: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

19

15. Nitsch, D. et al. Validation and utility of a computerized South Asian names and group recognition algorithm

in ascertaining South Asian ethnicity in the national renal registry. QJM : monthly journal of the Association of

Physicians 102, 865–72 (2009).

16. Aspinall, P. J. The utility and validity for public health of ethnicity categorization in the 1991, 2001 and 2011

British Censuses. Public health 125, 680–7 (2011).

17. Mays, V. M., Ponce, N. A., Washington, D. L. & Cochran, S. D. Classification of race and ethnicity: implications

for public health. Annual review of public health 24, 83–110 (2003).

18. Arday, S. L., Arday, D. R., Monroe, S. & Zhang, J. HCFA’s racial and ethnic data: current accuracy and recent

improvements. Health care financing review 21, 107–16 (2000).

19. Lee, L. M., Lehman, J. S., Bindman, A. B. & Fleming, P. L. Validation of race/ethnicity and transmission mode in

the US HIV/AIDS reporting system. American journal of public health 93, 914–7 (2003).

20. Zaslavsky, A. M., Ayanian, J. Z. & Zaborski, L. B. The validity of race and ethnicity in enrollment data for

Medicare beneficiaries. Health services research 47, 1300–21 (2012).

21. Department of Health National Cancer Patient Experience Survey 2010 [computer file]. UKDA-SN-6742-1 .

Colchester, Essex: UK Data Archive [distributor] at <http://dx.doi.org/10.5255/UKDA-SN-6742-1 >

22. McCaffrey, D. F. & Elliott, M. N. Power of tests for a dichotomous independent variable measured with error.

Health services research 43, 1085–101 (2008).

23. Georghiou, T. & Thorlby, R. Quality of ethnicity coding in Hospital Episode Statistics: beyond completeness.

Paper presented at Healthcare Commission seminar “Measuring what matters: measuring ethnicity in health

data” 12/02/2007 . (2007).

24. Office of National Statistics 2011 Census. at <http://www.ons.gov.uk/ons/guide-

method/census/2011/index.html>

25. Spencer, S. A. & Davies, M. P. Hospital episode statistics: improving the quality and value of hospital data: a

national internet e-survey of hospital consultants. BMJ open 2, (2012).

Page 23 of 30

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 25: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

20

Figure 1. Survey responders and exclusions

67713

64118

63770

58721

Excluding patients with missing self-reported

ethnicity

Excluding patients with missing age, gender and

Index of Multiple Deprivation (IMD) score

Excluding patients with missing HES recorded

ethnicity

101773

surveys sent

67% response rate

Page 24 of 30

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 26: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

21

Total

respondants

Total discordant Crude OR of

discordance

joint p-

value

Adjusted OR joint p-

value

n % n %

Age

16-24 399 0.7 30 7.5 1.77 (1.28-2.46) <0.0001 0.9 (0.55-1.48) 0.54

25-34 950 1.6 93 9.8 2.37 (1.86-3.01) 0.94 (0.71-1.26)

35-44 3,139 5.3 231 7.4 1.73 (1.51-1.99) 1.02 (0.84-1.23)

45-54 7,763 13.2 484 6.2 1.45 (1.3-1.61) 1.12 (0.97-1.29)

55-64 15,375 26.2 743 4.8 1.11 (0.99-1.23) 1.11 (0.98-1.25)

65-74 18,366 31.3 805 4.4 reference reference

75-84 10,747 18.3 412 3.8 0.87 (0.76-0.99) 0.99 (0.86-1.14)

85+ 1,982 3.4 80 4.0 0.92 (0.71-1.18) 1.08 (0.83-1.42)

Gender

Men 27,395 46.7 1,268 4.6 reference 0.014 reference 0.69

Women 31,326 53.3 1,610 5.1 1.12 (1.02-1.22) 1.02 (0.93-1.12)

Deprivation

Lowest 13,301 22.7 488 3.7 reference <0.0001 reference 0.69

2 13,432 22.9 509 3.8 1.03 (0.91-1.18) 1.04 (0.9-1.2)

3 12,498 21.3 559 4.5 1.23 (1.05-1.44) 0.99 (0.85-1.14)

4 10,756 18.3 652 6.1 1.69 (1.36-2.11) 1.03 (0.89-1.2)

Highest 8,734 14.9 670 7.7 2.18 (1.68-2.84) 0.94 (0.8-1.11)

Ethnic Group

British (White) 54,589 93.0 1,177 2.2 reference <0.0001 reference <0.0001

Irish (White) 938 1.6 490 52.2 49.63 (32.72-75.28) 47.73 (41-55.55)

Any other White background 1,066 1.8 485 45.5 37.88 (24.95-57.52) 36.49 (31.52-42.25)

White and Black Caribbean

(Mixed)

72 0.1 50 69.4 103.14 (55.73-190.86) 109.97 (64.65-187.04)

White and Black African (Mixed) 41 0.1 33 80.5 187.19 (90.23-388.35) 202.59 (90.28-454.64)

White and Asian (Mixed) 77 0.1 65 84.4 245.81 (125.89-

479.94)

283.62 (149.43-

538.33)

Any other Mixed background 49 0.1 43 87.8 325.22 (124.17-

851.84)

399.68 (166.26-

960.79)

Indian (Asian or Asian British) 516 0.9 101 19.6 11.04 (7.16-17.04) 8.06 (6.33-10.27)

Pakistani (Asian or Asian British) 206 0.4 46 22.3 13.05 (8.39-20.28) 10.26 (7.19-14.66)

Bangladeshi (Asian or Asian

British)

50 0.1 13 26.0 15.94 (7.44-34.18) 12.02 (6.17-23.41)

Any other Asian background 129 0.2 57 44.2 35.93 (20.28-63.63) 30.42 (20.94-44.2)

Caribbean (Black or Black British) 471 0.8 136 28.9 18.42 (12.64-26.85) 10.37 (8.21-13.1)

African (Black or Black British) 319 0.5 111 34.8 24.22 (16.97-34.56) 15.54 (11.9-20.28)

Any other Black background 10 0.0 9 90.0 408.42 (49.3-3383.53) 410.23 (49.73-

3384.32)

Chinese (other ethnic group) 124 0.2 30 24.2 14.48 (8.77-23.92) 10.66 (6.86-16.57)

Any other ethnic group 64 0.1 32 50.0 45.38 (27.55-74.75) 34.04 (20.05-57.79)

Hospital

OR 95% Reference Range 158 - 30.48 <0.0001 12.85 <0.0001

Table 1. Crude and adjusted predictors of discordant hospital record ethnicity coding

Page 25 of 30

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 27: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

22

White British

Irish

Other White

White & Black Caribbean

White & Black AfricanWhite & Asian

Other Mixed

IndianPakistani

Bangladeshi

Other Asian

Caribbean

African

Other Black

Chinese

Other

020

40

60

80

100

Sensitivity %

0 20 40 60 80 100Positive predictive value %

Figure 2. Sensitivity* and positive predictive value** of hospital record recorded ethnicity compared with

self-reported ethnicity as a gold-standard

*If a patient self-reports that they belong to a particular ethnic group, then the sensitivity of the hospital record

ethnicity coding is the probability that the hospital record will record the same (correct) ethnicity

**If a patient’s hospital record states that they belong to a particular ethnic group, then the positive predictive value of

the hospital record ethnicity code is the probability that this code has been recorded correctly and that the patient will

self-report the same ethnicity

Page 26 of 30

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 28: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

23

Probability (%) of true self-reported ethnic group for each HES code

HES code British

(White)

Irish

(White)

Any other

White

background

White and

Black

Caribbean

(Mixed)

White

and

Black

African

(Mixed)

White

and

Asian

(Mixed)

Any other

Mixed

background

Indian

(Asian

or Asian

British)

Pakistani

(Asian or

Asian

British)

Bangladeshi

(Asian or

Asian British)

Any other

Asian

background

Caribbean

(Black or

Black

British)

African

(Black

or Black

British)

Any other

Black

background

Chinese

(other

ethnic

group)

Any

other

ethnic

group

A 98.1 0.9 0.7 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.1 0.0 0.0 0.0 0.0

B 20.1 78.7 1.0 0.0 0.0 0.2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

C 53.1 1.2 42.1 0.2 0.1 0.5 0.7 0.2 0.0 0.0 0.5 0.1 0.3 0.1 0.2 0.8

D 11.2 0.0 2.8 38.3 0.0 0.0 2.8 0.0 0.0 0.0 0.0 41.1 2.8 0.9 0.0 0.0

E 12.2 0.0 6.1 4.1 26.5 0.0 8.2 0.0 0.0 0.0 0.0 8.2 34.7 0.0 0.0 0.0

F 25.0 0.0 2.8 0.0 0.0 40.3 2.8 8.3 1.4 1.4 13.9 1.4 0.0 0.0 1.4 1.4

G 30.6 0.9 10.8 4.5 4.5 7.2 12.6 4.5 0.9 0.9 3.6 5.4 2.7 0.9 4.5 5.4

H 1.9 0.0 0.4 0.0 0.1 0.6 0.1 88.4 2.8 0.4 4.6 0.2 0.1 0.1 0.0 0.1

J 2.3 0.0 0.8 0.0 0.0 0.5 0.3 5.9 87.7 1.0 1.0 0.0 0.0 0.0 0.0 0.5

K 2.1 0.0 0.0 0.0 0.0 1.1 1.1 5.3 2.1 82.1 4.2 0.0 1.1 0.0 1.1 0.0

L 4.4 0.0 4.2 0.2 0.5 3.7 3.0 21.5 7.6 0.9 41.0 0.5 0.2 0.7 3.5 8.1

M 4.2 0.0 0.1 3.6 0.5 0.0 0.6 0.0 0.0 0.0 0.4 84.8 3.6 2.0 0.0 0.1

N 5.0 0.2 1.4 0.6 2.6 0.0 0.6 0.0 0.0 0.0 0.2 6.6 81.3 1.4 0.0 0.2

P 3.1 0.0 0.9 3.6 2.7 0.9 1.8 0.4 0.0 0.0 1.3 38.1 42.6 4.0 0.4 0.0

R 4.2 0.0 0.8 0.0 0.0 0.8 0.0 0.4 0.0 0.0 4.2 0.0 0.0 0.0 83.2 6.3

S 43.5 0.6 17.3 0.4 0.9 1.6 1.8 3.5 1.1 0.6 9.1 2.5 3.4 0.9 1.6 11.3

Z 90.6 1.5 2.8 0.1 0.1 0.3 0.2 0.9 0.5 0.0 0.7 1.0 0.6 0.1 0.3 0.3

X or Missing 91.3 1.3 2.5 0.1 0.1 0.2 0.2 1.3 0.4 0.1 0.5 0.7 0.7 0.1 0.4 0.2

Table 2. The probability (%) of ‘true’ ethnic group for each hospital record-recorded ethnic group (i.e. the probability that a given person with a specific HES-recorded

ethnicity belongs to any (self-reported) group

Page 27 of 30

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. Protected by copyright. http://bmjopen.bmj.com/ BMJ Open: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. Downloaded from

Page 29: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

24

From 2001-02 onwards: A = British (White)

B = Irish (White)

C = Any other White background

D = White and Black Caribbean (Mixed)

E = White and Black African (Mixed)

F = White and Asian (Mixed)

G = Any other Mixed background

H = Indian (Asian or Asian British)

J = Pakistani (Asian or Asian British)

K = Bangladeshi (Asian or Asian British)

L = Any other Asian background

M = Caribbean (Black or Black British)

N = African (Black or Black British)

P = Any other Black background

R = Chinese (other ethnic group)

S = Any other ethnic group

Z = Not stated

X = Not known

From 1995-96 to 2000-01: 0= White

1 = Black - Caribbean

2 = Black - African

3 = Black - Other

4 = Indian

5 = Pakistani

6 = Bangladeshi

7 = Chinese

8 = Any other ethnic group

9 = Not given

X = Not known

Appendix table 1. NCPES questionnaire ethnicity question (left) and HES ethnicity codes (right)

Page 28 of 30

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 30: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

25

Sensitivity of

hospital-recorded

Information

PPV of

hospital-recorded

Information

Ethnicity in 2001 Census groups (16)

British (White) 97.8 (97.7-98) 98.1 (98-98.2)

Irish (White) 47.8 (44.5-51) 81.5 (77.9-84.6)

Any other White background 54.5 (51.5-57.5) 39.3 (36.8-41.8)

White and Black Caribbean (Mixed) 30.6 (20.2-42.5) 41.5 (28.1-55.9)

White and Black African (Mixed) 19.5 (8.8-34.9) 34.8 (16.4-57.3)

White and Asian (Mixed) 15.6 (8.3-25.6) 38.7 (21.8-57.8)

Any other Mixed background 12.2 (4.6-24.8) 12 (4.5-24.3)

Indian (Asian or Asian British) 80.4 (76.7-83.8) 89.1 (85.9-91.7)

Pakistani (Asian or Asian British) 77.7 (71.4-83.2) 86 (80.2-90.7)

Bangladeshi (Asian or Asian British) 74 (59.7-85.4) 80.4 (66.1-90.6)

Any other Asian background 55.8 (46.8-64.5) 38.3 (31.3-45.7)

Caribbean (Black or Black British) 71.1 (66.8-75.2) 85.2 (81.3-88.6)

African (Black or Black British) 65.2 (59.7-70.4) 80.3 (74.9-85)

Any other Black background 10 (0.3-44.5) 0.9 (0-5)

Chinese (other ethnic group) 75.8 (67.3-83) 87.9 (80.1-93.4)

Appendix table 2 Sensitivity and PPV. Estimates and 95% CI

Page 29 of 30

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 31: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

26

Total %

N discordant

(ONS6) %

Crude OR of

discordance

joint p-

value

Adjusted OR of

discordance

joint p-

value

Age

16-24 399 0.7 17 4.3 4.15 (2.47-6.96)

<0.0001

0.87 (0.43-1.79)

0.10

25-34 950 1.6 37 3.9 3.78 (2.63-5.42) 1.23 (0.77-1.94)

35-44 3,139 5.3 83 2.6 2.53 (1.99-3.22) 1.32 (0.96-1.82)

45-54 7,763 13.2 166 2.1 2.04 (1.68-2.46) 1.26 (0.98-1.62)

55-64 15,375 26.2 200 1.3 1.23 (1.05-1.44) 1.16 (0.93-1.46)

65-74 18,366 31.3 195 1.1 reference reference

75-84 10,747 18.3 86 0.8 0.75 (0.58-0.98) 0.84 (0.63-1.12)

85+ 1,982 3.4 12 0.6 0.57 (0.3-1.09) 0.73 (0.39-1.39)

Gender

Men 27,395 46.7 321 1.2 reference 0.0003

reference 0.49

Women 31,326 53.3 475 1.5 1.3 (1.13-1.49) 1.06 (0.9-1.26)

Deprivation

Lowest 13,301 22.7 119 0.9 reference

<0.0001

reference

0.30

2 13,432 22.9 120 0.9 1 (0.75-1.34) 1.13 (0.84-1.52)

3 12,498 21.3 143 1.1 1.28 (0.92-1.78) 1.23 (0.92-1.64)

4 10,756 18.3 191 1.8 2 (1.44-2.79) 1.36 (1.02-1.8)

Highest 8,734 14.9 223 2.6 2.9 (2.14-3.93) 1.27 (0.94-1.7)

Ethnic Group

White 56,593 96.4 332 0.6 baseline

<0.0001

baseline

<0.0001

Mixed 239 0.4 179 74.9 505.56 (333.91-765.45) 564.17 (399.47-796.78)

Asian 901 1.5 100 11.1 21.16 (14.5-30.88) 14.3 (10.97-18.65)

Black 800 1.4 123 15.4 30.79 (19.65-48.23) 17.51 (13.35-22.96)

Chinese 124 0.2 30 24.2 54.08 (32.35-90.42) 38.62 (24.39-61.14)

Other 64 0.1 32 50.0 169.46 (103.63-277.12) 109.64 (63.49-189.34)

Hospital

OR 95% Reference

Range 158 - 73.97 <0.0001 22.96 <0.0001

Appendix table 3. Crude and adjusted predictors of discordant hospital record ethnicity coding (based on 6

combined ethnicity groups)

Page 30 of 30

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 32: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

Accuracy of routinely recorded ethnic group information compared with self-reported ethnicity: Evidence from the

English Cancer Patient Experience survey

Journal: BMJ Open

Manuscript ID: bmjopen-2013-002882.R1

Article Type: Research

Date Submitted by the Author: 02-May-2013

Complete List of Authors: Saunders, Catherine; University of Cambridge, Abel, Gary; University of Cambridge, El Turabi, Anas; University of Cambridge,

Ahmed, Faraz; University of Cambridge, Lyratzopoulos, Georgios; University of Cambridge,

<b>Primary Subject Heading</b>:

Health services research

Secondary Subject Heading: Health informatics, Oncology

Keywords: Ethnicity, Hospital Records, Quality in health care < HEALTH SERVICES ADMINISTRATION & MANAGEMENT, Equality

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open on A

ugust 25, 2021 by guest. Protected by copyright.

http://bmjopen.bm

j.com/

BM

J Open: first published as 10.1136/bm

jopen-2013-002882 on 28 June 2013. Dow

nloaded from

Page 33: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

1

Accuracy of routinely recorded ethnic group information

compared with self-reported ethnicity: Evidence from the

English Cancer Patient Experience survey

Saunders CL (1)

Abel GA (1)

El Turabi A (1)

Ahmed F (1)

Lyratzopoulos G (1)

Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public

Health, Forvie Site, Robinson Way, Cambridge, CB2 0SR

Author for correspondence:

Catherine Saunders

Mail to: [email protected]

Word count: 295

Abstract: 335 words

Total main text: 2,915

Page 1 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 34: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

2

ABSTRACT

Objective: To describe the accuracy of ethnicity coding in contemporary NHS hospital

records compared to the ‘gold standard’ of self-reported ethnicity.

Design: Secondary analysis of data from a cross-sectional survey (2011)

Setting: All NHS hospitals in England providing cancer treatment

Participants: 58721 patients with cancer for whom ethnicity information [Office for

National Statistics (ONS) 2001 16-group classification] was available from self-reports

(considered to represent the ‘gold standard’) and their hospital record.

Methods: We calculated the sensitivity and positive predictive value (PPV) of hospital record

ethnicity. Further we used a logistic regression model to explore independent predictors of

discordance between recorded and self-reported ethnicity.

Results: Overall, 4.9% (4.7-5.1%) of people had their self-reported ethnic group incorrectly

recorded in their hospital records. Recorded White British ethnicity had high sensitivity

(97.8% (97.7-98.0%)) and PPV (98.1% (98.0-98.2%)) for self-reported White British ethnicity.

Recorded ethnicity information for the 15 other ethnic groups was substantially less

accurate with 41.2% (39.7-42.7%) incorrect. Recorded ‘Mixed’ ethnicity had low sensitivity

(12%-31%) and PPVs (12%-42%). Recorded ‘Indian’, ‘Chinese’, ‘Black-Caribbean’ and ‘Black

African’ ethnic groups had intermediate levels of sensitivity (65%-80%) and PPV (80%-89%

respectively). In multivariable analysis, belonging to an ethnic minority group was the only

independent predictor of discordant ethnicity information. There was strong evidence that

the degree of discordance of ethnicity information varied substantially between different

hospitals (p<0.0001).

Discussion: Current levels of accuracy of ethnicity information in NHS hospital records

support valid profiling of White / non-White ethnic differences. However profiling of ethnic

Page 2 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 35: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

3

differences in process or outcome measures for specific minority groups may contain a

substantial and variable degree of misclassification error. These considerations should be

taken into account when interpreting ethnic variation audits based on routine data and

inform initiatives aimed at improving the accuracy of ethnicity information in hospital

records.

Page 3 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 36: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

4

ARTICLE SUMMARY

Article focus

• Accurate recording of ethnicity in administrative health data is a prerequisite for efforts to

ensure equality or reduce inequalities in health care.

• This paper describes the accuracy of ethnicity coding in English NHS hospital records

compared to the ‘gold standard’ of self-reported ethnicity, and identifies areas where

improvement is needed.

Key messages

• Hospital records will usually code the ethnicity of patients with self-reported White British

ethnic group correctly, but levels of incorrect coding of the ethnicity of all ethnic minority

groups are high

• Belonging to an ethnic minority group is the only independent predictor of having an

incorrectly coded hospital record ethnicity; there is substantial variation in the quality of

coding between hospitals

• Probability look-up tables provided in this paper can be used for weighting of incidence or

prevalence estimates where hospital record ethnicity is being used, or in regression

analysis to improve estimation of ethnic variation in processes or outcomes of care

Strengths and limitations

• This was a unique opportunity to carry out an audit of the accuracy of hospital record

coding of ethnicity in England, using data from a large national survey of recently treated

patients.

Page 4 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 37: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

5

• We were not able to account for different processes by which ethnicity was

ascertained in hospital records and the study population was skewed towards older

ages which may limit the generalisability of the findings.

Page 5 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 38: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

6

BACKGROUND

Modern health care systems aspire to equal access and quality of care for patients of any

ethnic group 1 2

. In England, there are nevertheless well-documented ethnic inequalities in

processes and experiences of care 3 4

. Good practice in measuring these inequalities is

needed 5. Therefore, the availability of complete and valid information on patients’ ethnicity

in routine NHS data is a fundamental first step to enable equality audits to inform

improvement actions. Further, there are variations in the incidence and prevalence of some

conditions between different patient ethnic groups. Understanding such variations can be

vital for service planning and required capacity estimates. National Health Service hospitals

began to routinely record ethnicity information in their Patient Administration System (PAS)

records in the mid-1990s 6. (Patient Administration System records are a pre-cursor to

Hospital Episode Statistics (HES) data – hereafter we refer to hospital records as ‘HES

records’ for simplicity, as this is the more commonly used convention to denote patient

administrative data in the NHS). Although implementation of this measure has been slow 7,

HES data in recent years have high completeness of ethnic group information (typically

exceeding 90%) 6 8

. There is however little evidence about the accuracy of routinely recorded

information on patients’ ethnicity. In the past, audits of the quality of ethnicity information

in NHS records have principally focused on data completeness, rather than accuracy 6 and

completeness of ethnicity coding information currently forms part of the Commissioning

Outcomes Data Set 6, a prescribed standard for data quality that is linked to hospital

reimbursement.

Ethnicity can be classified by referring to a community of people who share the same

culture, and/or by referring to an ancestral population which comprises their self-identity 9.

Self-reported ethnicity captures both the shared experiences/culture of an individual and

Page 6 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 39: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

7

their self-identity. Methods such as surname recognition algorithms or geo-coding (or

combinations of both these approaches) have been developed to indirectly infer the

ethnicity of individuals10–15

, but both have limitations that do not apply to self-reported

ethnicity. For example, geo-coding methods rely on high levels of geographical (residential)

segregation between different ethnic groups. Surname recognition methods rely on low

levels of inter-ethnic marriages and the existence of distinctive surname nomenclatures

(which do not always exist for some ethnic groups, e.g. Black populations in the United

States). Further, by definition indirect methods will misclassify the ethnicity of some

patients, and disproportionately do so for the ethnicity of people from ethnic minorities. For

all the above reasons self-report is currently considered the gold standard measure of

ethnicity 16 17

.

Patient surveys, when linked to routine ethnicity data, provide opportunities to determine

the accuracy of ethnic group information contained in routine health system records. As this

linkage is rarely done, few studies have cross-examined the accuracy of ethnic group

information included in routine healthcare records against self-reported ethnicity

information. Those that have examined this have been conducted in the USA 18–20

.

Against this background and using data from an English national patient survey we aimed to

examine the overall accuracy of recorded versus self-reported ethnicity; and whether

discordance between recorded and self-reported ethnicity information varied between

patients of different ethnic groups.

Page 7 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 40: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

8

METHODS

Data

We used publicly available anonymous data from the 2010 Cancer Patient Experience Survey

(CPES) in England 21

. An unusual feature of this survey is that hospital recorded ethnicity was

collected when compiling the sample and linked to patient responses by the survey provider.

All patients treated for cancer in an English NHS hospital during the first quarter of 2010

were invited to participate in the survey, with a response rate of 67% 21

. Of 67713

respondents, 64418 (94.7%) provided valid self-reported ethnicity information. Because self-

reported ethnicity was used as the gold-standard against which the accuracy of HES-

recorded ethnicity was compared, data on respondents with missing self-reported ethnicity

were excluded from further analysis (see figure 1). Data on an additional 348 respondents

with missing information on deprivation status were also excluded from further analysis,

leaving 63,770 respondents. An additional 5049 records with missing HES-recorded ethnicity

were excluded, leaving 58721 records for analysis. For all respondents included in our

analyses completely observed data were available for patients’ HES-recorded age, gender

and socio-economic status (using Index of Multiple Deprivation (IMD) score of lower super

output area of residence). In both HES-records and patient reports, ethnicity was classified

using the same 16-group categorisation [Office of National Statistics (ONS16) 2001]

(Appendix 1).

Analysis

We first described the overall degree of discordance between HES-recorded and self-

reported information on ethnicity using the Cohen’s Kappa statistic. We further calculated

the sensitivity and the positive predictive value (PPV) of HES-recorded ethnicity in respect of

self-reported ethnicity. In this context, sensitivity denotes the proportion of patients with a

Page 8 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 41: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

9

given self-reported ethnicity with concordant ethnicity information in their HES record; and

PPV denotes the probability that a patient with a particular HES-recorded ethnic group will

self-report the same ethnic group.

Subsequently, we explored independent predictors of discordant ethnicity information by

constructing a multivariable logistic regression model with concordant/discordant ethnicity

status as the binary outcome variable and self-reported ethnicity as a covariate. Adjustment

was also made for other patient socio-demographics characteristics (age in 10 year age

groups, gender and postcode-linked area-based deprivation). For age and gender we used

HES-recorded information because the completeness of these variables among survey

respondents was higher than self-reported age and gender, and as the degree of

concordance between HES-recorded and self-reported age (based on year of birth) and

gender were very high (>99.5% for both).

To explore whether clustering of patients in some groups in hospitals with higher or lower

discordance levels could in part explain the findings, we subsequently repeated the

regression model described above including a random effect for hospital – in addition, this

model allows us to explore the degree of variation in the level of ethnicity information

discordance between different hospitals. Because less detailed classifications of ethnicity are

often used in health research, we also carried out supplementary analysis looking at

discordance when using 6 (as opposed to 16) ethnic groups (i.e. White, Mixed, Asian or Asian

British, Black or Black British, Chinese and Other).

Lastly, we constructed a probability ‘look-up’ table indicating the probability that HES-

recorded ethnicity represents true (self-reported) ethnicity for each of the 16 different

ethnic groups. These probabilities can be used for weighting of incidence or prevalence

estimates or used in regression analysis to improve estimation of ethnic variation in

processes or outcomes of care 14 22

. For the calculation of the probability `look-up’ table only,

Page 9 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 42: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

10

we combined data from two surveys (2010 and 2011/12) to improve the precision of the

probabilities presented (increasing our sample size to 133,204). STATA 11 was used for all

analyses.

Page 10 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 43: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

11

RESULTS

Overall the frequency of discordance of HES-recorded and self-reported ethnicity

information was 4.9% (4.7-5.1%). Patients who identified themselves as White British had

the lowest frequency of discordant ethnicity information in their HES records (2.2% (2.0-

2.3%)). In contrast, patients who identified themselves as belonging to any other ethnic

group had substantially higher frequency of discordant HES-recorded ethnicity (41.2% (39.7-

42.7%)). Cohen’s Kappa for HES-recorded and self-reported ethnicity was 0.64 overall and

0.54 if White British patients were excluded. The frequency of discordance was particularly

high for patients who self-reported that they belonged to the Any Other Black Background

(90% (97.4-100%)) and the Any Other Mixed Background (87.8% (78.2-97.3%)) groups (Table

1).

HES-recorded ‘White British’ ethnicity had high sensitivity 97.8 (97.7-98) and PPV 98.1 (98-

98.2) for self-reported White British ethnicity (Figure 2, estimates and confidence intervals in

appendix table 2 for the 6 and 16 group classification). In contrast, HES-recorded ‘Mixed’

ethnicity had very low sensitivities (12%-31%) and PPVs (12%-42%). HES-recorded ‘Indian’,

‘Pakistani’, ‘Bangladeshi’, ‘Chinese’, ‘Black-Caribbean’ and ‘Black African’ ethnicity had

intermediate levels of sensitivity (65% -80%) and PPV (80%- 89% respectively). HES-recorded

‘White Irish’ ethnicity had low sensitivity ( 47.8% (44.5-51.0%)) but high PPV (81.5% (77.9-

84.6%)). This means that of all individuals who self-identify themselves as White Irish only

48% would have their ethnic group recorded as such in their HES records; however, among

patients whose HES records indicate that they are ‘White Irish’, 82% would identify

themselves to belong to this group too.

Whilst crudely there was some evidence that age, gender and deprivation are associated

with discordance of ethnicity information, in multivariable logistic regression analysis

Page 11 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 44: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

12

adjusting for other patient characteristics the sole independent predictor of discordance was

self-reported ethnicity (Table 1). Repeating this model with a random effect for hospital

produced only trivial differences to associations of discordance with patient characteristics.

This means that the association between discordance and self-reported ethnic minority

group cannot be explained by ethnic minority patients attending hospitals that have overall

poor levels of accuracy of ethnicity information. There was however strong evidence

(p<0.0001) of variation in discordance between different hospitals (accuracy of coding of

ethnicity across hospitals ranged from 67-100%). Specifically, if all hospitals were to be

arranged in order of frequency of discordance of ethnicity information, and after accounting

for differences in the proportion of ethnic minority patients attending the hospital, the odds

ratio of discordance between the hospital in the 97.5th

and 2.5th

centiles (the 95% reference

range) will be about 13, indicating a 13-fold difference in the odds of discordance across

hospitals. After accounting for differences in the proportion of ethnic minority patients

attending each hospital the hospital at the bottom 2.5th

centile of coding accuracy had

concordant self-reported and hospital record recorded ethnicity codes in 90% of records.

Similar findings to those observed in the main analysis were observed when using a 6-group

classification (table 2) The total numbers of respondents with discordant ethnicity decreased

when using a 6-group classification to 796 subjects (1.4%) from 2878 (4.9%), indicating that

much of the discordance was within the cruder 6 group classification. However self-

reporting a non-White ethnic background remained as strongly associated with having an

incorrectly coded hospital-record ethnic group as for the 16 group classification. Details of

the ethnic groups to which people from each self-reported ethnic background were

misclassified to are given in table 3, showing the variation between and within the 6 and 16

group coding.

Page 12 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 45: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

13

People who self-report mixed ethnic backgrounds are particularly likely to have an

incorrectly coded ethnic group in their hospital records, with numbers high using both the 16

(79.9%) and 6 (74.9%) group classifications. Looking at table 3 we see that the self reported

ethnicity is falls into one of three groups - the concordant mixed category, white (either

British or Other white) or the corresponding ethnic minority category - over 90% of the time,

explaining why for this group the broader 6 category classification does not improve the

accuracy of hospital record ethnicity coding.

Finally, the probability table (Table 3) can be used in estimating ethnic group variations in

incidence, prevalence or measures care quality when using HES data. Examples of such

applications in the context of US health research have been reported previously 14 22

.

Page 13 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 46: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

14

DISCUSSION

Using data from a recent national survey of hospital patients with cancer we explored the

accuracy of ethnicity information in HES data. We found that overall the level of discordance

of ethnicity information between HES records and patient self-reports is low, particularly for

the majority White British patients. There is however a substantial degree of inaccuracy in

the recorded ethnicity of patients who self-report themselves as belonging to ethnic

minority groups. For many major ethnic groups (‘Indian’, ‘Pakistani’, ‘Bangladeshi’, ‘Chinese’,

‘Black-Caribbean’ and ‘Black African’) routine hospital data will mis-code between 20 and

35% of all patients who self-report that they belong to these ethnic groups (sensitivity 65% -

80%). Further, up to 20% of patients recorded as belonging to some major ethnic groups will

self-report that they actually belong to other ethnic groups (PPV 80%- 89% respectively). For

patients who self-report being of mixed ethnic groups HES records are usually discordant.

We provide probability tables that can be used in re-estimating ethnic group variations when

investigating ethnic variation in processes or quality of care from UK hospital records in

order to improve precision of such estimates.

The study explored a unique opportunity provided by patient survey data to explore the

accuracy of ethnicity information in UK routine health care data. Previous evidence on the

accuracy of ethnicity information in administrative datasets relates to US settings 18–20

. Our

findings concord with previous US literature which also indicates that the accuracy of

ethnicity information tends to be highest for the majority white population, lower for major

ethnic minority groups (like African American or Hispanic) and lowest for smaller ethnic

groups (such as American Indian/Alaskan Natives) and Mixed or Other racial/ethnic groups 18

19. Other strengths of the study include its large sample size, enabling the profiling of

discordance for small ethnic group; and the use of regression analyses to explore

Page 14 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 47: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

15

independent predictors of discordance and to examine variation between different

hospitals.

A limitation is that the study population included patients who attended hospital for cancer

treatment (most of whom are aged 65 years of age or older). Therefore in principle the

generalisability of the findings (particularly regarding younger patients) might be limited,

equally the need for accurate recording of ethnicity among cancer patients has been

identified 23

. We were also unable to explore potential inaccuracies in ethnicity

categorisation resulting from longitudinal person-level discordance among patients with

more than one hospital care episodes. Previous research indicates that up to 3% of patients

with more than one episode of care have longitudinally discordant ethnic group information

24. Evolving socio-cultural trends, or changes in Census methodology (such as the

introduction of Mixed ethnic groups to the UK census in 2001) could contribute to changes

in a person’s self-identified ethnic group over their lifetime. Lastly, we were not able to

account for the process by which ethnicity was ascertained in hospital records. It is likely

that this process involves a combination of self-reports, reports by relatives or carers (e.g. in

the case of infirm patients, or patients with language or other communication difficulties) or

even guess work by hospital staff (e.g. when there are language barriers, or in clinical

emergencies) 16

.

The 2001 Census has expanded the previous ONS classification of ethnicity (originally

containing 10-groups) to the 16-group classification also used in this survey 2, a change

which was subsequently reflected into HES coding. The impact of this secular change may

explain some of the discordance observed. For example, the discordance in self-reported

Irish White ethnicity may reflect the fact that this ethnic group was only included in the 2001

classification. The 2011 Census included two new ethnic groups; ‘Gypsy or Irish Traveller’

Page 15 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 48: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

16

and ‘Arab’25

. It remains to be seen if this change will be reflected into the routinely collected

health data.

Whilst this study considers the accuracy of recorded ethnicity, there are issues with defining

ethnicity using any simple classification, for example the potential for ‘concealed

heterogeneity’ within each of the ethnic groups 16

. It is possible for example that within the

Indian ethnic group certain social, linguistic or religious subgroups which are more or less

likely to have a discordant ethnicity. Indeed evidence indicates that ethnic minority patients

with discordant ethnicity information may be systematically different to other patients from

the same ethnic group with concordant ethnicity 20

. Within-ethnic group heterogeneity is

complex issue inherent in any type of ethnicity research, and the nature of our study did not

allow us to address this.

Another issue we have not addressed in this paper is the completeness of ethnicity

information, which has the potential to bias estimates of ethnic variation in addition to the

misclassification detailed here. We performed similar analysis to those discussed here for

predictors of missing ethnicity (not shown). We found only small variations by self-reported

ethnic group but a much larger degree of variation between hospitals than that seen for

discordance implying the issues with missing ethnicity is primarily driven by hospital level

processes.

Our findings have implications for policy and future research. First, although currently HES

data have a high level of completeness of ethnic information, and if the aim of any audit is to

compare outcomes between White and non-White groups the current classification system

will perform well but a degree of caution is required when interpreting more detailed

evidence on ethnic inequalities in care quality or disease incidence and prevalence that is

based solely on HES data, particularly for minority ethnic groups found to have higher

discordance rates. Misclassification of ethnicity in HES data could result in either under-

Page 16 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 49: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

17

estimation of ethnic variation, or inability to detect such variation when it exists (‘type 1’

error).

Second, we provide a list of probabilities that can be used to improve estimates of ethnic

variation in health care in a UK setting 14 22

. These probabilities can be used both to improve

the precision of prevalence estimates and in statistical models where hospital record ethnic

group is a predictor 14

.

Third, as the completeness of ethnicity information in hospital records is currently high,

more attention needs to be given to the accuracy of recorded information. The substantial

variation between hospitals in the accuracy of ethnicity information indicates that there is

great potential for improving the quality of ethnicity information in poorly performing

hospitals. Improvement in the quality of HES data (which is generally desirable 26

) should

also encompass improvements in the quality of ethnicity coding. Qualitative studies have

found willingness among ethnic minority groups to provide this information 27

and future

research should explore optimal ways for efficiently obtaining current self-reported

information on ethnicity in patient records.

Page 17 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 50: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

18

Contributors: All authors had substantial input into the study concept, design, analysis and

access to the data. GL had the original idea for the study which was further conceptualised in

discussions between all authors; methods were principally developed by CLS and further

amplified by GAA. All authors communicated frequently during the course of the project,

including through face-to-face meetings. CLS and GL wrote the first draft which was edited by

all authors over multiple versions. All authors saw and approved the final manuscript.

Competing interests: None.

Acknowledgement: We thank the UK Data Archive for access to the anonymous survey data

(UKDA study number: 69488), the Department of Health as the depositor and principal

investigator of the Cancer Patient Experience Survey 2010, Quality Health as data collector;

and all NHS Acute Trusts in England, for provision of data samples.

Ethics approval: Not required.

Funding statement: The paper is independent research arising from a Post-Doctoral

Fellowship award to GL supported by the National Institute for Health Research (PDF-2011-04-

047). The views expressed in this publication are those of the authors and not necessarily

those of the NHS, the National Institute for Health Research or the Department of Health.

Data sharing: Not applicable (the data are available for academic research from the UK Data

Archive – see acknowledgement).

STROBE: Checklist for an observational study: Attached separately.

Page 18 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 51: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

19

REFERENCES

1 US Department of Health and Human Services Agency for Healthcare Research and Quality. 2009 National

Healthcare Disparities and Quality Reports. 2009. http://www.ahrq.gov/qual/qrdr09.htm (accessed 29

Nov2012).

2 Department of Health. Equity and excellence: liberating the NHS (White Paper). 2010.

http://www.dh.gov.uk/en/Publicationsandstatistics/Publications/PublicationsPolicyAndGuidance/DH_117353

(accessed 29 Nov2012).

3 Cancer Incidence and Survival By Major Ethnic Group, England, 2002 - 2006. 2009.

http://www.ncin.org.uk/publications/reports/default.aspx

4 Mindell J, Klodawski E, Fitzpatrick J. Using routine data to measure ethnic differentials in access to coronary

revascularization. J Public Health 2008;30:45–53.

5 Mir G, Salway S, Kai J, et al. Principles for research on ethnicity and health: the Leeds Consensus Statement.

Eur J Public Health 2012;:cks028–.

6 The NHS Information Centre. How good is HES ethnic coding and where do the problems lie? 2011.

http://www.hesonline.nhs.uk/Ease/servlet/ContentServer?siteID=1937&categoryID=171 (accessed 7

Feb2013).

7 Szczepura A. Access to health care for ethnic minority populations. Postgrad Med J 2005;81:141–7.

8 Jack RH, Linklater KM, Hofman D, et al. Ethnicity coding in a regional cancer registry and in Hospital Episode

Statistics. BMC Public Health 2006;6:281.

9 Isajiw WW. Definition and Dimensions of Ethnicity: a theoretical framework. In: Challenges of Measuring an

Ethnic World: Science, politics and reality: Proceedings of the Joint Canada-United States Conference on the

Measurement of Ethnicity April 1-3, 1992,. Washington, D.C.: : U.S. Government Printing Office 407–27.

10 Lyratzopoulos G, McElduff P, Heller RF, et al. Comparative levels and time trends in blood pressure, total

cholesterol, body mass index and smoking among Caucasian and South-Asian participants of a UK primary-

care based cardiovascular risk factor screening programme. BMC Public Health 2005;5:125.

11 Shah BR, Chiu M, Amin S, et al. Surname lists to identify South Asian and Chinese ethnicity from secondary

data in Ontario, Canada: a validation study. BMC Med Res Methodol 2010;10:42.

12 Maringe C, Mangtani P, Rachet B, et al. Cancer incidence in South Asian migrants to England, 1986-2004:

Unraveling ethnic from socioeconomic differentials. Int J Cancer 2013;132:1886–94.

13 Quan H, Wang F, Schopflocher D, et al. Development and validation of a surname list to define Chinese

ethnicity. MedCare 2006;44:328–33.

14 Elliott MN, Fremont A, Morrison PA, et al. A new method for estimating race/ethnicity and associated

disparities where administrative records lack self-reported race/ethnicity. Health Serv Res 2008;43:1722–36.

15 Nitsch D, Kadalayil L, Mangtani P, et al. Validation and utility of a computerized South Asian names and group

recognition algorithm in ascertaining South Asian ethnicity in the national renal registry. QJM  2009;102:865–

72.

Page 19 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 52: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

20

16 Aspinall PJ. The utility and validity for public health of ethnicity categorization in the 1991, 2001 and 2011

British Censuses. Public Health 2011;125:680–7.

17 Mays VM, Ponce NA, Washington DL, et al. Classification of race and ethnicity: implications for public health.

Annu Rev Public Health 2003;24:83–110.

18 Arday SL, Arday DR, Monroe S, et al. HCFA’s racial and ethnic data: current accuracy and recent

improvements. Health Care Financ Rev 2000;21:107–16.

19 Lee LM, Lehman JS, Bindman AB, et al. Validation of race/ethnicity and transmission mode in the US HIV/AIDS

reporting system. Am JPublicHhealth 2003;93:914–7.

20 Zaslavsky AM, Ayanian JZ, Zaborski LB. The validity of race and ethnicity in enrollment data for Medicare

beneficiaries. Health Serv Res 2012;47:1300–21.

21 Department of Health. National Cancer Patient Experience Survey 2010 [computer file]. UKDA-SN-6742-1 .

Colchester, Essex: UK Data Archive [distributor]. http://dx.doi.org/10.5255/UKDA-SN-6742-1

22 McCaffrey DF, Elliott MN. Power of tests for a dichotomous independent variable measured with error.

Health Serv Res 2008;43:1085–101.

23 Iqbal G, Gumber A, Szczepura A, et al. Improving ethnic data collection for statistics of cancer incidence,

management, mortality and survival in the UK .

http://www2.warwick.ac.uk/fac/med/research/csri/ethnicityhealth/research/crc.pdf (accessed 25 Apr2013).

24 Georghiou T, Thorlby R. Quality of ethnicity coding in Hospital Episode Statistics: beyond completeness. Paper

presented at Healthcare Commission seminar “Measuring what matters: measuring ethnicity in health data”

12/02/2007 . 2007.

25 Office of National Statistics. 2011 Census. http://www.ons.gov.uk/ons/guide-

method/census/2011/index.html (accessed 8 Mar2013).

26 Spencer SA, Davies MP. Hospital episode statistics: improving the quality and value of hospital data: a

national internet e-survey of hospital consultants. BMJ Open 2012;2. doi:10.1136/bmjopen-2012-001651

27 Iqbal G, Johnson MR, Szczepura A, et al. UK ethnicity data collection for healthcare statistics: the South Asian

perspective. BMC Public Health 2012;12:243.

Page 20 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 53: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

21

Page 21 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 54: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

22

Figure legends:

Figure 1. Survey responders and exclusions

Figure 2. Sensitivity* and positive predictive value** of hospital record recorded ethnicity compared with self-

reported ethnicity as a gold-standard

*If a patient self-reports that they belong to a particular ethnic group, then the sensitivity of the hospital record ethnicity coding

is the probability that the hospital record will record the same (correct) ethnicity

**If a patient’s hospital record states that they belong to a particular ethnic group, then the positive predictive value of the

hospital record ethnicity code is the probability that this code has been recorded correctly and that the patient will self-report

the same ethnicity

Page 22 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 55: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

23

Total

respondents

Total discordant Crude OR of

discordance

joint p-

value

Adjusted OR joint p-

value

n % n %

Age

16-24 399 0.7 30 7.5 1.77 (1.28-2.46) <0.0001 0.9 (0.55-1.48) 0.54

25-34 950 1.6 93 9.8 2.37 (1.86-3.01) 0.94 (0.71-1.26)

35-44 3,139 5.3 231 7.4 1.73 (1.51-1.99) 1.02 (0.84-1.23)

45-54 7,763 13.2 484 6.2 1.45 (1.3-1.61) 1.12 (0.97-1.29)

55-64 15,375 26.2 743 4.8 1.11 (0.99-1.23) 1.11 (0.98-1.25)

65-74 18,366 31.3 805 4.4 reference reference

75-84 10,747 18.3 412 3.8 0.87 (0.76-0.99) 0.99 (0.86-1.14)

85+ 1,982 3.4 80 4.0 0.92 (0.71-1.18) 1.08 (0.83-1.42)

Gender

Men 27,395 46.7 1,268 4.6 reference 0.014 reference 0.69

Women 31,326 53.3 1,610 5.1 1.12 (1.02-1.22) 1.02 (0.93-1.12)

Deprivation

Lowest 13,301 22.7 488 3.7 reference <0.0001 reference 0.69

2 13,432 22.9 509 3.8 1.03 (0.91-1.18) 1.04 (0.9-1.2)

3 12,498 21.3 559 4.5 1.23 (1.05-1.44) 0.99 (0.85-1.14)

4 10,756 18.3 652 6.1 1.69 (1.36-2.11) 1.03 (0.89-1.2)

Highest 8,734 14.9 670 7.7 2.18 (1.68-2.84) 0.94 (0.8-1.11)

Ethnic Group

British (White) 54,589 93.0 1,177 2.2 reference <0.0001 reference <0.0001

Irish (White) 938 1.6 490 52.2 49.63 (32.72-75.28) 47.73 (41-55.55)

Any other White background 1,066 1.8 485 45.5 37.88 (24.95-57.52) 36.49 (31.52-42.25)

White and Black Caribbean

(Mixed)

72 0.1 50 69.4 103.14 (55.73-190.86) 109.97 (64.65-187.04)

White and Black African (Mixed) 41 0.1 33 80.5 187.19 (90.23-388.35) 202.59 (90.28-454.64)

White and Asian (Mixed) 77 0.1 65 84.4 245.81 (125.89-

479.94)

283.62 (149.43-

538.33)

Any other Mixed background 49 0.1 43 87.8 325.22 (124.17-

851.84)

399.68 (166.26-

960.79)

Indian (Asian or Asian British) 516 0.9 101 19.6 11.04 (7.16-17.04) 8.06 (6.33-10.27)

Pakistani (Asian or Asian British) 206 0.4 46 22.3 13.05 (8.39-20.28) 10.26 (7.19-14.66)

Bangladeshi (Asian or Asian

British)

50 0.1 13 26.0 15.94 (7.44-34.18) 12.02 (6.17-23.41)

Any other Asian background 129 0.2 57 44.2 35.93 (20.28-63.63) 30.42 (20.94-44.2)

Caribbean (Black or Black British) 471 0.8 136 28.9 18.42 (12.64-26.85) 10.37 (8.21-13.1)

African (Black or Black British) 319 0.5 111 34.8 24.22 (16.97-34.56) 15.54 (11.9-20.28)

Any other Black background 10 0.0 9 90.0 408.42 (49.3-3383.53) 410.23 (49.73-

3384.32)

Chinese (other ethnic group) 124 0.2 30 24.2 14.48 (8.77-23.92) 10.66 (6.86-16.57)

Any other ethnic group 64 0.1 32 50.0 45.38 (27.55-74.75) 34.04 (20.05-57.79)

Hospital

OR 95% Reference Range 158 - 30.48 <0.0001 12.85 <0.0001

Table 1. Crude and adjusted predictors of discordant hospital record ethnicity coding

Page 23 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 56: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

24

Total %

N discordant

(ONS6) %

Crude OR of

discordance

joint p-

value

Adjusted OR of

discordance

joint p-

value

Age

16-24 399 0.7 17 4.3 4.15 (2.47-6.96)

<0.0001

0.87 (0.43-1.79)

0.10

25-34 950 1.6 37 3.9 3.78 (2.63-5.42) 1.23 (0.77-1.94)

35-44 3,139 5.3 83 2.6 2.53 (1.99-3.22) 1.32 (0.96-1.82)

45-54 7,763 13.2 166 2.1 2.04 (1.68-2.46) 1.26 (0.98-1.62)

55-64 15,375 26.2 200 1.3 1.23 (1.05-1.44) 1.16 (0.93-1.46)

65-74 18,366 31.3 195 1.1 reference reference

75-84 10,747 18.3 86 0.8 0.75 (0.58-0.98) 0.84 (0.63-1.12)

85+ 1,982 3.4 12 0.6 0.57 (0.3-1.09) 0.73 (0.39-1.39)

Gender

Men 27,395 46.7 321 1.2 reference 0.0003

reference 0.49

Women 31,326 53.3 475 1.5 1.3 (1.13-1.49) 1.06 (0.9-1.26)

Deprivation

Lowest 13,301 22.7 119 0.9 reference

<0.0001

reference

0.30

2 13,432 22.9 120 0.9 1 (0.75-1.34) 1.13 (0.84-1.52)

3 12,498 21.3 143 1.1 1.28 (0.92-1.78) 1.23 (0.92-1.64)

4 10,756 18.3 191 1.8 2 (1.44-2.79) 1.36 (1.02-1.8)

Highest 8,734 14.9 223 2.6 2.9 (2.14-3.93) 1.27 (0.94-1.7)

Ethnic Group

White 56,593 96.4 332 0.6 baseline

<0.0001

baseline

<0.0001

Mixed 239 0.4 179 74.9 505.56 (333.91-765.45) 564.17 (399.47-796.78)

Asian 901 1.5 100 11.1 21.16 (14.5-30.88) 14.3 (10.97-18.65)

Black 800 1.4 123 15.4 30.79 (19.65-48.23) 17.51 (13.35-22.96)

Chinese 124 0.2 30 24.2 54.08 (32.35-90.42) 38.62 (24.39-61.14)

Other 64 0.1 32 50.0 169.46 (103.63-277.12) 109.64 (63.49-189.34)

Hospital

OR 95% Reference

Range 158 - 73.97 <0.0001 22.96 <0.0001

Table 2. Crude and adjusted predictors of discordant hospital record ethnicity coding (ONS 6-group classification)

Page 24 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 57: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

25

Page 25 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 58: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

26

Probability (%) of true self-reported ethnic group for each HES code

HES code British

(White)

Irish

(White)

Any other

White

background

White and

Black

Caribbean

(Mixed)

White and

Black

African

(Mixed)

White

and Asian

(Mixed)

Any other

Mixed

background

Indian

(Asian or

Asian

British)

Pakistani

(Asian or

Asian

British)

Bangladeshi

(Asian or Asian

British)

Any other

Asian

background

Caribbean

(Black or

Black British)

African

(Black or

Black

British)

Any other

Black

background

Chinese

(other

ethnic

group)

Any

other

ethnic

group

A = British (White) 98.1 0.9 0.7 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.1 0.0 0.0 0.0 0.0

B = Irish (White) 20.1 78.7 1.0 0.0 0.0 0.2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

C = Any other White background 53.1 1.2 42.1 0.2 0.1 0.5 0.7 0.2 0.0 0.0 0.5 0.1 0.3 0.1 0.2 0.8

D = White and Black Caribbean (Mixed) 11.2 0.0 2.8 38.3 0.0 0.0 2.8 0.0 0.0 0.0 0.0 41.1 2.8 0.9 0.0 0.0

E = White and Black African (Mixed) 12.2 0.0 6.1 4.1 26.5 0.0 8.2 0.0 0.0 0.0 0.0 8.2 34.7 0.0 0.0 0.0

F = White and Asian (Mixed) 25.0 0.0 2.8 0.0 0.0 40.3 2.8 8.3 1.4 1.4 13.9 1.4 0.0 0.0 1.4 1.4

G = Any other Mixed background 30.6 0.9 10.8 4.5 4.5 7.2 12.6 4.5 0.9 0.9 3.6 5.4 2.7 0.9 4.5 5.4

H = Indian (Asian or Asian British) 1.9 0.0 0.4 0.0 0.1 0.6 0.1 88.4 2.8 0.4 4.6 0.2 0.1 0.1 0.0 0.1

J = Pakistani (Asian or Asian British) 2.3 0.0 0.8 0.0 0.0 0.5 0.3 5.9 87.7 1.0 1.0 0.0 0.0 0.0 0.0 0.5

K = Bangladeshi (Asian or Asian British) 2.1 0.0 0.0 0.0 0.0 1.1 1.1 5.3 2.1 82.1 4.2 0.0 1.1 0.0 1.1 0.0

L = Any other Asian background 4.4 0.0 4.2 0.2 0.5 3.7 3.0 21.5 7.6 0.9 41.0 0.5 0.2 0.7 3.5 8.1

M = Caribbean (Black or Black British) 4.2 0.0 0.1 3.6 0.5 0.0 0.6 0.0 0.0 0.0 0.4 84.8 3.6 2.0 0.0 0.1

N = African (Black or Black British) 5.0 0.2 1.4 0.6 2.6 0.0 0.6 0.0 0.0 0.0 0.2 6.6 81.3 1.4 0.0 0.2

P = Any other Black background 3.1 0.0 0.9 3.6 2.7 0.9 1.8 0.4 0.0 0.0 1.3 38.1 42.6 4.0 0.4 0.0

R = Chinese (other ethnic group) 4.2 0.0 0.8 0.0 0.0 0.8 0.0 0.4 0.0 0.0 4.2 0.0 0.0 0.0 83.2 6.3

S = Any other ethnic group 43.5 0.6 17.3 0.4 0.9 1.6 1.8 3.5 1.1 0.6 9.1 2.5 3.4 0.9 1.6 11.3

Z = Not stated 90.6 1.5 2.8 0.1 0.1 0.3 0.2 0.9 0.5 0.0 0.7 1.0 0.6 0.1 0.3 0.3

X or Missing 91.3 1.3 2.5 0.1 0.1 0.2 0.2 1.3 0.4 0.1 0.5 0.7 0.7 0.1 0.4 0.2

Table 3. The probability (%) of ‘true’ ethnic group for each hospital record-recorded ethnic group (i.e. the probability that a given person with a specific HES-recorded

ethnicity belongs to any (self-reported) group

Page 26 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. Protected by copyright. http://bmjopen.bmj.com/ BMJ Open: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. Downloaded from

Page 59: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

27

Page 27 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 60: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

1

Accuracy of routinely recorded ethnic group information

compared with self-reported ethnicity: Evidence from the

English Cancer Patient Experience survey

Saunders CL (1)

Abel GA (1)

El Turabi A (1)

Ahmed F (1)

Lyratzopoulos G (1)

Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public

Health, Forvie Site, Robinson Way, Cambridge, CB2 0SR

Author for correspondence:

Catherine Saunders

Mail to: [email protected]

Word count: 295

Abstract: 335 words

Total main text: 2,621,915

Page 28 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 61: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

2

ABSTRACT

Objective: To describe the accuracy of ethnicity coding in contemporary NHS hospital

records compared to the ‘gold standard’ of self-reported ethnicity.

Design: Secondary analysis of data from a cross-sectional survey (2011)

Setting: All NHS hospitals in England providing cancer treatment

Participants: 58721 patients with cancer for whom ethnicity information [Office for

National Statistics (ONS) 2001 16-group classification] was available from self-reports

(considered to represent the ‘gold standard’) and their hospital record.

Methods: We calculated the sensitivity and positive predictive value (PPV) of hospital record

ethnicity. Further we used a logistic regression model to explore independent predictors of

discordance between recorded and self-reported ethnicity.

Results: Overall, 4.9% (4.7-5.1%) of people had their self-reported ethnic group incorrectly

recorded in their hospital records. Recorded White British ethnicity had high sensitivity

(97.8% (97.7-98.0%)) and PPV (98.1% (98.0-98.2%)) for self-reported White British ethnicity.

Recorded ethnicity information for the 15 other ethnic groups was substantially less

accurate with 41.2% (39.7-42.7%) incorrect. Recorded ‘Mixed’ ethnicity had low sensitivity

(12%-31%) and PPVs (12%-42%). Recorded ‘Indian’, ‘Chinese’, ‘Black-Caribbean’ and ‘Black

African’ ethnic groups had intermediate levels of sensitivity (65%-80%) and PPV (80%-89%

respectively). In multivariable analysis, belonging to an ethnic minority group was the only

independent predictor of discordant ethnicity information. There was strong evidence that

the degree of discordance of ethnicity information varied substantially between different

hospitals (p<0.0001).

Discussion: Current levels of accuracy of ethnicity information in NHS hospital records

support valid profiling of White / non-White ethnic differences. However profiling of ethnic

Page 29 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 62: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

3

differences in process or outcome measures for specific minority groups may contain a

substantial and variable degree of misclassification error. These considerations should be

taken into account when interpreting ethnic variation audits based on routine data and

inform initiatives aimed at improving the accuracy of ethnicity information in hospital

records.

Page 30 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 63: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

4

ARTICLE SUMMARY

Article focus

• Accurate recording of ethnicity in administrative health data is a prerequisite for efforts to

ensure equality or reduce inequalities in health care.

• This paper describes the accuracy of ethnicity coding in English NHS hospital records

compared to the ‘gold standard’ of self-reported ethnicity, and identifies areas where

improvement is needed.

Key messages

• Hospital records will usually code the ethnicity of patients with self-reported White British

ethnic group correctly, but levels of incorrect coding of the ethnicity of all ethnic minority

groups are high

• Belonging to an ethnic minority group is the only independent predictor of having an

incorrectly coded hospital record ethnicity; there is substantial variation in the quality of

coding between hospitals

• Probability look-up tables provided in this paper can be used for weighting of incidence or

prevalence estimates where hospital record ethnicity is being used, or in regression

analysis to improve estimation of ethnic variation in processes or outcomes of care

Strengths and limitations

• This was a unique opportunity to carry out an audit of the accuracy of hospital record

coding of ethnicity in England, using data from a large national survey of recently treated

patients.

Page 31 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 64: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

5

• We were not able to account for different processes by which ethnicity was

ascertained in hospital records and the study population was skewed towards older

ages which may limit the generalisability of the findings.

Page 32 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 65: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

6

BACKGROUND

Modern health care systems aspire to equal access and quality of care for patients of any

ethnic group 1 2

. In England, there are nevertheless well-documented ethnic inequalities in

processes and experiences of care 3 4

. Good practice in measuring these inequalities is

needed 5. Therefore, the availability of complete and valid information on patients’ ethnicity

in routine NHS data is a fundamental first step to enable equality audits to inform

improvement actions. Further, there are variations in the incidence and prevalence of some

conditions between different patient ethnic groups. Understanding such variations can be

vital for service planning and required capacity estimates. National Health Service hospitals

began to routinely record ethnicity information in their Patient Administration System (PAS)

records in the mid-1990s 6. (Patient Administration System records are a pre-cursor to

Hospital Episode Statistics (HES) data – hereafter we refer to hospital records as ‘HES

records’ for simplicity, as this is the more commonly used convention to denote patient

administrative data in the NHS). Although implementation of this measure has been slow 7,

HES data in recent years have high completeness of ethnic group information (typically

exceeding 90%) 6 8

. There is however little evidence about the accuracy of routinely recorded

information on patients’ ethnicity. In the past, audits of the quality of ethnicity information

in NHS records have principally focused on data completeness, rather than accuracy 6 and

completeness of ethnicity coding information currently forms part of the Commissioning

Outcomes Data Set 6, a prescribed standard for data quality that is linked to hospital

reimbursement.

Page 33 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 66: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

7

Ethnicity can be classified by referring to a community of people who share the same

culture, and/or by referring to an ancestral population which comprises their self-identity 9.

Self-reported ethnicity captures both the shared experiences/culture of an individual and

their self-identity. Methods such as surname recognition algorithms or geo-coding (or

combinations of both these approaches) have been developed to indirectly infer the

ethnicity of individuals10–15

, but both have limitations that do not apply to self-reported

ethnicity. For example, geo-coding methods rely on high levels of geographical (residential)

segregation between different ethnic groups. Surname recognition methods rely on low

levels of inter-communalethnic marriages and the existence of distinctive surname

nomenclatures (which do not always exist for some ethnic groups, e.g. Black populations in

the United States). Further, by definition indirect methods will misclassify the ethnicity of

some patients, and disproportionately do so for the ethnicity of people from ethnic

minorities. For all the above reasons self-report is currently considered the gold standard

measure of ethnicity 16 17

.

Patient surveys, when linked to routine ethnicity data, provide opportunities to determine

the accuracy of ethnic group information contained in routine health system records. As this

linkage is rarely done, few studies have cross-examined the accuracy of ethnic group

information included in routine healthcare records against self-reported ethnicity

information. Those that have examined this have been conducted in the USA 18–20

.

Against this background and using data from an English national patient survey we aimed to

examine the overall accuracy of recorded versus self-reported ethnicity; and whether

discordance between recorded and self-reported ethnicity information varied between

patients of different ethnic groups.

Page 34 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 67: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

8

METHODS

Data

We used publicly available anonymous data from the 2010 Cancer Patient Experience Survey

(CPES) in England 21

. An unusual feature of this survey is that hospital recorded ethnicity was

collected when compiling the sample and linked to patient responses by the survey provider.

All patients treated for cancer in an English NHS hospital during the first quarter of 2010

were invited to participate in the survey, with a response rate of 67% 21

. Of 67713

respondents, 64418 (94.7%) provided valid self-reported ethnicity information. Because self-

reported ethnicity was used as the gold-standard against which the accuracy of HES-

recorded ethnicity was compared, data on respondents with missing self-reported ethnicity

were excluded from further analysis (see figure 1). Data on an additional 348 respondents

with missing information on deprivation status were also excluded from further analysis,

leaving 63,770 respondents. An additional 5049 records with missing HES-recorded ethnicity

were excluded, leaving 58721 records for analysis. For all respondents included in our

analyses completely observed data were available for patients’ HES-recorded age, gender

and socio-economic status (using Index of Multiple Deprivation (IMD) score of lower super

output area of residence). In both HES-records and patient reports, ethnicity was classified

using the same 16-group categorisation [Office of National Statistics (ONS16) 2001]

(Appendix 1).

Analysis

We first described the overall degree of discordance between HES-recorded and self-

reported information on ethnicity using the Cohen’s Kappa statistic. We further calculated

the sensitivity and the positive predictive value (PPV) of HES-recorded ethnicity in respect of

self-reported ethnicity. In this context, sensitivity denotes the proportion of patients with a

Page 35 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 68: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

9

given self-reported ethnicity with concordant ethnicity information in their HES record; and

PPV denotes the probability that a patient with a particular HES-recorded ethnic group will

self-report the same ethnic group.

Subsequently, we explored independent predictors of discordant ethnicity information by

constructing a multivariable logistic regression model with concordant/discordant ethnicity

status as the binary outcome variable and self-reported ethnicity as a covariate. Adjustment

was also made for other patient socio-demographics characteristics (age in 10 year age

groups, gender and postcode-linked area-based deprivationsocio-economic status). For age

and gender we used HES-recorded information because the completeness of these variables

among survey respondents was higher than self-reported age and gender, and as the degree

of concordance between HES-recorded and self-reported age (based on year of birth) and

gender were very high (>99.5% for both).

To explore whether clustering of patients in some groups in hospitals with higher or lower

discordance levels could in part explain the findings, we subsequently repeated the

regression model described above including a random effect for hospital – in addition, this

model allows us to explore the degree of variation in the level of ethnicity information

discordance between different hospitals. Because less detailed classifications of ethnicity are

often used in health research, we also carried out supplementary analysis looking at

discordance when using 6 (as opposed to 16) ethnic groups (i.e. White, Mixed, Asian or Asian

British, Black or Black British, Chinese and Other).

Lastly, we constructed a probability ‘look-up’ table indicating the probability that HES-

recorded ethnicity represents true (self-reported) ethnicity for each of the 16 different

ethnic groups. These probabilities can be used for weighting of incidence or prevalence

estimates or used in regression analysis to improve estimation of ethnic variation in

processes or outcomes of care 14 22

. For the calculation of the probability `look-up’ table only,

Formatted: Font: Not Bold, Not Italic

Page 36 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 69: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

10

we combined data from two surveys (2010 and 2011/12) to improve the precision of the

probabilities presented (increasing our sample size to 133,204). STATA 11 was used for all

analyses.

Page 37 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 70: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

11

RESULTS

Overall the frequency of discordance of HES-recorded and self-reported ethnicity

information was 4.9% (4.7-5.1%). Patients who identified themselves as White British had

the lowest frequency of discordant ethnicity information in their HES records (2.2% (2.0-

2.3%)). In contrast, patients who identified themselves as belonging to any other ethnic

group had substantially higher frequency of discordant HES-recorded ethnicity (41.2% (39.7-

42.7%)). Cohen’s Kappa for HES-recorded and self-reported ethnicity was 0.64 overall and

0.54 if White British patients were excluded. The frequency of discordance was particularly

high for patients who self-reported that they belonged to the Any Other Black Background

(90% (97.4-100%)) and the Any Other Mixed Background (87.8% (78.2-97.3%)) groups (Table

1).

HES-recorded ‘White British’ ethnicity had high sensitivity 97.8 (97.7-98) and PPV 98.1 (98-

98.2) for self-reported White British ethnicity (Figure 2, estimates and confidence intervals in

appendix table 2 for the 6 and 16 group classification). In contrast, HES-recorded ‘Mixed’

ethnicity had very low sensitivities (12%-31%) and PPVs (12%-42%). HES-recorded ‘Indian’,

‘Pakistani’, ‘Bangladeshi’, ‘Chinese’, ‘Black-Caribbean’ and ‘Black African’ ethnicity had

intermediate levels of sensitivity (65% -80%) and PPV (80%- 89% respectively). HES-recorded

‘White Irish’ ethnicity had low sensitivity ( 47.8% (44.5-51.0%)) but high PPV (81.5% (77.9-

84.6%)). This means that of all individuals who self-identify themselves as White Irish only

48% would have their ethnic group recorded as such in their HES records; however, among

patients whose HES records indicate that they are ‘White Irish’, 82% would identify

themselves to belong to this group too.

Whilst crudely there was some evidence that age, gender and deprivation are associated

with discordance of ethnicity information, in multivariable logistic regression analysis

Page 38 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 71: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

12

adjusting for other patient characteristics the sole independent predictor of discordance was

self-reported ethnicity (Table 1). Repeating this model with a random effect for hospital

produced only trivial differences to associations of discordance with patient characteristics.

This means that the association between discordance and self-reported ethnic minority

group cannot be explained by ethnic minority patients attending hospitals that have overall

poor levels of accuracy of ethnicity information. There was however strong evidence

(p<0.0001) of variation in discordance between different hospitals (accuracy of coding of

ethnicity across hospitals ranged from 67-100%). Specifically, if all hospitals were to be

arranged in order of frequency of discordance of ethnicity information, among their HES

records,and after accounting for differences in the proportion of ethnic minority patients

attending the hospital, the odds ratio of discordance between the hospital in the 97.5th

and

2.5th

centiles (the 95% reference range) will be about 13, indicating a 13-fold difference in

the odds of discordance across hospitals. After accounting for differences in the proportion

of ethnic minority patients attending each hospital the hospital at the bottom 2.5th

centile of

coding accuracy had concordant self-reported and hospital record recorded ethnicity codes

in 90% of records.

Similar findings to those observed in the main analysis were observed when using a 6-group

classification (Appendix 2table 2) indicating that a large degree of discordance is between

rather than within the 6-groups used in the cruder classification.The total numbers of

respondents with discordant ethnicity decreased when using a 6-group classification to 796

subjects (1.4%) from 2878 (4.9%), indicating that much of the discordance was within the

cruder 6 group classification. However self-reporting a non-White ethnic background

remained as strongly associated with having an incorrectly coded hospital-record ethnic

group as for the 16 group classification. Details of the ethnic groups to which people from

Formatted: Superscript

Page 39 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 72: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

13

each self-reported ethnic background were misclassified to are given in table 3, showing the

variation between and within the 6 and 16 group coding.

People who self-report mixed ethnic backgrounds are particularly likely to have an

incorrectly coded ethnic group in their hospital records, with numbers high using both the 16

(79.9%) and 6 (74.9%) group classifications. Looking at table 3 we see that the self reported

ethnicity is falls into one of three groups - the concordant mixed category, white (either

British or Other white) or the corresponding ethnic minority category - over 90% of the time,

explaining why for this group the broader 6 category classification does not improve the

accuracy of hospital record ethnicity coding.

Finally, we providethe probability tables (Table 3)(Table 2) that can be used in estimating

ethnic group variations in incidence, prevalence or measures care quality when using HES

data. Examples of such applications in the context of US health research have been reported

previously 14 22

.

Formatted: Font: Not Bold, Not Italic

Formatted: Font: Not Bold, Not Italic

Formatted: Font: Not Bold, Not Italic

Formatted: Font: Not Bold, Not Italic

Page 40 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 73: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

14

DISCUSSION

Using data from a recent national survey of hospital patients with cancer we explored the

accuracy of ethnicity information in HES data. We found that overall the level of discordance

of ethnicity information between HES records and patient self-reports is low, particularly for

the majority White British patients. There is however a substantial degree of inaccuracy in

the recorded ethnicity of patients who self-report themselves as belonging to ethnic

minority groups. For many major ethnic groups (‘Indian’, ‘Pakistani’, ‘Bangladeshi’, ‘Chinese’,

‘Black-Caribbean’ and ‘Black African’) routine hospital data will mis-code between 20 and

35% of all patients who self-report that they belong to these ethnic groups (sensitivity 65% -

80%). Further, up to 20% of patients recorded as belonging to some major ethnic groups will

self-report that they actually belong to other ethnic groups (PPV 80%- 89% respectively). For

patients who self-report being of mixed ethnic groups HES records are usually discordant.

We provide probability tables that can be used in re-estimating ethnic group variations when

investigating ethnic variation in processes or quality of care from UK hospital records in

order to improve precision of such estimates.

The study explored a unique opportunity provided by patient survey data to explore the

accuracy of ethnicity information in UK routine health care data. Previous evidence on the

accuracy of ethnicity information in administrative datasets relates to US settings 18–20

. Our

findings concord with previous US literature which also indicates that the accuracy of

ethnicity information tends to be highest for the majority white population, lower for major

ethnic minority groups (like African American or Hispanic) and lowest for smaller ethnic

groups (such as American Indian/Alaskan Natives) and Mixed or Other racial/ethnic groups 18

19. Other strengths of the study include its large sample size, enabling the profiling of

discordance for small ethnic group; and the use of regression analyses to explore

Page 41 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 74: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

15

independent predictors of discordance and to examine variation between different

hospitals.

A limitation is that the study population included patients who attended hospital for cancer

treatment (most of whom are aged 65 years of age or older). Therefore in principle the

generalisability of the findings (particularly regarding younger patients) might be limited,

equally the need for accurate recording of ethnicity among cancer patients has been

identified 23

. We were also unable to explore potential inaccuracies in ethnicity

categorisation resulting from longitudinal person-level discordance among patients with

more than one hospital care episodes. Previous research indicates that up to 3% of patients

with more than one episode of care have longitudinally discordant ethnic group information

24. Evolving socio-cultural trends, or changes in Census methodology (such as the

introduction of Mixed ethnic groups to the UK census in 2001) could contribute to changes

in a person’s self-identified ethnic group over their lifetime. Lastly, we were not able to

account for the process by which ethnicity was ascertained in hospital records. It is likely

that this process involves a combination of self-reports, reports by relatives or carers (e.g. in

the case of infirm patients, or patients with language or other communication difficulties) or

even guess work by hospital staff (e.g. when there are language barriers, or in clinical

emergencies) 16

.

The 2001 Census has expanded the previous ONS classification of ethnicity (originally

containing 10-groups) to the 16-group classification also used in this survey 2, a change

which was subsequently reflected into HES coding. The impact of this secular change may

explain some of the discordance observed. For example, the discordance in self-reported

Irish White ethnicity may reflect the fact that this ethnic group was only included in the 2001

classification. The 2011 Census included two new ethnic groups; ‘Gypsy or Irish Traveller’

Page 42 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 75: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

16

and ‘Arab’25

. It remains to be seen if this change will be reflected into the routinely collected

health data.

Whilst this study considers the accuracy of recorded ethnicity, there are issues with defining

ethnicity using any simple classification, for example the potential for ‘concealed

heterogeneity’ within each of the ethnic groups 16

. It is possible for example that within the

Indian ethnic group certain social, linguistic or religious subgroups which are more or less

likely to have a discordant ethnicity. Indeed evidence indicates that ethnic minority patients

with discordant ethnicity information may be systematically different to other patients from

the same ethnic group with concordant ethnicity 20

. Within-ethnic group heterogeneity is

complex issue inherent in any type of ethnicity research, and the nature of our study did not

allow us to address this.

Another issue we have not addressed in this paper is the completeness of ethnicity

information, which has the potential to bias estimates of ethnic variation in addition to the

misclassification detailed here. We performed similar analysis to those discussed here for

predictors of missing ethnicity (not shown). We found only small variations by self-reported

ethnic group but a much larger degree of variation between hospitals than that seen for

discordance implying the issues with missing ethnicity is primarily driven by hospital level

processes.

Our findings have implications for policy and future research. First, although currently HES

data have a high level of completeness of ethnic information, and if the aim of any audit is to

compare outcomes between White and non-White groups the current classification system

will perform well but a degree of caution is required when interpreting more detailed

evidence on ethnic inequalities in care quality or disease incidence and prevalence that is

based solely on HES data, particularly for minority ethnic groups found to have higher

discordance rates. Misclassification of ethnicity in HES data could result in either under-

Page 43 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 76: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

17

estimation of ethnic variation, or inability to detect such variation when it exists (‘type 1’

error).

Second, we provide a list of probabilities that can be used to improve estimates of ethnic

variation in health care in a UK setting 14 22

. These probabilities can be used both to improve

the precision of prevalence estimates and in statistical models where hospital record ethnic

group is a predictor 14

.

Third, as the completeness of ethnicity information in hospital records is currently high,

more attention needs to be given to the accuracy of recorded information. The substantial

variation between hospitals in the accuracy of ethnicity information indicates that there is

great potential for improving the quality of ethnicity information in poorly performing

hospitals. Improvement in the quality of HES data (which is generally desirable 26

) should

also encompass improvements in the quality of ethnicity coding. Qualitative studies have

found willingness among ethnic minority groups to provide this information 27

F and future

research should explore optimal ways for efficiently obtaining current self-reported

information on ethnicity in patient records.

Page 44 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 77: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

18

Contributors: All authors had substantial input into the study concept, design, analysis and

access to the data. GL had the original idea for the study which was further conceptualised in

discussions between all authors; methods were principally developed by CLS and further

amplified by GAA. All authors communicated frequently during the course of the project,

including through face-to-face meetings. CLS and GL wrote the first draft which was edited by

all authors over multiple versions. All authors saw and approved the final manuscript.

Competing interests: None.

Acknowledgement: We thank the UK Data Archive for access to the anonymous survey data

(UKDA study number: 69488), the Department of Health as the depositor and principal

investigator of the Cancer Patient Experience Survey 2010, Quality Health as data collector;

and all NHS Acute Trusts in England, for provision of data samples.

Ethics approval: Not required.

Funding statement: The paper is independent research arising from a Post-Doctoral

Fellowship award to GL supported by the National Institute for Health Research (PDF-2011-04-

047). The views expressed in this publication are those of the authors and not necessarily

those of the NHS, the National Institute for Health Research or the Department of Health.

Data sharing: Not applicable (the data are available for academic research from the UK Data

Archive – see acknowledgement).

STROBE: Checklist for an observational study: Attached separately.

Page 45 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 78: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

19

REFERENCES

1 US Department of Health and Human Services Agency for Healthcare Research and Quality. 2009 National

Healthcare Disparities and Quality Reports. 2009. http://www.ahrq.gov/qual/qrdr09.htm (accessed 29

Nov2012).

2 Department of Health. Equity and excellence: liberating the NHS (White Paper). 2010.

http://www.dh.gov.uk/en/Publicationsandstatistics/Publications/PublicationsPolicyAndGuidance/DH_117353

(accessed 29 Nov2012).

3 Cancer Incidence and Survival By Major Ethnic Group, England, 2002 - 2006. 2009.

http://www.ncin.org.uk/publications/reports/default.aspx

4 Mindell J, Klodawski E, Fitzpatrick J. Using routine data to measure ethnic differentials in access to coronary

revascularization. Journal of public health (Oxford, England) Public Health 2008;30:45–53.

5 Mir G, Salway S, Kai J, et al. Principles for research on ethnicity and health: the Leeds Consensus Statement.

European journal of public health J Public Health 2012;:cks028–.

6 The NHS Information Centre. How good is HES ethnic coding and where do the problems lie? 2011.

http://www.hesonline.nhs.uk/Ease/servlet/ContentServer?siteID=1937&categoryID=171 (accessed 7

Feb2013).

7 Szczepura A. Access to health care for ethnic minority populations. Postgraduate medical journalPostgrad

Med J 2005;81:141–7.

8 Jack RH, Linklater KM, Hofman D, et al. Ethnicity coding in a regional cancer registry and in Hospital Episode

Statistics. BMC Ppublic Hhealth 2006;6:281.

9 Isajiw WW. Definition and Dimensions of Ethnicity: a theoretical framework. In: Challenges of Measuring an

Ethnic World: Science, politics and reality: Proceedings of the Joint Canada-United States Conference on the

Measurement of Ethnicity April 1-3, 1992,. Washington, D.C.: : U.S. Government Printing Office 407–27.

10 Lyratzopoulos G, McElduff P, Heller RF, et al. Comparative levels and time trends in blood pressure, total

cholesterol, body mass index and smoking among Caucasian and South-Asian participants of a UK primary-

care based cardiovascular risk factor screening programme. BMC pPublic Hhealth 2005;5:125.

11 Shah BR, Chiu M, Amin S, et al. Surname lists to identify South Asian and Chinese ethnicity from secondary

data in Ontario, Canada: a validation study. BMC medical Med rResearch Mmethodology 2010;10:42.

12 Maringe C, Mangtani P, Rachet B, et al. Cancer incidence in South Asian migrants to England, 1986-2004:

Unraveling ethnic from socioeconomic differentials. International journal of cancer Journal international du

cancerInt J Cancer 2013;132:1886–94.

13 Quan H, Wang F, Schopflocher D, et al. Development and validation of a surname list to define Chinese

ethnicity. Medical Ccare 2006;44:328–33.

14 Elliott MN, Fremont A, Morrison PA, et al. A new method for estimating race/ethnicity and associated

disparities where administrative records lack self-reported race/ethnicity. Health services Serv researchRes

2008;43:1722–36.

Page 46 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 79: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

20

15 Nitsch D, Kadalayil L, Mangtani P, et al. Validation and utility of a computerized South Asian names and group

recognition algorithm in ascertaining South Asian ethnicity in the national renal registry. QJM : monthly

journal of the Association of Physicians 2009;102:865–72.

16 Aspinall PJ. The utility and validity for public health of ethnicity categorization in the 1991, 2001 and 2011

British Censuses. Public healthPublic Health 2011;125:680–7.

17 Mays VM, Ponce NA, Washington DL, et al. Classification of race and ethnicity: implications for public health.

Annuual Rreview of pPublic hHealth 2003;24:83–110.

18 Arday SL, Arday DR, Monroe S, et al. HCFA’s racial and ethnic data: current accuracy and recent

improvements. Health Ccare Ffinancing Rreview 2000;21:107–16.

19 Lee LM, Lehman JS, Bindman AB, et al. Validation of race/ethnicity and transmission mode in the US HIV/AIDS

reporting system. American jJournal of pPublic Hhealth 2003;93:914–7.

20 Zaslavsky AM, Ayanian JZ, Zaborski LB. The validity of race and ethnicity in enrollment data for Medicare

beneficiaries. Health Sservices rResearch 2012;47:1300–21.

21 Department of Health. National Cancer Patient Experience Survey 2010 [computer file]. UKDA-SN-6742-1 .

Colchester, Essex: UK Data Archive [distributor]. http://dx.doi.org/10.5255/UKDA-SN-6742-1

22 McCaffrey DF, Elliott MN. Power of tests for a dichotomous independent variable measured with error.

Health Sservices Rresearch 2008;43:1085–101.

23 Iqbal G, Gumber A, Szczepura A, et al. Improving ethnic data collection for statistics of cancer incidence,

management, mortality and survival in the UK .

http://www2.warwick.ac.uk/fac/med/research/csri/ethnicityhealth/research/crc.pdf (accessed 25 Apr2013).

24 Georghiou T, Thorlby R. Quality of ethnicity coding in Hospital Episode Statistics: beyond completeness. Paper

presented at Healthcare Commission seminar “Measuring what matters: measuring ethnicity in health data”

12/02/2007 . 2007.

25 Office of National Statistics. 2011 Census. http://www.ons.gov.uk/ons/guide-

method/census/2011/index.html (accessed 8 Mar2013).

26 Spencer SA, Davies MP. Hospital episode statistics: improving the quality and value of hospital data: a

national internet e-survey of hospital consultants. BMJ Open 2012;2. doi:10.1136/bmjopen-2012-001651

27 Iqbal G, Johnson MR, Szczepura A, et al. UK ethnicity data collection for healthcare statistics: the South Asian

perspective. BMC Ppublic h Health 2012;12:243.

Page 47 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 80: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

21

Page 48 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 81: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

22

Figure 1. Survey responders and exclusions

67713

64118

63770

58721

Excluding patients with missing self-reported

ethnicity

Excluding patients with missing age, gender and

Index of Multiple Deprivation (IMD) score

Excluding patients with missing HES recorded

ethnicity

101773

surveys sent

67% response rate

Page 49 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 82: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

23

Total

respondeants

Total discordant Crude OR of

discordance

joint p-

value

Adjusted OR joint p-

value

n % n %

Age

16-24 399 0.7 30 7.5 1.77 (1.28-2.46) <0.0001 0.9 (0.55-1.48) 0.54

25-34 950 1.6 93 9.8 2.37 (1.86-3.01) 0.94 (0.71-1.26)

35-44 3,139 5.3 231 7.4 1.73 (1.51-1.99) 1.02 (0.84-1.23)

45-54 7,763 13.2 484 6.2 1.45 (1.3-1.61) 1.12 (0.97-1.29)

55-64 15,375 26.2 743 4.8 1.11 (0.99-1.23) 1.11 (0.98-1.25)

65-74 18,366 31.3 805 4.4 reference reference

75-84 10,747 18.3 412 3.8 0.87 (0.76-0.99) 0.99 (0.86-1.14)

85+ 1,982 3.4 80 4.0 0.92 (0.71-1.18) 1.08 (0.83-1.42)

Gender

Men 27,395 46.7 1,268 4.6 reference 0.014 reference 0.69

Women 31,326 53.3 1,610 5.1 1.12 (1.02-1.22) 1.02 (0.93-1.12)

Deprivation

Lowest 13,301 22.7 488 3.7 reference <0.0001 reference 0.69

2 13,432 22.9 509 3.8 1.03 (0.91-1.18) 1.04 (0.9-1.2)

3 12,498 21.3 559 4.5 1.23 (1.05-1.44) 0.99 (0.85-1.14)

4 10,756 18.3 652 6.1 1.69 (1.36-2.11) 1.03 (0.89-1.2)

Highest 8,734 14.9 670 7.7 2.18 (1.68-2.84) 0.94 (0.8-1.11)

Ethnic Group

British (White) 54,589 93.0 1,177 2.2 reference <0.0001 reference <0.0001

Irish (White) 938 1.6 490 52.2 49.63 (32.72-75.28) 47.73 (41-55.55)

Any other White background 1,066 1.8 485 45.5 37.88 (24.95-57.52) 36.49 (31.52-42.25)

White and Black Caribbean

(Mixed)

72 0.1 50 69.4 103.14 (55.73-190.86) 109.97 (64.65-187.04)

White and Black African (Mixed) 41 0.1 33 80.5 187.19 (90.23-388.35) 202.59 (90.28-454.64)

White and Asian (Mixed) 77 0.1 65 84.4 245.81 (125.89-

479.94)

283.62 (149.43-

538.33)

Any other Mixed background 49 0.1 43 87.8 325.22 (124.17-

851.84)

399.68 (166.26-

960.79)

Indian (Asian or Asian British) 516 0.9 101 19.6 11.04 (7.16-17.04) 8.06 (6.33-10.27)

Pakistani (Asian or Asian British) 206 0.4 46 22.3 13.05 (8.39-20.28) 10.26 (7.19-14.66)

Bangladeshi (Asian or Asian

British)

50 0.1 13 26.0 15.94 (7.44-34.18) 12.02 (6.17-23.41)

Any other Asian background 129 0.2 57 44.2 35.93 (20.28-63.63) 30.42 (20.94-44.2)

Caribbean (Black or Black British) 471 0.8 136 28.9 18.42 (12.64-26.85) 10.37 (8.21-13.1)

African (Black or Black British) 319 0.5 111 34.8 24.22 (16.97-34.56) 15.54 (11.9-20.28)

Any other Black background 10 0.0 9 90.0 408.42 (49.3-3383.53) 410.23 (49.73-

3384.32)

Chinese (other ethnic group) 124 0.2 30 24.2 14.48 (8.77-23.92) 10.66 (6.86-16.57)

Any other ethnic group 64 0.1 32 50.0 45.38 (27.55-74.75) 34.04 (20.05-57.79)

Hospital

OR 95% Reference Range 158 - 30.48 <0.0001 12.85 <0.0001

Table 1. Crude and adjusted predictors of discordant hospital record ethnicity coding

Page 50 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 83: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

24

Total %

N discordant

(ONS6) %

Crude OR of

discordance

joint p-

value

Adjusted OR of

discordance

joint p-

value

Age

16-24 399 0.7 17 4.3 4.15 (2.47-6.96)

<0.0001

0.87 (0.43-1.79)

0.10

25-34 950 1.6 37 3.9 3.78 (2.63-5.42) 1.23 (0.77-1.94)

35-44 3,139 5.3 83 2.6 2.53 (1.99-3.22) 1.32 (0.96-1.82)

45-54 7,763 13.2 166 2.1 2.04 (1.68-2.46) 1.26 (0.98-1.62)

55-64 15,375 26.2 200 1.3 1.23 (1.05-1.44) 1.16 (0.93-1.46)

65-74 18,366 31.3 195 1.1 reference reference

75-84 10,747 18.3 86 0.8 0.75 (0.58-0.98) 0.84 (0.63-1.12)

85+ 1,982 3.4 12 0.6 0.57 (0.3-1.09) 0.73 (0.39-1.39)

Gender

Men 27,395 46.7 321 1.2 reference 0.0003

reference 0.49

Women 31,326 53.3 475 1.5 1.3 (1.13-1.49) 1.06 (0.9-1.26)

Deprivation

Lowest 13,301 22.7 119 0.9 reference

<0.0001

reference

0.30

2 13,432 22.9 120 0.9 1 (0.75-1.34) 1.13 (0.84-1.52)

3 12,498 21.3 143 1.1 1.28 (0.92-1.78) 1.23 (0.92-1.64)

4 10,756 18.3 191 1.8 2 (1.44-2.79) 1.36 (1.02-1.8)

Highest 8,734 14.9 223 2.6 2.9 (2.14-3.93) 1.27 (0.94-1.7)

Ethnic Group

White 56,593 96.4 332 0.6 baseline

<0.0001

baseline

<0.0001

Mixed 239 0.4 179 74.9 505.56 (333.91-765.45) 564.17 (399.47-796.78)

Asian 901 1.5 100 11.1 21.16 (14.5-30.88) 14.3 (10.97-18.65)

Black 800 1.4 123 15.4 30.79 (19.65-48.23) 17.51 (13.35-22.96)

Chinese 124 0.2 30 24.2 54.08 (32.35-90.42) 38.62 (24.39-61.14)

Other 64 0.1 32 50.0 169.46 (103.63-277.12) 109.64 (63.49-189.34)

Hospital

OR 95% Reference

Range 158 - 73.97 <0.0001 22.96 <0.0001

Table 2. Crude and adjusted predictors of discordant hospital record ethnicity coding (ONS 6-group classification)

Page 51 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 84: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

25

White British

Irish

Other White

White & Black Caribbean

White & Black AfricanWhite & Asian

Other Mixed

IndianPakistani

Bangladeshi

Other Asian

Caribbean

African

Other Black

Chinese

Other

020

40

60

80

100

Sensitivity %

0 20 40 60 80 100Positive predictive value %

Figure 2. Sensitivity* and positive predictive value** of hospital record recorded ethnicity compared with

self-reported ethnicity as a gold-standard

*If a patient self-reports that they belong to a particular ethnic group, then the sensitivity of the hospital record

ethnicity coding is the probability that the hospital record will record the same (correct) ethnicity

**If a patient’s hospital record states that they belong to a particular ethnic group, then the positive predictive value of

the hospital record ethnicity code is the probability that this code has been recorded correctly and that the patient will

self-report the same ethnicity

Page 52 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 85: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only 26

Probability (%) of true self-reported ethnic group for each HES code

HES code British

(White)

Irish

(White)

Any other

White

background

White and

Black

Caribbean

(Mixed)

White and

Black

African

(Mixed)

White

and Asian

(Mixed)

Any other

Mixed

background

Indian

(Asian or

Asian

British)

Pakistani

(Asian or

Asian

British)

Bangladeshi

(Asian or Asian

British)

Any other

Asian

background

Caribbean

(Black or

Black British)

African

(Black or

Black

British)

Any other

Black

background

Chinese

(other

ethnic

group)

Any

other

ethnic

group

A = British (White)A 98.1 0.9 0.7 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.1 0.0 0.0 0.0 0.0

B = Irish (White) B 20.1 78.7 1.0 0.0 0.0 0.2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

C = Any other White backgroundC 53.1 1.2 42.1 0.2 0.1 0.5 0.7 0.2 0.0 0.0 0.5 0.1 0.3 0.1 0.2 0.8

D = White and Black Caribbean (Mixed) D 11.2 0.0 2.8 38.3 0.0 0.0 2.8 0.0 0.0 0.0 0.0 41.1 2.8 0.9 0.0 0.0

E = White and Black African (Mixed) E 12.2 0.0 6.1 4.1 26.5 0.0 8.2 0.0 0.0 0.0 0.0 8.2 34.7 0.0 0.0 0.0

F = White and Asian (Mixed) F 25.0 0.0 2.8 0.0 0.0 40.3 2.8 8.3 1.4 1.4 13.9 1.4 0.0 0.0 1.4 1.4

G = Any other Mixed background G 30.6 0.9 10.8 4.5 4.5 7.2 12.6 4.5 0.9 0.9 3.6 5.4 2.7 0.9 4.5 5.4

H = Indian (Asian or Asian British) H 1.9 0.0 0.4 0.0 0.1 0.6 0.1 88.4 2.8 0.4 4.6 0.2 0.1 0.1 0.0 0.1

J = Pakistani (Asian or Asian British) J 2.3 0.0 0.8 0.0 0.0 0.5 0.3 5.9 87.7 1.0 1.0 0.0 0.0 0.0 0.0 0.5

K = Bangladeshi (Asian or Asian British) K 2.1 0.0 0.0 0.0 0.0 1.1 1.1 5.3 2.1 82.1 4.2 0.0 1.1 0.0 1.1 0.0

L = Any other Asian background L 4.4 0.0 4.2 0.2 0.5 3.7 3.0 21.5 7.6 0.9 41.0 0.5 0.2 0.7 3.5 8.1

M = Caribbean (Black or Black British) M 4.2 0.0 0.1 3.6 0.5 0.0 0.6 0.0 0.0 0.0 0.4 84.8 3.6 2.0 0.0 0.1

N = African (Black or Black British) N 5.0 0.2 1.4 0.6 2.6 0.0 0.6 0.0 0.0 0.0 0.2 6.6 81.3 1.4 0.0 0.2

P = Any other Black backgroundP 3.1 0.0 0.9 3.6 2.7 0.9 1.8 0.4 0.0 0.0 1.3 38.1 42.6 4.0 0.4 0.0

R = Chinese (other ethnic group) R 4.2 0.0 0.8 0.0 0.0 0.8 0.0 0.4 0.0 0.0 4.2 0.0 0.0 0.0 83.2 6.3

S = Any other ethnic group S 43.5 0.6 17.3 0.4 0.9 1.6 1.8 3.5 1.1 0.6 9.1 2.5 3.4 0.9 1.6 11.3

Z = Not stated Z 90.6 1.5 2.8 0.1 0.1 0.3 0.2 0.9 0.5 0.0 0.7 1.0 0.6 0.1 0.3 0.3

X or Missing 91.3 1.3 2.5 0.1 0.1 0.2 0.2 1.3 0.4 0.1 0.5 0.7 0.7 0.1 0.4 0.2

Table 32. The probability (%) of ‘true’ ethnic group for each hospital record-recorded ethnic group (i.e. the probability that a given person with a specific HES-recorded

ethnicity belongs to any (self-reported) group

Formatted Table ... [1]

Formatted ... [2]

Formatted ... [3]

Formatted ... [4]

Formatted ... [5]

Formatted Table ... [6]

Formatted ... [7]

Formatted ... [8]

Formatted ... [9]

Formatted ... [10]

Formatted Table ... [11]

Formatted ... [12]

Formatted ... [13]

Formatted ... [14]

Formatted ... [15]

Formatted Table ... [16]

Formatted ... [17]

Formatted ... [18]

Formatted ... [19]

Formatted ... [20]

Formatted Table ... [21]

Formatted ... [22]

Formatted ... [23]

Formatted ... [24]

Formatted ... [25]

Formatted Table ... [26]

Formatted ... [27]

Formatted ... [28]

Formatted ... [29]

Formatted ... [30]

Formatted Table ... [31]

Formatted ... [32]

Formatted ... [33]

Formatted ... [34]

Formatted ... [35]

Formatted Table ... [36]

Formatted ... [37]

Formatted ... [38]

Formatted ... [39]

Formatted ... [40]

Formatted Table ... [41]

Formatted ... [42]

Formatted ... [43]

Formatted ... [44]

Formatted Table ... [45]

Page 53 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. Protected by copyright. http://bmjopen.bmj.com/ BMJ Open: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. Downloaded from

Page 86: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

27

From 2001-02 onwards: A = British (White)

B = Irish (White)

C = Any other White background

D = White and Black Caribbean (Mixed)

E = White and Black African (Mixed)

F = White and Asian (Mixed)

G = Any other Mixed background

H = Indian (Asian or Asian British)

J = Pakistani (Asian or Asian British)

K = Bangladeshi (Asian or Asian British)

L = Any other Asian background

M = Caribbean (Black or Black British)

N = African (Black or Black British)

P = Any other Black background

R = Chinese (other ethnic group)

S = Any other ethnic group

Z = Not stated

X = Not known

From 1995-96 to 2000-01: 0= White

1 = Black - Caribbean

2 = Black - African

3 = Black - Other

4 = Indian

5 = Pakistani

6 = Bangladeshi

7 = Chinese

8 = Any other ethnic group

9 = Not given

X = Not known

Appendix table 1. NCPES questionnaire ethnicity question (left) and HES ethnicity codes (right)

Page 54 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 87: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

28

Sensitivity of

hospital-recorded

Information

PPV of

hospital-recorded

Information

Ethnicity in 2001 Census groups (16)

British (White) 97.8 (97.7-98) 98.1 (98-98.2)

Irish (White) 47.8 (44.5-51) 81.5 (77.9-84.6)

Any other White background 54.5 (51.5-57.5) 39.3 (36.8-41.8)

White and Black Caribbean (Mixed) 30.6 (20.2-42.5) 41.5 (28.1-55.9)

White and Black African (Mixed) 19.5 (8.8-34.9) 34.8 (16.4-57.3)

White and Asian (Mixed) 15.6 (8.3-25.6) 38.7 (21.8-57.8)

Any other Mixed background 12.2 (4.6-24.8) 12 (4.5-24.3)

Indian (Asian or Asian British) 80.4 (76.7-83.8) 89.1 (85.9-91.7)

Pakistani (Asian or Asian British) 77.7 (71.4-83.2) 86 (80.2-90.7)

Bangladeshi (Asian or Asian British) 74 (59.7-85.4) 80.4 (66.1-90.6)

Any other Asian background 55.8 (46.8-64.5) 38.3 (31.3-45.7)

Caribbean (Black or Black British) 71.1 (66.8-75.2) 85.2 (81.3-88.6)

African (Black or Black British) 65.2 (59.7-70.4) 80.3 (74.9-85)

Any other Black background 10 (0.3-44.5) 0.9 (0-5)

Chinese (other ethnic group) 75.8 (67.3-83) 87.9 (80.1-93.4)

Sensitivity of

hospital-recorded

Information

PPV of

hospital-recorded

Information

White 99.4 (99.3-99.5) 99.6 (99.6-99.7)

Mixed 25.1 (19.7-31.1) 38.2 (30.6-46.3)

Asian or Asian British 88.9 (86.7-90.9) 90.4 (88.3-92.3)

Black or Black british 84.6 (81.9-87.1) 89 (86.5-91.1)

Chinese 75.8 (67.3-83) 87.9 (80.1-93.4)

Other ethnic group 50 (37.2-62.8) 9.7 (6.7-13.5)

Appendix table 2. Sensitivity and PPV. Estimates and 95% CI

Formatted Table

Formatted Table

Formatted: Font: 8 pt

Formatted: Centered

Formatted: Font: 8 pt

Formatted: Centered

Formatted: Font: 8 pt

Formatted: Centered

Formatted: Font: 8 pt

Formatted: Centered

Formatted: Font: 8 pt

Formatted: Centered

Formatted: Font: 8 pt

Formatted: Centered

Page 55 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 88: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

29

Total %

N discordant

(ONS6) %

Crude OR of

discordance

joint p-

value

Adjusted OR of

discordance

joint p-

value

Age

16-24 399 0.7 17 4.3 4.15 (2.47-6.96)

<0.0001

0.87 (0.43-1.79)

0.10

25-34 950 1.6 37 3.9 3.78 (2.63-5.42) 1.23 (0.77-1.94)

35-44 3,139 5.3 83 2.6 2.53 (1.99-3.22) 1.32 (0.96-1.82)

45-54 7,763 13.2 166 2.1 2.04 (1.68-2.46) 1.26 (0.98-1.62)

55-64 15,375 26.2 200 1.3 1.23 (1.05-1.44) 1.16 (0.93-1.46)

65-74 18,366 31.3 195 1.1 reference reference

75-84 10,747 18.3 86 0.8 0.75 (0.58-0.98) 0.84 (0.63-1.12)

85+ 1,982 3.4 12 0.6 0.57 (0.3-1.09) 0.73 (0.39-1.39)

Gender

Men 27,395 46.7 321 1.2 reference 0.0003

reference 0.49

Women 31,326 53.3 475 1.5 1.3 (1.13-1.49) 1.06 (0.9-1.26)

Deprivation

Lowest 13,301 22.7 119 0.9 reference

<0.0001

reference

0.30

2 13,432 22.9 120 0.9 1 (0.75-1.34) 1.13 (0.84-1.52)

3 12,498 21.3 143 1.1 1.28 (0.92-1.78) 1.23 (0.92-1.64)

4 10,756 18.3 191 1.8 2 (1.44-2.79) 1.36 (1.02-1.8)

Highest 8,734 14.9 223 2.6 2.9 (2.14-3.93) 1.27 (0.94-1.7)

Ethnic Group

White 56,593 96.4 332 0.6 baseline

<0.0001

baseline

<0.0001

Mixed 239 0.4 179 74.9 505.56 (333.91-765.45) 564.17 (399.47-796.78)

Asian 901 1.5 100 11.1 21.16 (14.5-30.88) 14.3 (10.97-18.65)

Black 800 1.4 123 15.4 30.79 (19.65-48.23) 17.51 (13.35-22.96)

Chinese 124 0.2 30 24.2 54.08 (32.35-90.42) 38.62 (24.39-61.14)

Other 64 0.1 32 50.0 169.46 (103.63-277.12) 109.64 (63.49-189.34)

Hospital

OR 95% Reference

Range 158 - 73.97 <0.0001 22.96 <0.0001

Appendix table 3. Crude and adjusted predictors of discordant hospital record ethnicity coding (based on 6

combined ethnicity groups)

Page 56 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 89: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

4/25/2013 10:27:00 AM

4/25/2013 10:27:00 AM

4/25/2013 10:28:00 AM

4/25/2013 10:27:00 AM

4/25/2013 10:27:00 AM

4/25/2013 10:27:00 AM

Page 57 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 90: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

4/25/2013 10:27:00 AM

4/25/2013 10:27:00 AM

4/25/2013 10:27:00 AM

4/25/2013 10:28:00 AM

4/25/2013 10:27:00 AM

4/25/2013 10:27:00 AM

Page 58 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 91: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

4/25/2013 10:27:00 AM

4/25/2013 10:27:00 AM

4/25/2013 10:27:00 AM

4/25/2013 10:27:00 AM

4/25/2013 10:28:00 AM

4/25/2013 10:27:00 AM

Page 59 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 92: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

Page 60 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 93: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

1

STROBE Statement—checklist of items that should be included in reports of observational studies

Item

No Recommendation

Title and abstract 1 (a) Indicate the study’s design with a commonly used term in the title or the abstract

Study design is stated in title and in the design section of abstract

(b) Provide in the abstract an informative and balanced summary of what was done

and what was found

Please see Methods/Results section of abstract

Introduction

Background/rationale 2 Explain the scientific background and rationale for the investigation being reported

Given in the background section of the paper - specifically – previous studies of the

quality of coding in routine English NHS records have focused on completeness

rather than accuracy and so this work attempts to fill this research gap

Objectives 3 State specific objectives, including any prespecified hypotheses

Specifically stated as the last line of the introduction

Methods

Study design 4 Present key elements of study design early in the paper

Stated in the abstract and further presented in the methods

Setting 5 Describe the setting, locations, and relevant dates, including periods of recruitment,

exposure, follow-up, and data collection

Stated in the abstract and further presented in Methods (Data)

Participants 6 Cross-sectional study—Give the eligibility criteria, and the sources and methods of

selection of participants

Stated in the abstract and further presented in the methods (data) flowchart (figure 1)

Variables 7 Clearly define all outcomes, exposures, predictors, potential confounders, and effect

modifiers. Give diagnostic criteria, if applicable

Please see Methods, analysis (p9, line 4 onwards)

Data sources/

measurement

8* For each variable of interest, give sources of data and details of methods of

assessment (measurement). Describe comparability of assessment methods if there is

more than one group

Please see Methods, analysis (p9 line 4 onwards)

Bias 9 Describe any efforts to address potential sources of bias

Describing measurement bias in hospital record ethnicity coding is an outcome for

this study

In Methods, analysis, we describe our approach to potential clustering by hospital

Study size 10 Explain how the study size was arrived at

Given in Methods, Data and the flowchart in figure 1

Quantitative variables 11 Explain how quantitative variables were handled in the analyses. If applicable,

describe which groupings were chosen and why

All variable groups are explained and described in methods, analysis

Statistical methods 12 (a) Describe all statistical methods, including those used to control for confounding

Described in methods, analysis

(b) Describe any methods used to examine subgroups and interactions

No subgroup analyses considered

(c) Explain how missing data were addressed

Page 61 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 94: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

2

We performed these analyses (considering the small proportion of cases with missing

ethnicity coding) and mention these results in our discussion. These analyses were

extensive but not the main focus of this paper and so we did not include them as a

substantial research output in this paper

Cross-sectional study—If applicable, describe analytical methods taking account of

sampling strategy

Discussion of the generalizability of the sample was included in the discussion.

Clustering at the hospital level was considered in the analysis, looking at results with

and without a random effect for hospital.

(e) Describe any sensitivity analyses

We repeated the analyses on data from 2011/2, but as information on deprivation for

these years was not recorded and the results were similar across both years we only

present the results for 2010 (including the data from 2011/2 in the probability lookup

table calculation only)

Discordance of ethnicity using ONS 16 and ONS 6 categorisations is also presented

(the latter appendix).

Continued on next page

Page 62 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 95: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

3

Results

Participants 13* (a) Report numbers of individuals at each stage of study—eg numbers potentially eligible,

examined for eligibility, confirmed eligible, included in the study, completing follow-up, and

analysed

Figure 1

(b) Give reasons for non-participation at each stage

Figure 1

(c) Consider use of a flow diagram

Figure 1

Descriptive

data

14* (a) Give characteristics of study participants (eg demographic, clinical, social) and information

on exposures and potential confounders

Given as part of table 1

(b) Indicate number of participants with missing data for each variable of interest

Methods (data) section and figure 1

Outcome data 15*

Cross-sectional study—Report numbers of outcome events or summary measures

Table 1, given in the text of the results

Main results 16 (a) Give unadjusted estimates and, if applicable, confounder-adjusted estimates and their

precision (eg, 95% confidence interval). Make clear which confounders were adjusted for and

why they were included

Table 1; crude and adjusted results presented with 95% confidence intervals.

(b) Report category boundaries when continuous variables were categorized

Done so for age in table 1.

(c) If relevant, consider translating estimates of relative risk into absolute risk for a meaningful

time period

Not applicable

Other analyses 17 Report other analyses done—eg analyses of subgroups and interactions, and sensitivity

analyses

Results from sensitivity analyses (considering different categorisations of ethnicity) presented

in the appendix.

Discussion

Key results 18 Summarise key results with reference to study objectives

In first paragraph of discussion

Limitations 19 Discuss limitations of the study, taking into account sources of potential bias or imprecision.

Discuss both direction and magnitude of any potential bias

In discussion

Interpretation 20 Give a cautious overall interpretation of results considering objectives, limitations, multiplicity

of analyses, results from similar studies, and other relevant evidence

In discussion

Generalisability 21 Discuss the generalisability (external validity) of the study results

Discussed as in the discussion section of this paper, and also as part of the last bullet point in

‘Article Summary’.

Other information

Funding 22 Give the source of funding and the role of the funders for the present study and, if applicable,

Page 63 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 96: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

4

for the original study on which the present article is based

In acknowledgements section

*Give information separately for cases and controls in case-control studies and, if applicable, for exposed and

unexposed groups in cohort and cross-sectional studies.

Note: An Explanation and Elaboration article discusses each checklist item and gives methodological background and

published examples of transparent reporting. The STROBE checklist is best used in conjunction with this article (freely

available on the Web sites of PLoS Medicine at http://www.plosmedicine.org/, Annals of Internal Medicine at

http://www.annals.org/, and Epidemiology at http://www.epidem.com/). Information on the STROBE Initiative is

available at www.strobe-statement.org.

Page 64 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 97: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

107x90mm (300 x 300 DPI)

Page 65 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 98: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

Page 66 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 99: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

From 2001-02 onwards: A = British (White) B = Irish (White) C = Any other White background D = White and Black Caribbean (Mixed) E = White and Black African (Mixed) F = White and Asian (Mixed) G = Any other Mixed background H = Indian (Asian or Asian British) J = Pakistani (Asian or Asian British) K = Bangladeshi (Asian or Asian British) L = Any other Asian background M = Caribbean (Black or Black British) N = African (Black or Black British) P = Any other Black background R = Chinese (other ethnic group) S = Any other ethnic group Z = Not stated X = Not known

From 1995-96 to 2000-01: 0= White 1 = Black - Caribbean 2 = Black - African 3 = Black - Other 4 = Indian 5 = Pakistani 6 = Bangladeshi 7 = Chinese 8 = Any other ethnic group 9 = Not given

X = Not known

Appendix table 1. NCPES questionnaire ethnicity question (left) and HES ethnicity codes (right)

Page 67 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from

Page 100: BMJ Open...Ahmed F (1) Lyratzopoulos G (1) Cambridge Centre for Health Services Research, University of Cambridge, Institute of Public Health, Forvie Site, Robinson Way, Cambridge,

For peer review only

Sensitivity of hospital-recorded

Information

PPV of hospital-recorded

Information

Ethnicity in 2001 Census groups (16) British (White) 97.8 (97.7-98) 98.1 (98-98.2)

Irish (White) 47.8 (44.5-51) 81.5 (77.9-84.6)

Any other White background 54.5 (51.5-57.5) 39.3 (36.8-41.8)

White and Black Caribbean (Mixed) 30.6 (20.2-42.5) 41.5 (28.1-55.9)

White and Black African (Mixed) 19.5 (8.8-34.9) 34.8 (16.4-57.3)

White and Asian (Mixed) 15.6 (8.3-25.6) 38.7 (21.8-57.8)

Any other Mixed background 12.2 (4.6-24.8) 12 (4.5-24.3)

Indian (Asian or Asian British) 80.4 (76.7-83.8) 89.1 (85.9-91.7)

Pakistani (Asian or Asian British) 77.7 (71.4-83.2) 86 (80.2-90.7)

Bangladeshi (Asian or Asian British) 74 (59.7-85.4) 80.4 (66.1-90.6)

Any other Asian background 55.8 (46.8-64.5) 38.3 (31.3-45.7)

Caribbean (Black or Black British) 71.1 (66.8-75.2) 85.2 (81.3-88.6)

African (Black or Black British) 65.2 (59.7-70.4) 80.3 (74.9-85)

Any other Black background 10 (0.3-44.5) 0.9 (0-5)

Chinese (other ethnic group) 75.8 (67.3-83) 87.9 (80.1-93.4)

Sensitivity of

hospital-recorded Information

PPV of hospital-recorded

Information

White 99.4 (99.3-99.5) 99.6 (99.6-99.7)

Mixed 25.1 (19.7-31.1) 38.2 (30.6-46.3)

Asian or Asian British 88.9 (86.7-90.9) 90.4 (88.3-92.3)

Black or Black british 84.6 (81.9-87.1) 89 (86.5-91.1)

Chinese 75.8 (67.3-83) 87.9 (80.1-93.4)

Other ethnic group 50 (37.2-62.8) 9.7 (6.7-13.5)

Appendix table 2. Sensitivity and PPV. Estimates and 95% CI

Page 68 of 68

For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml

BMJ Open

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

on August 25, 2021 by guest. P

rotected by copyright.http://bm

jopen.bmj.com

/B

MJ O

pen: first published as 10.1136/bmjopen-2013-002882 on 28 June 2013. D

ownloaded from