Record Linkage Using The Northern Ireland Longitudinal Study GSS Seminar on Data Matching
Record Linkage Using
The Northern Ireland Longitudinal Study
GSS Seminar on Data Matching
Mon 29 November, London
Fiona Johnston
NILS Research Support Unit
Introduction to the Northern Ireland Longitudinal Study (NILS) incl. the Northern Ireland Mortality Study (NIMS)
Record Linkage Methodology using the NIMS: Issues and Biases
Research Based on the NIMS: Exemplar Projects & Findings
Research Support & Future Plans
Presentation Outline
1. Research-Driven
• Cross-sectional studies: no information on change over time
• Other UK LS
• Other international mortality-based LS
• Health and socio-demographic profile of NI
2. Legislation
• Confidentiality protected, and managed by NISRA, under census legislation
• NISRA have consulted the following:
  - Information Commissioner for Northern Ireland
  - Office of Research Ethics
  - Health and Social Care Privacy Advisory Committee
3. Funding
• Infrastructure funded by the Health and Social Care R&D Division and NISRA
• Research support function funded by ESRC and NI Government (OFMDFM)
Background to the NILS and NIMS
1. Northern Ireland Longitudinal Study (NILS) – 28% representative sample of NI population (c. 500,000), based on health card registrations, linked to:
• 2001 Census returns
• vital events (births, deaths and marriages)
• demographic & migration events
AND
• distinct Health & Care datasets
2. Northern Ireland Mortality Study (NIMS) - enumerated population at Census Day (c. 1.6 million), linked to:
• 2001 Census returns
• subsequently registered mortality data
Both NILS and NIMS linked to contextual and area-based data:
• capital value of houses and property attributes
• geographical indicators
• settlement classifications
• deprivation measures
Overview of the NILS and NIMS
[Diagram. Labels: NIMS core data (2001 Census, 1.6m enumerated); events (deaths); contextual data (geographic indicators, property characteristics, area characteristics); NIMS database; individual project datasets.]
Structure of the NIMS
Datasets Routinely Linked to the NIMS
Census Datasets 2001
• Age, sex and marital status
• Religion and community background
• Family, household or communal type
• Housing, including tenure, rooms and amenities
• Country of birth, ethnicity
• Educational qualifications
• Economic activity, occupation and social class
• Migration (between 2000 and 2001)
• Limiting long-term illness, self-reported general health, care-giving
• Travel to work
Contextual Datasets
LPS Property Data 2010
• Capital and rating value (based on 2005 valuation exercise)
• Household characteristics (no. of rooms, property type, floor space, central heating) and valuation
• Estimated capital value
Geographical Indicators
• Super Output Area
• Ward
• Local Government District
Settlement Classifications
• Urban/Rural/Mixed
Deprivation Measures 2005 & 2010
• Multiple Deprivation Measure
• Individual Deprivation Domains
GRO Death Events Datasets
• Deaths of sample members
NIMS Record Linkage Methodology
NIMS database based on 1.6m pop. at Census 2001
GRONI deaths data added to NIMS database on a six-monthly basis
3-stage matching process:
• exact computer matching
• fuzzy computer matching
• detailed manual searching
Create and run matching queries:
• accept exact matches
• manually confirm/refute fuzzy matches
• clerical searching for unmatched records
• check for duplicates and resolve
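The three-stage process above can be illustrated with a minimal Python sketch. Everything specific in it is an assumption for illustration: the slides do not name the matching variables, the similarity measure or any threshold, so the record fields, the use of `difflib.SequenceMatcher` and the 0.85 cut-off are hypothetical, and the sample records are invented. Exact matches are accepted, fuzzy candidates are queued for manual confirmation, and the remainder goes to clerical searching.

```python
from collections import Counter
from difflib import SequenceMatcher

# Hypothetical matching variables; the slides do not list the real ones.
KEY_FIELDS = ("surname", "forename", "dob", "postcode")

# Invented example records, for illustration only.
census = [
    {"id": "C1", "surname": "MCALLISTER", "forename": "JOHN", "dob": "1931-04-12", "postcode": "BT1 1SA"},
    {"id": "C2", "surname": "ONEILL", "forename": "MARY", "dob": "1940-09-30", "postcode": "BT7 2AB"},
    {"id": "C3", "surname": "SMYTH", "forename": "ANNE", "dob": "1925-01-05", "postcode": "BT9 5CD"},
]
deaths = [
    {"id": "D1", "surname": "MCALLISTER", "forename": "JOHN", "dob": "1931-04-12", "postcode": "BT1 1SA"},
    {"id": "D2", "surname": "O'NEILL", "forename": "MARY", "dob": "1940-09-30", "postcode": "BT7 2AB"},
    {"id": "D3", "surname": "BROWN", "forename": "PETER", "dob": "1950-06-01", "postcode": "BT4 3EF"},
]


def exact_key(record):
    """Tuple of all matching variables, used for stage 1 exact matching."""
    return tuple(record[field] for field in KEY_FIELDS)


def similarity(death, person):
    """Mean name similarity (0 to 1); date of birth must agree exactly."""
    if death["dob"] != person["dob"]:
        return 0.0
    scores = [
        SequenceMatcher(None, death[field], person[field]).ratio()
        for field in ("surname", "forename")
    ]
    return sum(scores) / len(scores)


def link(deaths, census, threshold=0.85):
    """Return (exact links, fuzzy candidates, ids left for clerical search)."""
    index = {}
    for person in census:
        index.setdefault(exact_key(person), []).append(person)

    exact, fuzzy, clerical = [], [], []
    for death in deaths:
        hits = index.get(exact_key(death), [])
        if len(hits) == 1:
            # Stage 1: a unique exact match is accepted automatically.
            exact.append((death["id"], hits[0]["id"]))
            continue
        # Stage 2: best fuzzy candidate, to be confirmed or refuted manually.
        best = max(census, key=lambda person: similarity(death, person), default=None)
        if best is not None and similarity(death, best) >= threshold:
            fuzzy.append((death["id"], best["id"]))
        else:
            # Stage 3: no acceptable candidate, so search clerically.
            clerical.append(death["id"])
    return exact, fuzzy, clerical


def duplicate_links(links):
    """Census ids linked to more than one death record, for resolution."""
    counts = Counter(census_id for _, census_id in links)
    return sorted(cid for cid, n in counts.items() if n > 1)
```

Calling `link(deaths, census)` on the invented records places D1 in the exact list, D2 (apostrophe difference in the surname) in the fuzzy list and D3 in the clerical list; `duplicate_links` then flags any census record that picked up more than one death.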
Linkage rates close to 100% not possible for NIMS – why?
1. Non-enumeration at Census:
• One Number Census methodology: imputation for adjusted est. total
• Imputation varies by age, gender and geographical area
• In NI enumerated 2001 Census total was 1,603,641 - an additional 81,626 people were imputed = overall imputation rate of 4.6%.
2. People who came to NI after 2001 and subsequently died: selective unrecorded migration
3. Differences between the info collected on census form and death certificate
Record Linkage: Issues and Biases
Study on potential biases:
O’Reilly, D., Rosato, M. & Connolly, S. (2008) Unlinked vital events in census-based longitudinal studies can bias subsequent analysis. Journal of Clinical Epidemiology 61: 380-385.
What are the characteristics of people whose events are not linked into the LS datasets?
What does this mean for analyses using the LS?
Record Linkage: Issues and Biases
Record Linkage Rates 2001-2005
59,396 deaths available to be linked from 2001-2005
6% deaths (3,392) could not be matched
Process                 Number    (%)
All death records NI    59,396
Exact matches           45,496    (80.6)
Fuzzy matches            4,491    (8.0)
Manual matches           2,093    (3.7)
Linkage through HCR        951    (1.7)
Unlinked                 3,395    (6.0)
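As a rough check on the table, the percentage column can be recomputed from the category counts. The short Python sketch below does this; the function name `shares` is illustrative. One observation from the arithmetic, not a statement made in the slides: the listed percentages are reproduced when the denominator is the sum of the five categories (56,426).

```python
# Category counts as listed in the table above.
counts = {
    "Exact matches": 45496,
    "Fuzzy matches": 4491,
    "Manual matches": 2093,
    "Linkage through HCR": 951,
    "Unlinked": 3395,
}


def shares(counts):
    """Percentage of each category, using the sum of categories as denominator."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


def linked_share(counts):
    """Overall linkage rate (%): everything except the unlinked records."""
    total = sum(counts.values())
    return round(100 * (total - counts["Unlinked"]) / total, 1)
```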
Characteristics of matched & non-matched deaths
Based on data from death records and compared by multivariate logistic regression:
• Year of registration
• Socio-demographic details: age, sex, marital status, social class (NS-SEC)
• Place of death: home, hospital, nursing/residential home
• Area in which they lived (SOA): deprivation (Income domain), urban/rural, population density, imputation
Cause of death
Age and sex distribution of unlinked death records
[Figure: two panels by five-year age group (0-4 to 85+), with separate series for males and females. One panel shows the number of deaths (axis 0 to 1,200); the other shows the proportion of deaths (axis 0 to 50).]
Variation according to demographic characteristics (deaths and results of logistic regression) 2001-2006

                            Aged less than 65       Aged more than 65
                            Deaths    OR            Deaths    OR
Sex
  Male                      8,130     1.00          25,443    1.00
  Female                    4,941     0.63 ***      31,775    0.92 *
Marital status
  Married                   7,398     1.00          19,450    1.00
  Single                    3,549     1.57 ***       8,873    2.83 ***
  Widowed                     776     1.40 ***      27,758    1.97 ***
  Sep/Divorced              1,348     2.52 ***       1,137    3.30 ***
Place of death
  Home                      6,066     1.00          13,378    1.00
  Nursing/residential home  1,009     1.05          12,771    2.00 ***
  Hospital                  5,996     0.80 ***      31,069    1.28 ***

*** P<0.001; ** P<0.01; * P<0.05
Variation according to relative deprivation (deaths and results of logistic regression) 2001-2006

                  Aged less than 65               Aged more than 65
                  Deaths           Odds ratios    Deaths            Odds ratios
Least Deprived    1,831 (6.8%)     1.00           10,543 (5.7%)     1.00
2nd               2,137 (8.8%)     1.19           11,103 (5.4%)     0.90
3rd               2,554 (9.5%)     1.20           11,933 (6.0%)     0.93
4th               2,901 (10.4%)    1.20           11,534 (5.2%)     0.84 *
Most Deprived     3,530 (16.0%)    1.78 ***       11,374 (7.2%)     1.23 **

*** P<0.001; ** P<0.01; * P<0.05
Variation by cause of death (deaths and results of logistic regression) 2001-2006

                       All ages               Under 65 years old
                       Deaths (%unmatched)    Deaths (%unmatched)
All causes             70,289 (6.9%)          13,071 (11.1%)
I.H.D                  13,970 (5.6%)           2,064 (9.4%)
Stroke                  7,211 (6.8%)             542 (8.9%)
Respiratory Disease     9,722 (7.0%)             802 (9.9%)
Cancer                 18,572 (5.6%)           4,846 (8.1%)
All External causes     2,634 (15.2%)          1,648 (20.3%)
Accidents               1,719 (12.3%)            830 (18.2%)
Suicides                  702 (19.9%)            649 (21.4%)
Other Causes           12,840 (8.9%)           2,579 (13.6%)
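To give a feel for the size of these differences, the unmatched percentages can be turned into crude odds ratios. The Python sketch below is only an illustration under stated assumptions: the ratios are unadjusted, so they are not the multivariate logistic regression estimates reported in the study, and the choice of cancer as the reference category is made here for convenience, not taken from the slides.

```python
# All-ages percentage unmatched by cause of death, from the table above.
unmatched_pct = {
    "I.H.D": 5.6,
    "Stroke": 6.8,
    "Respiratory Disease": 7.0,
    "Cancer": 5.6,
    "Accidents": 12.3,
    "Suicides": 19.9,
}


def odds(proportion):
    """Convert a proportion (0 to 1) into odds."""
    return proportion / (1 - proportion)


def crude_odds_ratio(pct_group, pct_reference):
    """Unadjusted odds ratio of non-linkage between two percentages."""
    return odds(pct_group / 100) / odds(pct_reference / 100)


def crude_ratios(table, reference="Cancer"):
    """Crude odds ratio for every cause relative to the chosen reference."""
    ref = table[reference]
    return {cause: crude_odds_ratio(pct, ref) for cause, pct in table.items()}
```

With cancer as the reference, the crude odds of a suicide death being unmatched come out at roughly four times those of a cancer death, which is consistent with the conclusion that non-linkage may limit the study of some causes of death.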
Research conclusions: a small proportion of events are not linked – biases:
• increase in months immediately after Census Day 2001
• increase with ‘distance’ from the census
• are non-random and more frequent in …
  - younger males, older females
  - people who are perhaps more socially isolated
  - amongst residents of nursing/residential homes
  - deprived areas
  - where enumeration is low
Non-linkage may limit the ability to study some causes of death and potentially lead to an underestimation of social gradients
Record Linkage: Issues and Biases
potential biases
yet: there is a statutory obligation to record death events, so the data are complete and of good quality, with long experience of use for mortality analyses; AND there will always be biases in any linkage study where linkage is below 100% - this research shows that biases can be quantified
small number problems, e.g. falling death rates, population sub-groups (minority ethnic groups), cause-specific mortality (suicides, trauma & specific cancers)
yet: can increase length of follow-up study, aggregate sub-populations & increase cohort size
However ….
Health & Mortality:
• Temperature-related mortality and housing (DSD)
• Socio-demographic and area correlates of suicides
• Distribution of cancer deaths in Northern Ireland by population and household type (NI Cancer Registry)
• Variations in alcohol related deaths in Northern Ireland
Demography:
Vital events: Standard Table Outputs (DMB)
Section 75 (Equality Analyses)
Equality assessment of health outcomes: cause-specific mortality for Section 75 groups (DHSSPS)
Mortality rates and life-expectancy: Section 75 groups and social disadvantage (OFMDFM)
• Religious affiliation and self-reported health
• Denominational differences in short-term mortality
• Mortality risk for carers
Research Based on the NIMS
Exemplar Project & Research Findings
A study of the socio-demographic and area correlates of suicides in NI (Project 005)
O’Reilly, D., Rosato, M., Connolly, S. and Cardwell, C. (2008) Area factors and suicide: 5-year follow-up of the Northern Ireland population. Br J Psychiatry 192(2): 106-11.
Background: Suicide rates vary between areas: is this due to individual characteristics (composition) or area characteristics (context)?
Aim: To determine if area factors are independently related to suicide risk after adjustment for individual and family characteristics.
Method: A 5-year record linkage study, based on the NIMS database, was conducted of c. 1.1 million individuals (not in communal establishments) aged 16–74 years, enumerated at the 2001 Northern Ireland Census.
- data anonymised and held in a safe setting
Area Factors & Suicide (i)
Results:
1. The cohort experienced 566 suicides during follow-up.
2. Suicide risks:
   i. lowest for women and for those who were married or cohabiting;
   ii. strongly related to individual and household disadvantage and economic and health status.
3. The higher rates of suicide in the more deprived and socially fragmented areas disappeared after adjustment for individual and household factors.
4. There was no significant relationship between population density and risk of suicide.
Area Factors & Suicide (ii)
Conclusions:
Differences in rates of suicide between areas are predominantly due to population characteristics rather than to area-level factors.
Policy implication? Policies targeted at area-level factors are unlikely to significantly influence suicide rates.
Area Factors & Suicide (iii)
Suicide (Daily Mirror)
NILS Research Support Unit
Based: Centre for Public Health (QUB) and NISRA HQ (McAuley House)
Support: 2 full-time and 1 half-time Research Support Officers
Set-up: April 2009
Remit:
raise awareness of the NILS research potential;
assist with development of research ideas and projects;
facilitate access to NILS data;
training & advice in use and analysis of NILS datasets;
promote policy relevance; and
enhance NILS research capacity incl: specific duty to assist government researchers and to undertake exemplar public policy research.
Research Support
NILS data are sensitive and access is highly controlled:
researchers can access data only within a ‘secure setting’ (NILS-RSU office at McAuley House); arrangements can be made to run analyses remotely;
researchers must sign and abide by user licenses & security policies;
disclosure control thresholds in place to protect confidentiality of the data: no tabulated cell counts less than 10; and
all outputs must be cleared by NISRA staff.
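The cell-count threshold can be illustrated with a small Python sketch. It is an assumption-laden illustration, not the NISRA clearance procedure: the function names and the "<10" marker are hypothetical, zero cells are treated as below the threshold because the slides do not say how they are handled, and secondary suppression (protecting cells that could be recovered from totals) is not attempted.

```python
# Threshold stated on the slide: no tabulated cell counts less than 10.
THRESHOLD = 10


def suppress_small_cells(table, threshold=THRESHOLD, marker="<10"):
    """Return a copy of a two-way table with small counts replaced by a marker.

    `table` maps row labels to {column label: count}.
    """
    return {
        row: {col: (n if n >= threshold else marker) for col, n in cells.items()}
        for row, cells in table.items()
    }


def is_releasable(table, threshold=THRESHOLD):
    """True only if every cell count meets the threshold."""
    return all(n >= threshold for cells in table.values() for n in cells.values())
```

For example, a hypothetical table `{"Urban": {"Male": 42, "Female": 7}}` fails `is_releasable`, and `suppress_small_cells` replaces the 7 with the marker before the output is put forward for clearance.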
Access
Ongoing/Pending:
Inter-Censal Migration Flows
Mortality after death of a spouse: Is risk the same for all groups?
Religion, fertility and space: impacts on the future school population of Northern Ireland.
Exploratory analysis of the use of antibiotics by demographic and area characteristics
Potential:
• Pharmaco-epidemiological studies using Prescribing data
• Cancer research using Northern Ireland Cancer Registry data
• Hospital admissions using Hospital Inpatient System data
• Healthcare Associated Infections using laboratory testing data
Current Project Activity
The help provided by the staff of the Northern Ireland Longitudinal Study and the Northern Ireland Mortality Study (NILS and NIMS) and the NILS Research Support Unit is acknowledged.
The NILS and NIMS are funded by the Health and Social Care Research and Development Division of the Public Health Agency (HSC R&D Division) and NISRA. The NILS-RSU is funded by the ESRC and Northern Ireland Government.
The authors alone are responsible for the interpretation of the data.
Acknowledgements
NILS Research Support Unit
Northern Ireland Statistics and Research Agency
McAuley House
2-14 Castle Street
Belfast
BT1 1SA
Tel: 028 90 348138
Email: [email protected]
Website: nils-rsu.census.ac.uk