Data Matching Pilot Evaluation

  • 8/2/2019 Data Matching Pilot Evaluation

    1/159

    Data matching schemes to improve accuracy and completeness of the electoral registers

    evaluation report

    March 2012

    Translations and other formats

    For information on obtaining this

    publication in another language or in

    a large-print or Braille version, please

    contact the Electoral Commission:

    Tel: 020 7271 0500

    Email: [email protected]

    The Electoral Commission 2012

    Contents

    Executive summary

    1 Introduction

    2 Set-up and coordination

    3 Databases and the matching process

    4 Pilot authorities: overview and emerging issues

    5 Data matching results: Department for Work and Pensions

    6 Data matching results: Driver and Vehicle Licensing Agency

    7 Data matching results: Education databases

    8 Data matching results: Ministry of Defence

    9 Data matching results: Citizen Account

    10 Pilot costs

    11 Conclusions and recommendations

    Appendices

    Appendix A: Local authority pilot profiles

    Appendix B: Data tables

    Acknowledgements

    The Electoral Commission would like to thank all the staff at both the local

    authorities and the data holding organisations for the time and effort they

    devoted to these data matching pilot schemes.

    We would also like to thank the Cabinet Office for their assistance in the

    collection of data from the pilots.

    Executive summary

    Background

    As part of the proposed shift to individual electoral registration (IER), the UK

    Government is exploring the extent to which the use of national public

    databases can help Electoral Registration Officers (EROs) improve the accuracy

    and completeness of their electoral registers.

    The Electoral Commission was given a statutory responsibility to report on the

    effectiveness of the data matching schemes. The schemes were based on the

    piloting of a range of national public databases in 2011 by 22 individual local authorities in England and Scotland.1 Our statutory evaluation considers the

    degree to which data matching schemes assisted EROs in improving the

    completeness and accuracy of their registers; resulted in any issues around

    administration, time and costs; or prompted objections to the schemes. Our

    findings are based on the data and feedback received from local authorities,

    data-holders and others during the course of the pilot schemes.

    In February 2012 the UK Government published its response to pre-legislative

    scrutiny and public consultation on the IER White Paper. In the response, the UK Government indicated its intention, subject to the results of the evaluation of

    pilot schemes and further testing, to widen the scope of data matching

    [...] simplify the transition to IER for [...].2 The UK Government

    indicated that rather than, as originally intended, [...]

    check accuracy and to identify people who may be eligible to register to vote,

    and then invite them to apply to register,3 it was now their intention [...]

    names and addresses of all individuals currently on an electoral register will be

    matched against the data held by public bodies such as the Department for

    Work and Pensions [...] information can be matched, the individual will be automatically placed onto the

    new IER register and would not need to take any further action to be registered.4

    Electors whose details could not be matched in this way would be

    asked to apply individually and to supply personal identifiers.

    1 At the outset there was a pilot authority in Wales but they dropped out early in the process.

    2 HM Government (2012) Government Response to pre-legislative scrutiny and public consultation on Individual Electoral Registration and amendments to Electoral Administration law, Cm 8245

    3 HM Government (2011) Individual Electoral Registration, Cm 8108

    This proposal has not been tested by these pilots. Further piloting is needed to

    ensure that the advantages and disadvantages of these proposals are

    understood. In this report we set out some of the key questions we think need to

    be answered to help understand the issues.

    Set-up and coordination

    The Cabinet Office managed the overall pilot process and was also

    involved in the delivery of the pilots. The Commission advised the Cabinet

    Office throughout the set-up period, in particular on the need for a clear

    common framework for delivering the pilots, but the final decisions on the processes were taken by the Cabinet Office.

    The open application and selection process used for the pilots, and the

    absence of a clear, common framework, contrary to our advice, led to

    significant variation in the planned approaches of the pilots. This

    introduced challenges for the evaluation in comparing the results of the

    different schemes and therefore the ability to draw consistent conclusions.

    This also created challenges for local authorities delivering the pilots

    because in several cases the methodology they originally planned to use

    proved to be based on incorrect assumptions.

    The timing of the pilots, which took place alongside the annual canvass,

    coupled with delays to the process, put pressure on the capacity of the

    local authority teams involved and added to the difficulty in this evaluation

    of drawing firm conclusions from the pilot schemes as a whole.

    Local authorities reported varying levels of communication with the

    Cabinet Office and identified areas for improvement, including a better understanding of what data was going to be shared with them.

    The pilots did not follow processes, in terms of the IT systems and

    matching arrangements, which would be used for nationwide data

    matching. The evaluation cannot therefore draw conclusions about how

    the costs of these pilots would translate to a national roll-out.

    4 HM Government (2012) Government Response to pre-legislative scrutiny and public consultation on Individual Electoral Registration and amendments to Electoral Administration law, Cm 8245

    The databases and the matching process

    Ten databases were due to be tested as part of the scheme. These were

    the:

    Department for Work and Pensions (DWP) Centric database

    Driver and Vehicle Licensing Agency (DVLA) Driver database

    Student Loans Company (SLC) database

    National Pupil Database (NPD) (through the Department for Education)

    Individual Learner Record (ILR) (through the Department for Business, Innovation and Skills)

    Citizens Account (CA) database (through the Improvement Service in Scotland)

    Ministry of Defence (MoD) Joint Personnel Administration database

    [...] and Anite housing database

    Higher Education Funding Council for England (HEFCE) student database

    Royal Mail change of address database

    However, not all were tested to the same extent. In particular, there were

    difficulties in accessing HEFCE and Royal Mail data.

    A key component of the trials was the matching process between

    databases to identify people to invite to register to vote. The processes

    employed could not be rolled out nationally but do allow for a greater

    understanding of the requirements of any framework for national data matching.

    Two different processes were used for matching the electoral registers with

    the DWP data on one hand and the DVLA and education databases on the

    other. These different rules make it difficult to compare the results from the

    two processes.

    Pilot authorities: overview and emerging

    issues

    There were 22 data matching pilots testing various combinations of

    databases.

    The complex nature and format of the data supplied to local authorities

    highlighted the need for good data management and analysis skills. Many

    of the pilots either had these skills available within the local authority or

    recruited additional staff using Cabinet Office funding. However, several did not and struggled to use the information provided.

    The process, as tested in these pilots, was labour intensive with significant

    work required to analyse the data. Those involved felt that the level of work

    required would not be sustainable in the future.

    A number of pilot authorities were able to use locally-held data to

    interrogate the data received from the national databases. The results of

    this activity suggest that there is scope for more use to be made of local data both to complement any future national data matching and to

    improve accuracy and completeness in general.

    Data matching results

    Department for Work and Pensions (DWP)

    Eighteen pilots accessed the DWP Centric database.

    The level of match between the electoral registers and the DWP data

    varied significantly between local authorities. For those areas matching the

    whole register it ranged from 57.6% to 82.4%. These differences are partly

    due to different interpretations across the pilots, in the absence of a

    consistent framework, of what constitutes a match but are also likely to

    be driven by differences between the local authorities in terms of

    demographics.

    6,573 people were added to the registers as a result of follow-up activity

    undertaken using names suggested by the DWP Centric database. This

    was 13.2% of all names followed up.

    The response rate for the pilot follow up was affected by whether the follow up took place during the annual canvass.

    Where it took place during the canvass, the pilot response was depressed

    by the fact that many of the names identified by the match registered

    through the canvass.

    The pilots highlighted crucial differences in address formats between the

    electoral registers and the other national databases. This meant that many

    records could not be matched as simple address differences were not

    recognised. This problem could have been significantly reduced if time

    had been allowed for an address cleansing exercise.

    The absence of a unique identifier attached to each address on the public

    national databases was a key issue for the pilots. Such identifiers would have

    allowed for a more straightforward matching process for local authorities.
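To illustrate the address-format problem described above, the following Python sketch compares match rates with and without a simple cleansing step. It is a minimal illustration only: the abbreviation table, the function names and the sample addresses are all hypothetical, and it does not reproduce the matching rules actually used in the pilots.

```python
import re

# Hypothetical abbreviation table (an assumption for this sketch); a real
# cleansing exercise would need a fuller list and would have to handle
# ambiguous cases such as "st" meaning "saint".
ABBREVIATIONS = {"rd": "road", "st": "street", "ave": "avenue"}


def normalise_address(raw: str) -> str:
    """Lower-case, strip punctuation and expand common abbreviations."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", raw.lower()).split()
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)


def match_rate(register: list[str], national: list[str], clean: bool) -> float:
    """Share of register addresses that also appear in the national extract."""
    key = normalise_address if clean else (lambda address: address)
    national_keys = {key(address) for address in national}
    matched = sum(1 for address in register if key(address) in national_keys)
    return matched / len(register)


# Invented sample data: the first pair differs only in formatting.
register = ["Flat 2, 10 High St.", "4 Mill Road", "7 Park Ave"]
national = ["FLAT 2 10 HIGH STREET", "4 Mill Road", "9 Park Avenue"]

print(match_rate(register, national, clean=False))
print(match_rate(register, national, clean=True))
```

In this toy data the cleansing step recovers the record that differed only in format. Where both datasets carry a shared address identifier, the comparison reduces to an exact key lookup and no text normalisation is needed.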

    Many of the potential new electors suggested by the match with the DWP

    Centric database proved to be based on out of date or incorrect

    information. The problems posed by this could have been reduced by the

    inclusion of the date when the DWP record changed, something which DWP were willing to provide but were not asked to do so.

    The absence of nationality information meant that several pilot authorities

    conducted follow up with people ineligible to register. However, the scope

    of this issue is not clear from these pilots and will have varied depending

    on the demographics of the local authority area.

    Driver and Vehicle Licensing Agency (DVLA)

    Match levels between the DVLA driver database and the electoral registers

    were lower than between the registers and the DWP Centric database,

    partly because [...]

    because the match process used stricter criteria.

    The match levels varied from 51.7% to 67.3%.

    [...] this was more likely to reflect poor data currency rather than significant under-registration.

    208 people were added to the register as a result of pilot follow-up activity. This

    was 4.1% of all names followed up.

    Many of the responses to follow-up activity indicated the person written to

    was not resident at the address, which suggests that the DVLA Driver database is not current.

    The DVLA data was more effective at targeting 16 and 17 year olds as

    opposed to the population as a whole.
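The follow-up yields quoted for the DWP and DVLA databases (6,573 additions at 13.2% and 208 additions at 4.1%) imply rough totals for the number of names followed up. The sketch below backs these totals out from the published figures. Because the percentages are rounded to one decimal place, the results are approximations rather than reported figures, and the function name is purely illustrative.

```python
def implied_follow_ups(added: int, yield_rate: float) -> int:
    """Approximate number of names followed up, given additions and yield."""
    return round(added / yield_rate)


# (people added to the registers, share of followed-up names that registered)
results = {"DWP": (6573, 0.132), "DVLA": (208, 0.041)}

for database, (added, rate) in results.items():
    print(f"{database}: roughly {implied_follow_ups(added, rate):,} names followed up")
```

On these figures the DWP follow-up was roughly ten times larger than the DVLA follow-up and yielded about three times as many registrations per name contacted.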

    Education databases

    There were very few registrations from data matching with the Student

    Loans Company (SLC) database. This, and responses to the follow-up

    activity, support the view expressed by the SLC that the data used for

    these pilots (at the end of the academic year) was sometimes out of date.

    The National Pupil Database (NPD) and Individual Learner Record (ILR)

    proved effective at identifying attainers5 in these pilots.

    However, while the NPD and ILR identified attainers successfully, the

    majority of registrations were achieved through the annual canvass, which

    was taking place alongside the pilots, and not in response to follow-up activity through the pilots. Under IER, unlike in the current household

    system, individual attainers might need to complete their own form (rather

    than being registered by adults in the household). It is therefore possible

    that the number of registered attainers will fall. The ability to use data in

    order to target them in this way may therefore be a more useful tool for

    EROs in the future.

    Ministry of Defence (MoD)

    The MoD provided limited data for these pilots. They were able to confirm

    that existing service voters were still resident but not provide details of

    potential new service voters. They also provided details of addresses

    occupied by service personnel in the area but this excluded barracks.

    5 An attainer is a 16- or 17-year-old who will reach voting age (18 years old) during the life of a

    current electoral register.

    There was therefore no real prospect of addressing the completeness of

    service voter registrations in the pilot areas.

    Two pilots were able to use the MoD data to improve the accuracy of their

    register and amended or deleted a number of their records (9.6% and 13.2% of the total number of service voters held on the register).

    Citizens Account (CA)

    The CA database is administered by the Improvement Service in Scotland

    and is intended to be a record of all residents within a participating local

    authority area.

    However, the CA database is not as comprehensive as the pilot authority

    originally anticipated: the total number of records provided by CA

    represented only 27% of the Renfrewshire electorate.

    The level of match between the CA data and the electoral register was high

    with 88.8% of the CA records also found on the register.

    The matching exercise suggested a small number of potential new

    electors (1.7% of the size of the register after local matching).

    Follow-up activity was still under way at time of publication.

    Pilot costs

    The overall cost of the pilots is estimated at around £425,910, against an

    original budget of £1.2m. These figures exclude staff costs for the Cabinet

    Office.

    The under-spend is largely explained by initial budgeting over-estimates by both Cabinet Office and the pilot authorities, due to a lack of clarity about

    what the pilot process would entail, and by many local authorities not

    completing some of the activities, e.g. follow-up work, which they originally

    planned for. Given the size of the over-estimates, it seems unlikely that the

    pilots would ever have cost the full amount budgeted.

    The main item of expenditure reported by local authorities is the costs of

    additional staff, which account for about 50% of the total spent by local

    authorities. Local authorities reported that the process was labour intensive

    and they needed to incur much of this cost before they could begin the

    process of contacting potential new electors.

    Staff costs could be reduced by improving the quality of the data matched

    and automating more of the process but we cannot conclude, from the

    information gathered in these pilots, what the cost would be of any national

    data matching roll out.

    There was some limited expenditure on databases that were not used in

    the pilots and so did not deliver any benefit.

    While the costs of these pilots appear high in terms of numbers of people

    added to the registers this does not mean that data matching could not be

    cost effective if implemented differently.

    In order to assess potential scalability of data matching, it would be

    necessary to have more consistent information than is available about the

    costs incurred, and this information should include the additional internal

    costs incurred by both local authorities and data-holding organisations.

    Conclusions

    Our conclusions broadly follow the statutory evaluation criteria set out in

    Sections 35 and 36 of the Political Parties and Elections Act 2009. These criteria

    concern the degree to which data matching schemes assisted EROs in

    improving the completeness and accuracy of their registers; resulted in any

    issues around administration, time and costs; or prompted objections to the

    schemes.

    The registration objectives: completeness and accuracy

    On the whole, these pilots did not prove very effective at getting new electors onto the registers. Despite the efforts invested by authorities in the data pilots, very

    few additions (only 7,917) were subsequently made to the registers.
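As a back-of-envelope illustration, the headline cost and registration figures in this summary can be combined into a cost per addition. The sketch below uses only the estimated total cost (425,910), the original budget (1.2m) and the 7,917 additions, and assumes the amounts are in pounds sterling. The result is indicative only: the cost figure excludes Cabinet Office staff costs, and the pilot processes would not be repeated in a national roll-out.

```python
# Figures taken from this summary; treating them as pounds sterling is an assumption.
TOTAL_COST_GBP = 425_910
ORIGINAL_BUDGET_GBP = 1_200_000
ELECTORS_ADDED = 7_917

cost_per_addition = TOTAL_COST_GBP / ELECTORS_ADDED
share_of_budget_spent = TOTAL_COST_GBP / ORIGINAL_BUDGET_GBP
under_spend = ORIGINAL_BUDGET_GBP - TOTAL_COST_GBP

print(f"Cost per addition: about {cost_per_addition:.2f} pounds")
print(f"Share of budget spent: {share_of_budget_spent:.1%}")
print(f"Under-spend: {under_spend:,} pounds")
```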

    However, better results were achieved where the local authority was able to

    begin their pilot follow-up activity before, or at a very early stage of, their annual

    canvass. This was largely because, where the follow up did not begin until later,

    many people had already registered through the canvass.

    In these pilots, the most useful databases in terms of adding people to the registers were those which targeted specific under-registered groups (e.g. 16-

    and 17-year-olds) such as the National Pupil Database (NPD) and the Individual

    Learner Record (ILR).

    The issues surrounding the currency of address information on some of the

    other databases would need to be addressed in order to improve their

    effectiveness at finding new electors.

    However, the low number of registrations does not mean that the principle of

    data matching is not worth pursuing further and many local authorities were

    clear that they still see potential in it. Refinements to the matching process,

    such as improvements to the currency, quality and compatibility of the data

    provided, would need to be in place before this objective could be fully tested.

    In relation to improving the accuracy of the registers, the MoD data was useful,

    up to a point, at helping EROs to amend or delete the records of service voters. Two local authorities amended or deleted a number of their records,

    representing 9.6% and 13.2% of the total number of service voters held on the

    register.

    However, there was limited testing of the usefulness of the other databases for

    improving accuracy with only one pilot providing information to the Commission

    on this aspect.

    Finally, not all the public national databases included in the current scheme

    were tested to the same degree. As set out earlier in the report, there was no

    testing of HEFCE or Royal Mail data by the pilot authorities and we are unable to

    draw any conclusions about the usefulness of this data in addressing the

    registration objectives.

    Objections to the schemes

    At the outset there were concerns that the use of public data in this way could

    generate objections from the public. However, where data has been provided,

    local authorities indicated they received few objections to the schemes. Where

    local authorities did receive queries, the vast majority of people were content

    with the use of the data when the purposes of the schemes were explained to

    them.

    This indicates that the data matching pilots did not generate any substantial

    level of concern amongst the public. However, any future testing or roll out of

    data matching would need to be well implemented in order to ensure there is

    continued public support.

    Ease of administration

    Many pilots raised concerns that in its current format the process of data

    matching was too labour intensive for regular use. Additional staff resource was

    required by many of the authorities. In the main, this tended to be due to the large volumes of data received, issues with data compatibility and the workload

    involved in sorting the data for use.

    Many authorities also emphasised the need to understand the skill sets required

    for this kind of activity, and highlighted that in many cases these skills were not

    held by those currently working on registration activities. In the interim,

    developing the process of local data matching would not only be useful to EROs

    in maintaining the registers but would also help to build skills which could be

    used to understand and manipulate data provided from national databases.

    Time and costs

    The pilot schemes proved to be both time consuming and costly. However, it is

    not possible to draw robust conclusions about the long-term cost effectiveness

    of data matching from these pilots as the processes used here would not be

    repeated in a nationwide system of data matching.

    Nevertheless, it is clear that unless the process is made substantially more

    straightforward, it is doubtful that many authorities would have had the

    resources available to undertake data matching without additional finance

    which, in this case, was provided by the Cabinet Office.

    Recommendations

    This section sets out our recommendations for future data matching activities.

    Pilot processes

    Further testing of national databases by local authorities would need to be undertaken in order to establish whether data matching should be made available for use by all local authorities.

    Any further testing needs to be set up in a way that addresses the limitations set

    out in this report in order to ensure that meaningful data can be collated. The

    Electoral Commission would encourage the Government to consult us in detail

    in order to achieve this.

    We recommend that any further piloting (with a focus on improving accuracy

    and completeness):

    takes place outside of the annual canvass period and avoids other

    significant electoral events. Piloting data matching alongside the annual canvass added a layer of complexity to the testing process and meant it

    was harder for local authorities to isolate the impact of the data matching

    as opposed to canvass response rates. It also had consequences for local

    authority capacity to utilise the data when it was available to them. Several

    EROs thought that data matching could have more use following the

    canvass to pick up new registrants in the run-up to elections.

    has a clear framework for the use of data that all participating authorities can follow. This current scheme allowed local authorities to adopt varying approaches to piloting the data they received. The differing

    methodologies meant it was harder to draw conclusions about the

    effectiveness of the data and thus the future of the registration system. A

    clear framework would help to ensure comparability between the pilots but

    still allow for some local differences, for example targeting particular

    groups and making use of local databases.

    tests, as closely as possible, the process which would be made available to all local authorities if data matching was to be rolled out nationally.

    ensures that participating areas are sufficiently staffed and have appropriate expertise to complete the pilot and test the data provided.

    allows for a better understanding of the benefits of access to national data compared to existing local databases.

    allows for a clearer analysis of the cost of data matching through more informed budgeting and prescribed reporting of costs incurred.

    ensures that good communication between the pilots, the data holders and the Cabinet Office is maintained throughout the process.

    Databases

    In relation to the specific databases included in these schemes:

    There is merit in re-testing nearly all of the databases included in these pilots, providing the specific issues identified in this evaluation are addressed, namely that:

    Address format compatibility issues should be mitigated where possible. The planned inclusion of Unique Property Reference Numbers (a unique identifier for each address held) on the DWP database will help with this issue, as will plans for a single national address file. Other mitigating steps could be taken for matches with other databases, for example using address cleansing software.

    Data currency issues should be tackled by ensuring that, where possible, the information shared includes details of the dates on which database records are updated.

    We would not recommend further testing of the MoD data, unless the range of data which can be shared is increased. While the data supplied in these schemes was useful for the pilot authorities it is likely to [...]

    Proposals for verifying identity

    As outlined earlier, the Government is currently considering whether the results

    from the data matching exercise could be used to confirm the identity of

    individuals captured by the household canvass during the transition to IER. In

    relation to this, we recommend that:

    There is a need for more evidence to support this proposal, given that this was not an objective of these pilots. Any future piloting that includes this as an objective for testing should allow for an analysis of

    matched and non-matched records in order to check the accuracy of the

    matching process used. It is possible that this analysis could use the

    annual canvass process. As a result the timing of these pilots may need to be slightly different to that of any pilots focused on accuracy and

    completeness.

    These plans should also stay abreast of developments in the [...]. There are other initiatives

    within government on the processes that might be used in the future to

    verify identity. Learning lessons and adopting best practice from these

    other initiatives is important in order to ensure that the approach to

    verification followed under IER, and therefore the security of the registers,

    is as robust as possible.

    1 Introduction

    1.1 This report presents the Electoral Commission's statutory

    evaluation of the 2011 data matching pilot schemes. The schemes were based

    on matching a range of national databases by Electoral Registration Officers

    (EROs) in 22 local authorities. This was the first time that EROs have been able

    to test the usefulness of national data for improving the quality of their electoral

    registers.

    1.2 The overall aim of the pilot schemes was for EROs to test whether national

    public databases can help to improve the accuracy and completeness of their

    electoral registers.

    Background

    Accuracy and completeness of the electoral registers

    1.3 Electoral registers underpin elections by providing the list of those who are

    eligible to vote. Those not included on the registers cannot take part in

    elections. Registers are also used for other important civic purposes, including

    selecting people to undertake jury service, and calculating electorates to inform Parliamentary and local government boundary reviews, which are the basis for

    ensuring representative democracy. People not registered are therefore not

    counted for these purposes either.

    1.4 In addition, credit reference agencies may purchase complete copies of

    electoral registers, which they use to confirm addresses supplied by applicants

    for bank accounts, credit cards, personal loans and mortgages.

    1.5 Great Britain does not have one single electoral register. Rather, each local

    authority appoints an ERO who has responsibility for compiling an accurate and

    complete electoral register for their local area.

    1.6 Accuracy [...] The accuracy of the electoral registers is therefore a measure of the percentage

    of entries on the registers which relate to verified and eligible voters who are

    resident at that address. Inaccurate register entries may relate to entries which

    have become redundant (for example, due to people moving home), which are

    for people who are ineligible and have been included unintentionally, or which

    are fraudulent.

    1.7 Completeness [...] therefore refers to the percentage of eligible people who are registered at their

    current address. The proportion of eligible people who are not included on the

    register at their current address constitutes the rate of under-registration.
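The two measures defined above can be expressed as simple ratios. The following Python sketch is a minimal illustration under stated assumptions: the `Entry` type, the function names and the sample people and addresses are all invented, and real estimates of this kind rest on survey research rather than a complete list of eligible residents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """A person recorded at an address (hypothetical structure)."""
    person: str
    address: str


def accuracy(register: set[Entry], eligible: set[Entry]) -> float:
    """Share of register entries relating to an eligible person at that address."""
    return len(register & eligible) / len(register)


def completeness(register: set[Entry], eligible: set[Entry]) -> float:
    """Share of eligible people who are registered at their current address."""
    return len(register & eligible) / len(eligible)


# Invented data: person C has moved, so the register entry for C is out of date.
register = {
    Entry("A", "1 Oak Lane"), Entry("B", "2 Oak Lane"),
    Entry("C", "3 Oak Lane"), Entry("D", "4 Oak Lane"),
}
eligible = {
    Entry("A", "1 Oak Lane"), Entry("B", "2 Oak Lane"),
    Entry("C", "9 Elm Row"), Entry("E", "5 Oak Lane"), Entry("F", "6 Oak Lane"),
}

print(accuracy(register, eligible))
print(completeness(register, eligible))
print(1 - completeness(register, eligible))  # rate of under-registration
```

In this toy example person C counts against both measures: the old entry is inaccurate, and C is not registered at the current address.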

    1.8 Great Britain's electoral registers 2011 (footnote 6) provided the first national estimates of the completeness of the electoral

    registers since estimates of the 2000 England and Wales registers, as well as

    the first national estimates of the accuracy of the registers since 1981. This

    study was funded by the Cabinet Office in order to inform the development of

    the approach to the introduction of individual electoral registration (IER).

    1.9 The research estimated the April 2011 Parliamentary registers to be 82.3%

    complete; the comparable figure for the local government registers was 82.0%. This equates to approximately 8.5 million unregistered people in Great Britain as

    of April 2011. However, this does not mean that these registers should have had

    8.5 million more entries, because many, but not all, of those not registered

    correctly may still have been represented on the registers by an inaccurate entry

    (for example, at a previous address).
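The link between the 82.3% completeness estimate and the figure of roughly 8.5 million unregistered people can be checked with simple arithmetic. In the sketch below the eligible population is not a reported figure: it is an approximation implied by the two published, rounded numbers.

```python
COMPLETENESS = 0.823         # April 2011 Parliamentary registers
UNREGISTERED_MILLIONS = 8.5  # approximate figure quoted in the text

# Under-registration is the complement of completeness.
under_registration = 1 - COMPLETENESS

# Implied size of the eligible population (an inference, not a reported figure).
implied_eligible_millions = UNREGISTERED_MILLIONS / under_registration

# Implied number of people correctly registered at their current address.
implied_registered_millions = implied_eligible_millions * COMPLETENESS

print(f"Under-registration rate: {under_registration:.1%}")
print(f"Implied eligible population: about {implied_eligible_millions:.1f} million")
print(f"Implied correctly registered: about {implied_registered_millions:.1f} million")
```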

    1.10 The April 2011 parliamentary registers were 85.5% accurate; the comparable figure for the local government registers was 85.4%.

    1.11 The research also demonstrates the extent to which both the accuracy and

    completeness of the registers deteriorate between the publication of the

    registers in December each year and the time when elections are usually held in

    the following spring. Although in December 2010 the estimated number of

    people not registered in Great Britain was at least six million, by April 2011 the

    number had grown to around 8.5 million (17.7%).

    Current system of updating the electoral registers

    1.12 At present, EROs use an annual canvass and rolling registration to update

    their registers. Individual electors can register to vote throughout the year by

    [...]. However,

    most updates to the registers take place during the annual canvass, which is

    undertaken each autumn. At its simplest, the canvass involves delivering a

    6 http://www.electoralcommission.org.uk/__data/assets/pdf_file/0007/145366/Great-Britains-electoral-registers-2011.pdf


    registration form to each household and following up, via postal reminders and

    personal visits, those households who do not respond. Revised registers are

    then published on 1 December.

    1.13 Almost all EROs use locally held data, such as council tax and housing

    records, to improve the effectiveness of their registration activity. However,

    EROs have not been able to make use of national databases in order to improve

    the quality of their local registers.

    Data matching and the move to individual electoral

    registration

    1.14 The previous UK Government, during the passage of the Political Parties

    and Elections Act 2009 (PPE Act), introduced legislation providing for the phased introduction of individual electoral registration (IER) in Great Britain. The

    PPE Act made provision for IER to be introduced in accordance with a statutory

    timetable. The PPE Act also included provisions to allow data matching pilot

    schemes to be carried out, with a view to establishing which national public

    databases might be useful to EROs in helping maintain electoral registers

    during the transition to IER.

    1.15 Under the PPE Act, data matching schemes approved by the Secretary of

    State would require a public or local authority to supply an ERO with data which

    they could then use for the purpose of maintaining complete and accurate

    registers.

    1.16 In June 2011 the Coalition Government published a White Paper setting

    out its plans to speed up the implementation of IER in Great Britain. The new

    system to be implemented from 2014 will require each elector to register

    individually (unlike the current system where registration takes place

    predominantly by household) and to supply personal information for verification

    purposes prior to names being added to the electoral register.

    1.17 The IER White Paper explained that the UK Government would explore how data matching could be used to

    identify people eligible to vote but missing from the register so that they can be invited to register. [7] If successful, the Government indicated that it would look at

    how data matching used in this way could be extended across the country and

    support the move to IER.

    7 HM Government (2011) Individual Electoral Registration, Cm 8108, p11.


    1.18 In February 2012 the UK Government published its response to pre-legislative scrutiny and public consultation on the IER White Paper. [8] In the

    response, the UK Government indicated its intention, subject to the results of

    the evaluation of pilot schemes and further testing, to widen the scope of data

    matching. The UK Government indicated that, rather than only using data matching to identify

    potential electors, it now intended that

    individuals currently on an electoral register would be matched against the data

    held by public bodies such as the DWP and local authorities themselves.

    Where an individual's details can be matched, the individual will be

    automatically placed onto the new IER register and would not need to take any further action. [9] Electors whose details could not be

    matched in this way would be asked to apply individually and to supply personal

    identifiers.

    1.19 The UK Government has acknowledged that this would represent a

    significant change to the position set out in the White Paper, which envisaged all

    potential electors applying individually and supplying personal identifiers, with

    data matching used as a means of identifying potential electors. It stated that it wanted

    an efficient and effective system ready in time to support the implementation of IER. [10]

    The Electoral Registration Data Schemes Order 2011

    1.20 The Electoral Registration Data Schemes Order 2011 (the 2011 Order),

    made on 9 June 2011, gave effect to proposals by local authorities to run data-

    matching schemes. Under the 2011 Order, an agreement between the data-

    holding organisation and the ERO needed to be in place before personal data

    could be shared between the two parties. The purpose of the agreement was to

    explain governance arrangements for data transfer and matching, explain the

    expected outputs and inputs for this process, set out information security

    standards, and detail timescales.

    1.21 The Cabinet Office was responsible for the selection and coordination of

    the schemes. The process for recruiting local authorities to run pilots was by

    8 HM Government (2012) Government Response to pre-legislative scrutiny and public

    consultation on Individual Electoral Registration and amendments to Electoral Administration

    law, Cm 8245.

    9 Ibid.

    10 Ibid.


    open application, with the Government wanting to see how people responded to

    the idea of using national databases to help maintain the electoral register.

    Aims and objectives of pilots

    1.22 The overall aim of the pilots was for EROs to test whether public

    databases can be useful for improving the accuracy and completeness of their

    electoral registers. However, in practice, the majority of pilots were more

    focused on completeness (finding people eligible to vote but missing from the

    register) than they were on accuracy (finding and removing inaccurate entries

    on the register).

    1.23 The detailed objectives of the schemes varied due to the open application

    process and lack of a common framework. Each authority submitted a proposal on how they would undertake a data matching exercise based on particular

    challenges in their area. These proposals varied in terms of both scale and

    focus. For example, some pilots matched their whole register with the available

    data while others targeted particular wards with historically low response rates to

    the annual canvass. Some areas were particularly focused on certain

    demographic groups, e.g. attainers or the over-70s, while others looked at all

    residents. The objectives of individual pilot schemes are examined in more

    detail later in this report.

    Role of the Commission

    1.24 The Commission was given a statutory responsibility to report on the

    effectiveness of the data matching schemes. The approach we have adopted is

    based on the requirements for an evaluation set out in Sections 35 and 36 of the

    PPE Act.

    1.25 The PPE Act requires the evaluation to include:

    a description of the scheme

    an assessment of the extent to which the scheme assists the ERO in

    meeting the registration objectives [11], which are:

    that persons who are entitled to be registered on a register are

    registered on it

    11 Registration objectives are set out in Section 31.8 of the PPE Act 2009.


    that persons who are not entitled to be registered on a register

    are not registered on it, and

    that none of the information relating to a registered person that

    appears on a register or other record kept by a registration

    officer is false

    whether there was an objection to the scheme, and if so how much

    how easy the scheme was to administer

    the extent to which the scheme resulted in savings of time and costs, or

    the opposite

    anything else specified in the order under Section 35. The 2011 Order did

    Our approach

    1.26 Our approach to the evaluation has been based on our statutory

    responsibilities outlined above. We have assessed:

    The administration of the pilots: the way the schemes were run, any difficulties experienced or lessons learned by local authorities, data

    holders, other organisations involved and the objections to the scheme.

    Data quality: the potential for data matching to improve the registration process.

    Resources: resources and skills necessary for administering the pilots, their costs and the extent to which data matching can result in cost and

    time savings.

    1.27 We worked with the Cabinet Office during the set-up of the schemes with the aim of allowing for an effective evaluation of each pilot as well as the

    schemes as a whole. We emphasised in particular the desirability of consistency

    across key components of the schemes (methodology, matching and follow-up

    process), in order that the findings could be compared across areas and across

    databases. However, the final decisions made by the Cabinet Office did not

    always reflect the advice given and no clear, common framework for the pilots

    was established.


    1.28 Together with the Cabinet Office, we monitored the work of the

    participating local authorities throughout the running of the schemes and were in

    contact with the authorities to provide assistance and address issues.

    1.29 The evaluation is based on a range of qualitative and quantitative data

    collected before, during and at the end of the process. Data and other evidence

    were collected from:

    Questionnaires from local authorities: each authority submitted a proposal before the start of the pilots which outlined their objectives and

    their approach to delivering the scheme.

    Data from local authorities: we designed a template, with the input of the Cabinet Office, for collecting data from the local authorities about the

    various databases and the results from the follow-up activities. Local authorities were asked to submit interim data (between August and

    October) and a final return with all results by 14 December 2011. However,

    not all authorities met this deadline or provided data in the format

    requested.

    Evaluation report from local authorities: all authorities were required to submit an evaluation of their pilot by 23 December 2011 using a template

    designed by us and the Cabinet Office. The report covers the key areas of

    the evaluation.

    Interviews with local authorities: we conducted individual interviews with each participating local authority between the end of October 2011 and the

    beginning of January 2012.

    Interviews with data-holders and software suppliers: we also conducted interviews with those organisations that hold the datasets being tested in

    the pilots and software suppliers who had assisted local authorities with

    the data.

    Regular contact with the Cabinet Office: we liaised closely with the Cabinet Office throughout the project and were part of a Registration

    Improvements Board, which monitored the progress of the pilots.

    This report

    1.30 This report considers the effectiveness of the data matching schemes in

    improving the accuracy and completeness of the electoral registers.


    1.31 The remainder of this report is divided into the following:

    Chapter 2 summarises the set-up and coordination of the pilot schemes

    by the Cabinet Office, including details of the selection process and issues

    relating to the timing of the pilots.

    Chapter 3 summarises the national databases included in the pilot

    schemes.

    Chapter 4 sets out details of each specific pilot area and issues

    encountered by the pilots in delivering the schemes.

    Chapters 5, 6, 7, 8 and 9 set out the key data, provided by the local

    authority pilots, for each of the national databases accessed. They review the

    quality of data returned to each local authority and the usefulness of that data in meeting the registration objectives set out for the schemes.

    Chapter 10 summarises the costs of the data matching schemes.

    Chapter 11 summarises the key findings and recommendations.


    2 Set-up and coordination

    2.1 This chapter sets out how the pilots were set up and coordinated by the

    Cabinet Office. It also considers the impact of the approach to the management

    of the pilots on the findings of the evaluation.

    Key points

    The Cabinet Office managed the overall pilot process and was also

    involved in the delivery of the pilots. The Commission advised the Cabinet

    Office throughout the set-up period, in particular on the need for a clear,

    common framework for delivering the pilots, but the final decisions on the

    processes were taken by the Cabinet Office.

    The open application and selection process used for the pilots, and the

    absence of a clear, common framework, contrary to our advice, led to

    significant variation in the planned approaches of the pilots. This

    introduced challenges for the evaluation in comparing the results of the

    different schemes and therefore the ability to draw consistent conclusions.

    This also created challenges for local authorities delivering the pilots

    because in several cases the methodology they originally planned to use proved to be based on incorrect assumptions.

    The timing of the pilots, which took place alongside the annual canvass,

    coupled with delays to the process, put pressure on the capacity of the

    local authority teams involved and added to the difficulty in this evaluation

    of drawing firm conclusions from the pilot schemes as a whole.

    Local authorities reported varying levels of communication with the

    Cabinet Office and identified areas for improvement, including a better

    understanding of what data was going to be shared with them.

    The pilots did not follow processes, in terms of the IT systems and

    matching arrangements, which would be used for nationwide data

    matching. The evaluation cannot therefore draw conclusions about how

    the costs of these pilots would translate to a national roll-out.


    Overview

    2.2 The Cabinet Office's role encompassed:

    design of the pilot framework, including drafting of secondary legislation

    that set out how the pilots were to operate and when the pilots were to be

    undertaken

    issuing to all local authorities an invitation to participate, and selecting

    which areas were to take part in the scheme

    overseeing the delivery of the pilots by local authorities

    negotiating with data-holders to allow for matching to take place

    ensuring appropriate confidentiality and data security agreements were in

    place with participating areas and data-holders

    developing the data matching process

    for some databases, overseeing the match with the register

    providing funding for the scheme and overseeing payments to data-holding organisations and local authorities

    2.3 There are several aspects of the set up and management of the schemes

    which introduced challenges for local authorities delivering the pilots. They have

    also made it more difficult for our evaluation to draw clear conclusions on the

    success of the pilots. These are considered below.

    Selection of the pilot schemes

    2.4 The Cabinet Office issued an invitation, in September 2010, to all local

    authorities in Great Britain to pilot data matching. To participate, authorities were

    required to submit a proposal outlining their objectives for data matching, how

    they would deliver the scheme, and estimated costs. Each of the authorities

    then provided further information about the proposed delivery of their pilot in

    order to inform the selection process.

    2.5 The Cabinet Office assessed the ability of the authorities to meet the

    requirements of the scheme and selected participants based on the quality of their application, taking into account the demographic groups they wanted to


    target, any innovative ideas they proposed and the estimated budget for the

    activity. The geographic spread of the final group of selected authorities was

    also considered. The final group of pilots was chosen in January 2011 and the

    statutory instrument for the schemes was confirmed in June 2011.

    Variable methodologies

    2.6 As noted, local authorities were encouraged to submit their own proposals

    and suggestions as to how data matching might work in their area. The Cabinet Office's

    rationale for the open application process was to allow local authorities

    to identify ways in which data matching might help them to address the

    particular challenges or target audiences relevant to their local area.

    2.7 While there are advantages to encouraging ideas and innovative

    approaches from local authorities, we consistently stressed, in our advice to the

    Cabinet Office, the need for a clear framework for the pilots, which would

    provide consistency in delivery and therefore allow for an effective evaluation.

    We also formally raised this need as part of our response to the Cabinet Office

    consultation on the Electoral Registration Data Schemes Order 2011 and the

    Representation of the People (Electoral Registration Data Schemes) Regulations

    2011. [12]

    2.8 However, no specific instructions were given to local authorities about

    how to implement the schemes, and no clear framework was put in place to ensure consistent delivery, although some support was available from the

    Cabinet Office.

    2.9 The absence of a clear framework for delivery meant that a wide variety of

    approaches were adopted for implementing pilots and this wide variation has

    made it more difficult to draw clear comparisons in this evaluation. For example,

    authorities differed in how they treated the match scores in the data returned to

    them (for more information see Chapter 5). A register entry which was matched

    against an entry on the Department for Work and Pensions (DWP) Centric database would be scored between 10 and 100 depending on the exact nature

    of the match. Some areas chose to treat all scores above 55 as a match while

    others chose all scores above 80. The Cabinet Office did not attempt to impose

    any standardisation of approach. This has implications when comparing the

    quality of the results across local authorities.
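The effect of differing cut-offs can be illustrated with a short sketch. The scores below are invented for illustration; only the 10 to 100 score range and the two thresholds (55 and 80) come from the pilots, and `count_matches` is a hypothetical helper, not part of any pilot's software.

```python
# Hypothetical match scores (on the 10-100 scale) for the same five register entries.
scores = [95, 82, 70, 58, 40]


def count_matches(scores, threshold):
    """Count entries whose score is above the given threshold."""
    return sum(1 for score in scores if score > threshold)


# One authority treats every score above 55 as a match...
lenient = count_matches(scores, 55)
# ...while another only accepts scores above 80.
strict = count_matches(scores, 80)

print(f"Matched above 55: {lenient} of {len(scores)}")
print(f"Matched above 80: {strict} of {len(scores)}")
```

With identical underlying data, the two authorities in this sketch would report different match rates, which is why results cannot be compared directly without knowing the threshold each area applied.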

    12 www.electoralcommission.org.uk/__data/assets/pdf_file/0011/117695/Electoral-Commission-consultation-response-Data-matching-SI.pdf


    2.10 The approach adopted for contacting people identified as a result of data

    matching varied across authorities, involving either one or more letters to names

    identified or one or more visits by canvassers to the addresses associated with

    those names, or a combination of both letters and visits. The variety of

    approaches taken complicates the analysis of the results, as a high response rate in one pilot may have more to do with the use of canvassers than the quality

    of the data for that area.

    2.11 Finally, the open nature of the application process meant that the initial

    proposals also often made assumptions about certain processes or criteria

    being in place for delivering the pilot schemes. Consequently, when some of

    these assumptions proved to be incorrect, authorities struggled to deliver the

    data matching scheme. As one local authority set out in their evaluation report:

    'maybe for future work, the Cabinet Office needs to be a little more

    prescriptive on the processes and outcomes it requires.'

    Timing

    2.12 The data matching schemes had originally been due to commence in June

    2011 with all activities (including evaluations) to be completed by September

    2011. During this set-up period, we emphasised the importance of avoiding

    significant overlap with the annual canvass.

    2.13 However, the Cabinet Office decided to allow pilot activity to continue until

    the end of November 2011 with evaluations taking place afterwards. In addition,

    there were delays at the outset that compounded the problem. The authorities

    had expected to receive the data in late June 2011. However, due to delays in

    ensuring all the necessary technical arrangements and data access agreements

    were in place, the matching of the registers did not commence until July to August

    for most authorities. These delays meant that a number of authorities had to

    adapt their approach to testing the data because they no longer had the

    resources available to manage the process or because they had anticipated

    contacting residents in advance of their canvass beginning, but were no longer

    able to do so. One local authority commented:

    'Slipping of the timetable made it impossible to complete the pilot as it

    was first intended.'

    2.14 Running the pilots alongside the annual canvass added a complicating

    factor both for the delivery of the pilot schemes and also for assessing the value

    of data matching. It also had an impact on the capacity and resources of authorities to use the data returned to them (these issues are considered further


    in Chapter 4). It was in anticipation of these problems that we raised concerns

    during the initial planning phase about the proposed timing of the schemes.

    Control groups

    2.15 For most areas it was not therefore feasible to contact local residents

    before the annual canvass had begun across their area. To address this issue,

    we encouraged pilots to create control groups of names identified from the

    national data, where no dedicated follow up would take place and the names

    would subsequently be tracked in the annual canvass.

    2.16 This was intended to determine how many would have been registered

    anyway in the absence of the pilot. However, not all the authorities were able to

    put in place a clear process for separating out the canvass from the data

    matching activities and often people identified to be followed up by letter were

    found to have already registered through the canvass.
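The logic of a control group can be sketched as a simple comparison of registration rates. All figures below are invented for illustration and are not pilot results; `registration_rate` is a hypothetical helper.

```python
def registration_rate(registered, group_size):
    """Share of a group of identified names that ended up on the register."""
    return registered / group_size


# Illustrative figures only: names that received dedicated pilot follow-up.
follow_up_rate = registration_rate(registered=120, group_size=400)

# Illustrative figures only: control names left to the annual canvass alone.
control_rate = registration_rate(registered=80, group_size=400)

# The difference approximates the registrations attributable to the pilot
# follow-up rather than to the canvass.
additional = follow_up_rate - control_rate

print(f"Follow-up group: {follow_up_rate:.0%}")
print(f"Control group:   {control_rate:.0%}")
print(f"Attributable to follow-up: {additional:.0%}")
```

The comparison only holds if the control names really receive no dedicated follow-up, which is the separation that some authorities were unable to maintain.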

    2.17 For the purposes of this evaluation this means that data on the response

    rates for those names followed up by pilot authorities has to be viewed in the

    context of how the authority was able to manage the two processes of the pilot

    and the annual canvass.

    2.18 As the example below shows, in many cases the fact that people were

    registering through the canvass depressed the response rate to letters issued through the pilot process.


    Effect of the canvass on pilot response rates

    The matching process suggests 500 names that appear to be resident in the

    area (because they appear on another database) but are not found on the

    electoral register.

    The canvass has already begun by the time the authority is in a position to write

    to these individuals and when the 500 names are checked against canvass

    returns 150 are found to have registered already.

    The pilot can only therefore write to the remaining 350 names.

    From the 350 letters issued, 50 respondents register to vote equating to a 14%

    response. There are fewer responses because it is very likely that there will be

    proportionately more incorrect names, ineligible people or people less likely to register among the 350 than among the original 500. This is mainly because 150

    people who are resident, eligible and interested have already been removed.

    But if all 150 had responded to the letter as they did to the canvass form the

    response rate would have been 40% and even if only half (75) had responded it

    would still have been notably higher at 25%.
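The arithmetic in this example can be checked with a few lines of code. The figures are those given in the example; the variable names and the `rate` helper are illustrative only.

```python
identified = 500          # names suggested by the matching process
already_registered = 150  # found in canvass returns before letters were sent
letter_responses = 50     # registrations resulting from the letters

letters_sent = identified - already_registered  # 350 letters issued


def rate(responses, contacted):
    """Response rate as a whole-number percentage."""
    return round(100 * responses / contacted)


# Rate actually observed by the pilot: 50 responses from 350 letters.
observed = rate(letter_responses, letters_sent)

# Rate if all 150 canvass registrants had instead responded to a pilot letter.
if_all_responded = rate(letter_responses + already_registered, identified)

# Rate if only half of them (75) had responded to a pilot letter.
if_half_responded = rate(letter_responses + already_registered // 2, identified)

print(observed, if_all_responded, if_half_responded)
```

The three rates correspond to the 14%, 40% and 25% quoted in the example.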

    Communication

    2.19 The Cabinet Office had intended to run monthly meetings with the pilot

    areas. While some meetings took place, they were less frequent and more

    sporadic than had originally been anticipated. Notwithstanding this, the Cabinet

    Office also made themselves available to local areas to discuss issues and

    this was noted by several authorities. For example, one authority reported:

    'Throughout the project general communication with the Cabinet Office

    and the provision of update information was effective.'

    2.20 However, some areas also commented that in the immediate run-up to the

    matching taking place they noticed decreasing contact from the Cabinet Office.

    They did not feel fully informed about changes to the process and in some

    instances noted that queries went unanswered.

    2.21 For example, they had expected that the data returned from the DWP

    would include unique property reference numbers to ensure that addresses on

    their electoral register could be found on the DWP database. They had also

    thought that the data would include dates of record changes. Neither of these


    elements was included in the data returned. Some of these things were crucial

    to the effectiveness of the pilots and are discussed further below.

    2.22 Several pilot authorities also indicated that they did not know what the

    format or layout of the data matching results would be before they were sent to

    them. Practically, these issues meant there was a period of confusion among

    several pilots when they initially received the results of the matching activity.

    2.23 Several pilot authorities and the DWP felt that it would have been beneficial

    to have had more direct communication, rather than always using the Cabinet

    Office as a go-between. This may have helped to ensure the pilots were more

    up to date about the process and in a better position to interpret the outputs

    from the matching process.

    Scalability

    2.24 The technical matching processes and the IT systems used in these pilots

    could not be scaled up and rolled out across Great Britain. For example, data

    files were sent to and from local authorities by email, with some matching

    carried out by DWP directly and some by a team within the Cabinet Office (see

    Chapter 3 for more details). The approach worked for this limited number of

    pilots but would not be sustainable for every local authority in Great Britain.

    2.25 This also has an impact on the analysis of the costs of these pilots as the

    individual budgets relate to processes which would not be replicated. As a result

    this evaluation can make only limited comment on the value for money of data

    matching.

    2.26 Nonetheless, running the data matching schemes has allowed for a

    greater understanding of the requirements of any framework for national data

    matching.


    3 Databases and the matching process

    3.1 This chapter considers the national public databases that were included in

    the pilot schemes. As noted above, the Cabinet Office arranged for access to a

    range of public databases through discussions with the relevant data-holding

    organisations. The databases that could be accessed by each pilot were then

    set out in the statutory instrument for the schemes. [13]

    Key points

    Ten databases were due to be tested as part of the scheme.

    However, not all were tested to the same extent. In particular, there were

    difficulties in accessing Higher Education Funding Council for England

    (HEFCE) and Royal Mail data.

    A key component of the trials was the matching process between

    databases to identify people to invite to register to vote. The processes

    employed could not be rolled out nationally but do allow for a greater

    understanding of the requirements of any framework for national data

    matching.

    Two different processes were used for matching the electoral registers with

    the Department for Work and Pensions (DWP) data on one hand and the

    Driver and Vehicle Licensing Agency (DVLA) and education databases on

    the other. These different rules make it difficult to compare the results from

    the two processes.

    Overview of databases

    3.2 Broadly, the databases fall into two groups:

    the DWP's Centric database (everyone

    with a national insurance number), and the DVLA driver database (the

    13 www.cabinetoffice.gov.uk/sites/default/files/resources/schemes-order-draft.pdf


    Department for Transport estimated that in 2010 80% of men and 66% of

    women had a driving licence)

    the education and Ministry of Defence databases

    3.3 Table 1 sets out which databases were included in the statutory

    instrument. It also sets out the coverage of each database and a brief overview

    of how they are updated. While each database contains different information,

    the pilots only accessed the specific fields needed to match to the electoral

    registers: name, full address and, in some cases, date of birth (so although, for

    example, the DWP's Centric database includes national insurance numbers, this

    information was not included in the data supplied to local authorities).

    Access to the data

    3.4 Between them, the participating authorities were due to test all the

    databases included in Table 1. However, there were some difficulties in

    accessing some of the databases, which meant that authorities were not able to

    use them as had originally been anticipated.

    HEFCE data

    3.5 HEFCE decided not to provide the data directly to local authorities, instead

    restricting access to a computer screen at the Cabinet Office. This meant that local authorities could not adequately compare the data against their registers

    or locally held data. It also prevented them from testing the quality of the data

    through contacting any of the names on the HEFCE database but not on the

    register.

    Royal Mail data

    3.6 There were delays in Royal Mail agreeing and signing the Article 4

    agreement which was required before data could be transferred. By the time

    that the legal agreements were in place only one pilot was still interested in the data (Colchester). The data was therefore matched but was only available to be

    sent to Colchester on 30 November when the staff at the local authority were

    participating in strike action. The data was therefore not sent to Colchester

    although they would not have been able to make significant use of it at that

    point anyway.


    Table 1: Databases key information

    Organisation | Database | Coverage of pilot data | Updates
    Department for Work and Pensions (DWP) | Centric | All those with either a national insurance number or a child reference number | Information is updated daily by a range of sources including benefits offices, pension providers and employers
    Driver and Vehicle Licensing Agency (DVLA) | Driver database | All those holding a provisional or full driving licence | Driver details are updated online or by form when the driver provides the information
    Department for Education (DfE) | National Pupil Database (NPD) | All pupils in state or partially state-funded schools in England | Information is collected annually from each school via the relevant local authorities
    Department for Business, Innovation and Skills (BIS) | Individual Learner Record (ILR) | All learners at state-funded further education institutes | Information is collected at set points during the year
    Student Loans Company (SLC) | Customer database | All current students with a loan or grant | Student initiated: details are updated online, by phone or by form
    Ministry of Defence (MoD) | Joint Personnel Administration | All service voters | Ad hoc updates by individual service voters
    Ministry of Defence (MoD) | Anite housing database | All addresses classed as service family accommodation | Centrally managed by Anite


    Table 1: Databases key information (continued)

    Organisation | Database | Coverage of pilot data | Updates
    Improvement Service [14] | Citizens Account | All individuals who chose to maintain an electronic record | Ad hoc updates by individuals and updates linked from other sources (where consent has been given)
    Higher Education Funding Council England (HEFCE) | Higher Education Statistics Agency (HESA) individualised student record | All students at state-funded higher education institutions | Information is collected annually from higher education institutions
    Royal Mail | Change of Address | All those who register their change of address with the Royal Mail | Information is provided directly by home movers close to the time they move house

    14 The Improvement Service is a partnership between the Convention of Scottish Local Authorities (COSLA) and the Society of Local Authority Chief Executives

    (SOLACE). It is a company limited by guarantee.


    The matching process

    3.7 The first step in the process was for participating areas to provide their

    electoral registers (either to the data-holding organisation or to the Cabinet

    Office) for matching. The matched data was then returned to the local

    authorities, who used the data to decide who to contact to register. In practice

    this meant that, following interrogation, local authorities followed up names

    found on the national databases and not on their register. Figure 1 below

    illustrates how the process of data matching broadly worked.

    Figure 1: The pilot process

    1. All or some of electoral register extracted by pilot authority

    2. Sent as an encrypted ZIP file by secure email to DWP/Cabinet Office or MoD

    3. Match file produced detailing results of process

    4. Sent as an encrypted ZIP file by secure email to relevant pilot authority

    5. Results checked against local data sources for an additional level of check

    6. Identify names to be followed up either to encourage registration or to query validity of existing registration

    7. Follow up activity, e.g. issuing letters or sending canvassers

    8. Responses resulting in new registration, deletion, amendment or no action
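
    The step of identifying names to follow up can be sketched in code. This is a minimal illustration only, not the method used in the pilots: it assumes each record is a dictionary with hypothetical `name` and `address` fields, and it treats two records as the same person only when both fields agree after simple normalisation.

```python
def normalise(value: str) -> str:
    """Upper-case and collapse whitespace so trivial differences are ignored."""
    return " ".join(value.upper().split())


def record_key(record: dict) -> tuple:
    # 'name' and 'address' are hypothetical field names used for illustration.
    return (normalise(record["name"]), normalise(record["address"]))


def names_to_follow_up(register: list, national: list) -> list:
    """Return national-database records with no counterpart on the register."""
    on_register = {record_key(r) for r in register}
    return [r for r in national if record_key(r) not in on_register]


# Invented example data.
register = [{"name": "Ann Smith", "address": "1 High Street"}]
national = [
    {"name": "ann  smith", "address": "1 high street"},
    {"name": "Bob Jones", "address": "2 High Street"},
]
follow_up = names_to_follow_up(register, national)
```

    In this invented example only the second national record would be followed up, because the first agrees with a register entry once case and spacing are ignored.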


    3.8 The match file returned to each local authority contained register entries and

    the national database records they had been matched against with the

    accompanying match score (see below for further information). It also contained

    those records which did not match either register entries or national database

    entries.

    Data transfer arrangements

    3.9 Throughout these pilot schemes data was transferred as attachments by

    secure email. However, one data-holding organisation stressed to us that this

    was not their preferred method for sending sensitive data and that their future

    involvement in any further piloting would be at least partially dependent on more

    robust data transfer processes being put in place.

    3.10 In addition, the use of email attachments led, in one instance, to the match

    file for one local authority (containing electoral register entries and data from one

    national database) being returned to another. In this case the mistake was

    swiftly identified and the local authority that wrongly received the data deleted

    the file. However, it is clearly important that any future data matching system

    (potentially involving hundreds of local authorities) avoids such errors.

    Variation in matching processes

    3.11 Although Figure 1 provides a generic step-by-step guide to the data

    piloting process there were three separate processes in relation to the accessing and matching of the registers to the different databases:

    For the match with data from the DWP Centric database, the matching

    process was carried out by the DWP and the results provided to the local

    authorities.

    For the match with the MoD data, the matching of personnel records to the

    register was completed by the MoD and the results provided to local

    authorities.

    The matching with all other databases was carried out by staff within the

    Cabinet Office and the results were returned to authorities in a single

    match file.

    Matching process for the Department for Work and Pensions

    3.12 The process used for matching against the DWP Centric database was a

    new, previously untested approach, designed by the Cabinet Office. It used the

    first name (F), surname (S), first line of address (A) and the postcode (P) from

    register entries in order to match them against the DWP Centric database. Each

    of these variables could be matched either exactly or fuzzily. Fuzzy matches include

    examples where there is a small difference in spelling, where one name sounds

    like another or where one name includes another.
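
    The distinction between exact and fuzzy matches can be illustrated with a short sketch. It is not the Cabinet Office's algorithm, which the report does not specify. It covers only two of the three kinds of fuzzy match described (small spelling differences and one name including another) and leaves out the "sounds like" comparison. The similarity threshold and minimum length are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher

SIMILARITY_THRESHOLD = 0.8  # assumed cut-off for a "small difference in spelling"
MIN_CONTAINED_LENGTH = 3    # assumed guard so single letters do not count as included names


def classify_name(a: str, b: str) -> str:
    """Classify two names as an 'exact' match, a 'fuzzy' match or 'none'."""
    x, y = a.strip().upper(), b.strip().upper()
    if x == y:
        return "exact"
    shorter, longer = sorted((x, y), key=len)
    # One name includes another, e.g. a short form inside a longer name.
    if len(shorter) >= MIN_CONTAINED_LENGTH and shorter in longer:
        return "fuzzy"
    # Small difference in spelling, measured by character similarity.
    if SequenceMatcher(None, x, y).ratio() >= SIMILARITY_THRESHOLD:
        return "fuzzy"
    return "none"
```

    A phonetic comparison would be needed to cover names that sound alike but are spelled quite differently.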

    3.13 A score was assigned to a match depending on the interaction of the four

    variables and whether they were exactly or fuzzily matched. In the list below

    fuzzy matches are denoted by an apostrophe. So, for example, a score of 80

    would be awarded for a fuzzy match first name and surname and an exact

    match postcode and first line of address.

    F S P A = 100

    = 99

    = 95

    = 94

    = 90

    F S A = 85

    F P A = 50

    F S P = 65

    = 60

    = 45

    = 40

    P A = 20

    = 10
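
    The scoring can be read as a lookup from the combination of matched fields to a score. The sketch below is illustrative only and encodes just the five combinations that are legible in the list above, on the assumption that these are the exact-match combinations. Combinations involving fuzzy matches are not reproduced, so the function returns `None` for anything else. The normalisation step is also an assumption.

```python
from typing import Optional

# F = first name, S = surname, P = postcode, A = first line of address.
# Only the combinations legible in the list above are included.
EXACT_SCORES = {
    frozenset("FSPA"): 100,
    frozenset("FSA"): 85,
    frozenset("FSP"): 65,
    frozenset("FPA"): 50,
    frozenset("PA"): 20,
}


def normalise(value: str) -> str:
    """Assumed normalisation: upper-case and collapse whitespace."""
    return " ".join(value.upper().split())


def exact_fields(register: dict, national: dict) -> frozenset:
    """Return the set of fields on which the two records agree exactly."""
    return frozenset(
        f for f in "FSPA" if normalise(register[f]) == normalise(national[f])
    )


def score(register: dict, national: dict) -> Optional[int]:
    """Look up the score for the matched fields, or None if not in the table."""
    return EXACT_SCORES.get(exact_fields(register, national))


entry = {"F": "Ann", "S": "Smith", "P": "AB1 2CD", "A": "1 High Street"}
moved = dict(entry, P="ZZ9 9ZZ")  # same person and address line, different postcode
```

    Here `score(entry, entry)` gives 100 and `score(entry, moved)` gives 85, matching the F S P A and F S A rows.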

    3.14 The matching process used for all the other databases, apart from those

    owned by the MoD, was explained by the Cabinet Office as follows:

    The matching process used for all the other databases (apart from

    MoD) was based on a complex matching process contained in an IBM

    proprietary product (IBM was commissioned to provide the central hub

    services). This approach either marked each record as unmatched or gave it a score ranging from 81 to 118. The matching algorithm used in

    this process was very sophisticated but (unlike at DWP) tended only to

    identify firm matches in the great majority of cases.

    Impact of variation

    3.15 For the purposes of evaluating the comparative strengths and weaknesses

    of these databases in updating the electoral registers, the use of several

    processes was not ideal. It also added to confusion among the local authorities

    over how to interpret the data and hampered attempts to cross-reference information provided by the DWP with information from other databases.


    3.16 However, the most important difference was that the process used for the

    DWP Centric match was less strict than that used for matching against the DVLA

    and education databases. As a result matches against the DVLA, for example,

    which would have matched (at least partially) through the DWP process were not counted as matched for DVLA. This has clear implications for the

    comparability of the results from the DWP match and the other databases.


    4 Pilot authorities: overview and emerging issues

    4.1 This chapter sets out details of each of the pilot schemes in terms of the

    databases they accessed and the groups or areas they targeted. It goes on to

    consider some of the key issues identified by the local authorities in the delivery

    of the data matching pilots.

    Key points

    There were 22 data matching pilots testing various combinations of databases.

    The complex nature and format of the data returned to local authorities

    highlighted the need for good data management and analysis skills. Many

    of the pilots either had these skills available within the local authority or

    recruited additional staff using Cabinet Office funding. However, several

    did not and struggled to use the information provided.

    The process, as tested in these pilots, was labour intensive with significant

    work required to analyse the data. Those involved felt that the level of work

    required would not be sustainable in the future.

    A number of pilot authorities were able to use locally-held data to

    interrogate the data received from the national databases. The results of

    this activity suggest that there is scope for more use to be made of local

    data both to complement any future national data matching and to

    improve accuracy and completeness in general.

    Overview

    4.2 Twenty-two local authorities were selected by the Cabinet Office to take

    part in the data matching schemes. Table 2 provides the full list of each

    participating authority and which databases they planned to access. It also

    outlines whether or not they matched their full register or part of their register,

    and which groups they were targeting as part of the data matching scheme.

    4.3 These differences should be remembered when considering the results

    from each pilot. In addition, some local authorities opted to conduct a targeted


    follow up either in specific areas or with specific groups, while others followed

    up with a random sample of names from across their area. The results from these

    different exercises are not, therefore, always comparable.

    4.4 Further detail on each pilot's approach and results is provided in the

    profiles in Appendix A.


    Table 2: Data matching pilots overview

    Local authority | Database(s) requested | Target groups | Area
    Blackpool | DWP Centric, NPD, Royal Mail, ILR, HEFCE, DVLA | Empty properties in low responding areas | Six electoral wards
    Camden | DWP Centric, ILR, SLC, NPD, HEFCE | Students, young people and the mobile population | Whole register
    Colchester | DWP Centric, SLC, MoD, Royal Mail | General under-registered and service personnel | Whole register
    Forest Heath | DWP Centric | Young people and the mobile population | Whole register
    Forest of Dean | DWP Centric, DVLA, ILR, NPD, HEFCE | Attainers | Whole register
    Glasgow | DWP Centric, DVLA, SLC | Students, young people and the mobile population | Two electoral wards
    Greenwich | DWP Centric, DVLA, ILR, NPD, HEFCE, MoD | Young people, BME groups and those under-registered for financial reasons | Whole register
    Lothian | DWP Centric | General under-registered | Whole register
    Manchester | DWP Centric, DVLA, SLC | Empty properties, students and BME groups |
    Newham | DWP Centric | Young people and the mobile population | Whole register
    Peterborough | DWP Centric | Seasonal workers and those living in houses of multiple occupation (HMOs) | One electoral ward
    Renfrewshire | Citizen Account | General under-registered | Whole register
    Rushmoor | MoD | Service personnel | Service voters list
    Shropshire | MoD | Service personnel | Service voters list
    Southwark | DWP Centric, Royal Mail | General under-registered | Three electoral wards


    Emerging issues

    4.5 There are several key evaluation findings which relate to the different skills,

    capacity and experience of the local authorities involved in the pilots.

    Skills

    4.6 There was significant variation between the pilots in terms of the skills

    available within the local authority as a whole. Each pilot was generally led by

    electoral administrators, who used support available to them within their team,

    within the local authority or externally.

    4.7 The pilot authorities divided into three groups with regard to how they

    managed the data:

    Those who managed the pilot within their existing electoral services team

    and who had access to existing data management or IT expertise within

    the wider local authority

    Those who intended to manage the pilot within their existing electoral

    services team with no dedicated local authority data handling team

    Those who used pilot funding to recruit additional, temporary staff for the

    purposes of data analysis and data management

    4.8 Those authorities with data management support were able to interrogate

    the information provided to a much greater extent, while others were unable to

    do so and used the data as it was provided. In the absence of data analysis and

    management skills it was possible to work manually through small

    volumes of data but with some authorities receiving hundreds of thousands of

    lines of data there was a clear need to automate some of the process. In

    extreme cases the electoral services team were simply overwhelmed by the

    volume of data provided and in the absence of data management skills could do little with the information. For example, one area explained:

    Because of the unexpectedly large numbers of apparently probable new

    identities found in the data matching, and the quantity and difficulty of

    dealing with such large volumes of data, it was decided to limit the

    number of potential electors


    4.9 Another area noted:

    Some of the technical issues relating to local management of received

    data, and conversion to a form that could be used for our purposes,

    needed considerable time to deal with, and suggests there is a need to

    develop a range of data management knowledge and skills not available

    within the current local electoral services team.

    4.10 In a couple of pilots, as a result of delays or a lack of skills and capacity,

    no follow-up activity was undertaken even where data matching identified

    people potentially eligible but not on the registers.

    4.11 Electoral administrators in the authorities with good data management

    support were also clear that they could not have coped with this volume of data

    in the absence of that support as they do not have the skills themselves.

    4.12 Although any future roll-out of data matching across the country would not

    follow the process used in these pilots, any process which requires electoral

    administrators to manipulate and analyse data would require a change in the

    skill sets of many electoral registration teams. The closer any data matching

    process gets to an automated provision of lists of potential new electors (which

    can be easily integrated into the software used to manage the registers), the

    smaller the required change will be.

    Capacity

    4.13 In addition to the skills required to make full use of this data there was a

    more general need for additional capacity within many teams. The volume of

    data provided meant that several areas could not do as much with the

    information as they had originally intended. This was exacerbated by the delays

    which pushed the process further into the canvass period.

    4.14 Several pilots raised concerns that in its current format the process of data

    matching was too labour intensive for regular use. Many also pointed out that

    they are currently facing significant cuts in budgets and as a result they can only

    see a future for data matching if it is able to improve registration with no

    additional, or a reduced, financial burden for the authority.

    4.15 This is significant not just because local authorities are unlikely to be

    expanding their electoral services teams in the near future but also because it

    calls into question the likelihood of significant resources being devoted to


    training existing staff or providing new data management software without clear

    evidence that it may lead to cost savings later. For example, one area reported

    that:

    Currently there are too many records to make this a viable exercise with

    the resources available.

    Local data matching

    4.16 Related to the availability of relevant data management skills is the

    variation between the pilots in their existing use of locally-held data to assist with

    electoral registration. EROs have powers to inspect other data held by the local

    authority for the purposes of maintaining the register and the vast majority of

    EROs make some use of information, e.g. from council tax records. However,

    there is substantial variation in both the range of data accessed and the

    methods by which it is used.

    4.17 For example one of the pilots, Newham, has developed a system for use

    across the authority, which draws together information from sources including

    council tax, housing benefit, libraries and leisure centre records to create a

    searchable electronic database of residents in the borough. But another similar

    authority only regularly accesses council tax records provided in Excel

    spreadsheet format. The different starting points of these two pilot authorities

    coming into the pilot process meant that while the first could draw on the expertise built up during the development of their in-house system, the latter

    needed

    4.18 This also meant that Newham could check the data provided through the

    pilot against the data they already held on their systems, gathered locally. As a

    result Newham only issued letters to names which were suggested by DWP

    Centric and could be corroborated by local data. As the data in Chapter 5 indicates, this did not result in significantly higher registrations than other areas

    but, unlike several pilots, they received very few responses indicating that the person written to was no longer resident.
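
    The corroboration step described here can be sketched as a simple filter. This is an illustration under stated assumptions, not Newham's system: records are dictionaries with hypothetical `name` and `address` fields, local sources are supplied as a dictionary of lists, and a candidate counts as corroborated if it appears in at least one local source after simple normalisation.

```python
def normalise(value: str) -> str:
    """Upper-case and collapse whitespace so trivial differences are ignored."""
    return " ".join(value.upper().split())


def record_key(record: dict) -> tuple:
    # 'name' and 'address' are hypothetical field names used for illustration.
    return (normalise(record["name"]), normalise(record["address"]))


def split_by_local_evidence(candidates: list, local_sources: dict) -> tuple:
    """Split candidates into those found in any local source and those not."""
    local_keys = set()
    for records in local_sources.values():
        local_keys.update(record_key(r) for r in records)
    corroborated = [c for c in candidates if record_key(c) in local_keys]
    uncorroborated = [c for c in candidates if record_key(c) not in local_keys]
    return corroborated, uncorroborated


# Invented example data.
candidates = [
    {"name": "Bob Jones", "address": "2 High Street"},
    {"name": "Cara Lee", "address": "3 High Street"},
]
local_sources = {
    "council_tax": [{"name": "BOB JONES", "address": "2 High Street"}],
    "libraries": [],
}
to_write_to, held_back = split_by_local_evidence(candidates, local_sources)
```

    Only candidates in the first list would be sent a letter; the second list could be reviewed separately or left for the canvass.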

    4.19 This evidence is not conclusive but it does suggest that there is significant

    scope for more use to be made of local data.

    Conclusion

    4.20 This chapter has provided an overview of the pilot schemes, setting out the

    databases they accessed and the groups or areas they targeted. It has also considered the key issues that cut across all the pilot areas: the skills and


    capacity of electoral services teams as well as the importance of good use of

    local data.

    4.21 The following chapters go on to consider the results of the matching

    exercise and follow up, in turn, for each national database.


    The absence of a unique identifier attached to each address on the

    national databases was a key issue for the pilots. These would have

    allowed for a more straightforward matching process for local authorities.

    Many of the potential new electors suggested by the match with the DWP Centric database proved to be based on out-of-date or incorrect

    information. The problems posed by this could have been reduced by the

    inclusion of the date when the DWP record changed, something which

    DWP were willing to provide but were not asked for.

    The absence of nationality information meant that several pilot authorities

    conducted follow up with people ineligible to register. The scope of this

    issue is not clear from these pilots and will have varied depending on the

    demographic nature of the local authority area.

    The Government's proposal, set out in response to pre-legislative

    scrutiny of their policy on individual electoral registration, to use national

    data sources to verify the identity of electors has not been tested by these

    pilots. Further piloting would be required to ensure that the advantages

    and disadvantages of these proposals are understood.

    Matching results

    5.2 Eighteen of the 22 pilot areas accessed data from the DWP Centric

    database. The results of the matching process are presented in Table 3 and are

    based on the data supplied by local authorities.

    5.3 The results vary considerably across different local authorities and this is

    partly explained by the different approaches adopted by each pilot area (as set

    out in Chapter 2). For example, the Stratford pilot, with the highest reported level

    of match, focused its attention on attainers and the over-70s which made it

    more likely they would see a higher match level (as both groups are less likely to

    change address frequently).

    5.4 Also, as mentioned above, the match score at which a pilot accepted a

    result as a match varied significantly and this has led to greater variation in the

    data returned (see Chapter 3 for a full explanation of the match process and

    scores). For example, Peterborough appears to record a relatively low match

    level (54.7%) but they accepted only those records which scored 99 and above

    as a match. Wigan records a high match rate (82.4%) but accepted all records

    scoring 45 and above as a match. If a score of 65, between those two levels


    was accepted as a match,

    5.5 Those pilots with the ability to fully interrogate the data found that the

    likelihood of a match did not necessarily increase with the score. For example,

    Colchester found that while those records that scored 65 were very likely to be

    real matches, there were many with scores above that which proved to be false

    positives, i.e. they received a high match score but, on checking, were found not

    to be true matches. Where a local authority had this ability to interrogate the data

    received from DWP, the match rate cannot easily be compared like-for-like with

    those that did not.
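
    The effect of the acceptance threshold on the reported match level can be shown with a small worked example. The scores below are invented for illustration and are not pilot data. The function simply counts the share of register entries whose score meets a chosen threshold, with `None` standing for an entry that was not matched at all.

```python
from typing import Optional, Sequence


def match_rate(scores: Sequence[Optional[int]], threshold: int) -> float:
    """Percentage of register entries whose match score meets the threshold."""
    if not scores:
        raise ValueError("at least one register entry is required")
    matched = sum(1 for s in scores if s is not None and s >= threshold)
    return 100.0 * matched / len(scores)


# Hypothetical scores for ten register entries; None means no match was found.
scores = [100, 99, 85, 65, 45, 20, None, 100, 65, 45]

strict = match_rate(scores, 99)   # accept only scores of 99 and above
middle = match_rate(scores, 65)   # accept scores of 65 and above
lenient = match_rate(scores, 45)  # accept scores of 45 and above
```

    With these invented scores the same ten entries give match levels of 30%, 60% and 80% depending on the threshold, which is why match levels reported by pilots using different thresholds are not directly comparable.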

    5.6 The data presented in Table 3 should therefore be treated with some

    caution. Nonetheless the results show:

    There is substantial variation across local authorities regarding the level of

    match between the electoral registers and the DWP Centric database

    ranging from 45.7% to 85.3%.

    In total, 1,925,336 register entries were sent for matching and 1,370,006

    were found on the DWP Centric database.15

    That equates to a match level of 71.2%.

    The percentage of register entries sent for matching but not found on DWP

    Centric varied across local authorities from 12.4% to 47.6%.

    The number of records foun