The National Archive Sensitivity Review by Tim Gollins

Project Abacá: Technically Assisted Sensitivity Review of Digital Records. Tim Gollins, Michael Moss, Craig Macdonald, Iadh Ounis, Norman Gray, Graham Mcdonald, James Girdwood, John Thompson (FCO). 20th June 2014


IT as a Utility Network+ community conference 19-20 June 2014, Southampton (ITaaU Network+)

Transcript of The National Archive Sensitivity Review by Tim Gollins

Page 1: The National Archive Sensitivity Review by Tim Gollins

Project Abacá: Technically Assisted Sensitivity Review of Digital Records

Tim Gollins, Michael Moss, Craig Macdonald, Iadh Ounis, Norman Gray, Graham Mcdonald, James Girdwood, John Thompson (FCO)

20th June 2014

Page 2: The National Archive Sensitivity Review by Tim Gollins


Public Records Transfer & Sensitivity Review

[Diagram. Stage labels: Selection & Appraisal; Sensitivity Review; Transfer; Permanent Preservation; Presentation on-line. Organisation labels: Government Department; The National Archives.]

Page 3: The National Archive Sensitivity Review by Tim Gollins


Full Project Concept & Outputs

● Concept
  – Assumption: No new resources
  – Risk Management not Risk Avoidance
  – Need to understand Risks and Costs explicitly
● Outputs
  – New method for review
    ● Manage risk & manage resource
  – New decision support tool (tech. demonstrator)
    ● Prioritise review & measure risks
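As an illustration of what "prioritise review & measure risks" could mean under a fixed resource budget, here is a minimal sketch. It is not the project's tool: the `Record` type, the `p_sensitive` scores, and the `prioritise` function are all hypothetical, and the residual-risk measure (expected number of sensitive records left unreviewed) is an assumption chosen for simplicity.

```python
from dataclasses import dataclass


@dataclass
class Record:
    record_id: str      # hypothetical identifier
    p_sensitive: float  # hypothetical classifier estimate in [0, 1]


def prioritise(records, review_budget):
    """Rank records by estimated sensitivity and review only the top ones.

    Returns the records sent to manual review, the records left unreviewed,
    and the residual risk, taken here (as an assumption) to be the expected
    number of sensitive records among those left unreviewed.
    """
    ranked = sorted(records, key=lambda r: r.p_sensitive, reverse=True)
    to_review = ranked[:review_budget]
    unreviewed = ranked[review_budget:]
    residual_risk = sum(r.p_sensitive for r in unreviewed)
    return to_review, unreviewed, residual_risk


# Illustrative scores only; not data from the project.
records = [
    Record("A", 0.90),
    Record("B", 0.10),
    Record("C", 0.60),
    Record("D", 0.05),
]
to_review, unreviewed, residual_risk = prioritise(records, review_budget=2)
print([r.record_id for r in to_review])
print(round(residual_risk, 2))
```

With no new resources, the budget stays fixed; what changes is which records it is spent on, and the residual risk becomes an explicit number rather than an unknown.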

Page 4: The National Archive Sensitivity Review by Tim Gollins


Proof of Concept – Success (1)

● Test Collection
  – Initial Corpus Built – More to do
● Engagement With Reviewers (FCO)
  – Confirmed Assumptions
  – Inspired Tool Development (document Features)
  – Insights into nature of sensitivity and review
  – Workshop In July – do you want to come?

Page 5: The National Archive Sensitivity Review by Tim Gollins


Proof of Concept – Success (2)

● Sensitivity Classifier Tool
  – Uses latest IR techniques (Learning to Rank etc.)
  – Uses a variety of document features
  – Proved that subject matter alone insufficient
  – Work on Twitter reputation management has pointed the way to other features
  – Work with FCO also inspires new features
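To make "a variety of document features" concrete, the sketch below combines plain term counts (subject matter) with one extra document feature, a country count. Everything here is an assumption for illustration: the tiny `COUNTRIES` list, the `extract_features` function, and the choice to count every single-word country mention are hypothetical, and the project's actual feature definitions are not given in the slides.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny country list; a real gazetteer would be
# far larger and would handle multi-word names.
COUNTRIES = {"france", "germany", "japan", "brazil"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def extract_features(text):
    """Return term-count features plus a country-mention count."""
    tokens = tokenize(text)
    features = Counter(f"term={t}" for t in tokens)
    # Extra document feature: number of country mentions (assumed definition).
    features["country_count"] = sum(1 for t in tokens if t in COUNTRIES)
    return dict(features)


doc = "Telegram from Paris: talks between France and Germany; France objects."
features = extract_features(doc)
print(features["country_count"])  # 3
print(features["term=france"])    # 2
```

A feature dictionary like this could be fed to any standard classifier or ranker; the point is only that non-textual signals sit alongside the terms rather than replacing them.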

Page 6: The National Archive Sensitivity Review by Tim Gollins

Classification Results

FOIA Section 27                            FOIA Section 40
Features              Balanced Accuracy    Features              Balanced Accuracy
Text Classification   0.633                Text Classification   0.634
+ Country Count       0.645                + Country Count       0.641
+ Source              0.637                + DOB                 0.639
+ All Verbs Count     0.635                + POB                 0.638
+ Media               0.635                + Media               0.637
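For readers unfamiliar with the metric, the sketch below computes balanced accuracy, assuming the slides use the standard definition: the mean of the true positive rate and the true negative rate. The confusion-matrix counts are made up for illustration and are not the project's results.

```python
def balanced_accuracy(tp, fn, tn, fp):
    """Mean of recall on the sensitive class and recall on the non-sensitive class."""
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return (sensitivity + specificity) / 2


def accuracy(tp, fn, tn, fp):
    """Plain accuracy: fraction of all documents classified correctly."""
    return (tp + tn) / (tp + fn + tn + fp)


# Illustrative counts: 10 sensitive and 90 non-sensitive documents.
classifier = dict(tp=6, fn=4, tn=63, fp=27)
# A trivial baseline that marks every document as not sensitive.
never_sensitive = dict(tp=0, fn=10, tn=90, fp=0)

print(round(balanced_accuracy(**classifier), 3), accuracy(**classifier))
print(balanced_accuracy(**never_sensitive), accuracy(**never_sensitive))
```

On an imbalanced collection the trivial baseline scores high on plain accuracy but only 0.5 on balanced accuracy, so 0.5 is the chance level against which the figures in the table should be read.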

Page 7: The National Archive Sensitivity Review by Tim Gollins

Questions?

Come to the workshop – 17th-18th July

Email - [email protected]

Invitation Only