ACE: A Software Tool to Ensure the Integrity of Digital Archives

Page 1: ACE: A Software Tool to Ensure the Integrity of Digital Archives

ACE: A Software Tool to Ensure the Integrity of Digital Archives

Principal Investigator: Joseph JaJa
Graduate Student: Sangchul Song
Lead Programmer: Michael Smorul

University of Maryland, College Park

Page 2: ACE: A Software Tool to Ensure the Integrity of Digital Archives

September 2009 GeoMapp 2

Using Hashes to Monitor Files

• Strong hashes can assert a file has not changed

• How do you manage millions of hashes?

• How do you prove the hash value hasn’t changed?

• How do you prove a hash value was issued at a given time?

Page 3: ACE: A Software Tool to Ensure the Integrity of Digital Archives

Audit Control Environment (ACE)

• Solves the problem of storing and verifying hashes.

• Secures hashes by issuing a token for each file/hash to be monitored.

• Tokens contain a cryptographic proof that allows for third-party auditing.

• One number stored externally is used to audit tokens and hashes.

Page 4: ACE: A Software Tool to Ensure the Integrity of Digital Archives

Hash Authentication

[Figure: Merkle tree diagram. Labels: Hash 1 through Hash 6, Intermediate Hash Value (IHV), Previous Round Hash, CSI (one hash value), Challenge Hash.]

• Gather hashes during round

• Create Merkle tree for supplied hashes

• Link to previous round

• Generate proof for hash
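The four steps on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not ACE's actual implementation: the tree is binary, an unpaired node is hashed with itself, and the round value is formed as SHA-256(previous round hash + tree root). All function names are hypothetical.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(leaves):
    """Return every tree level, from the leaves up to the single root."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        current = levels[-1]
        parents = []
        for i in range(0, len(current), 2):
            left = current[i]
            # Assumption: an unpaired node is paired with itself.
            right = current[i + 1] if i + 1 < len(current) else left
            parents.append(sha256(left + right))
        levels.append(parents)
    return levels


def make_proof(levels, leaf_index):
    """Collect (position, sibling hash) pairs on the path from leaf to root."""
    proof = []
    index = leaf_index
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index  # the unpaired node was hashed with itself
        proof.append((index % 2, level[sibling]))
        index //= 2
    return proof


def root_from_proof(leaf, proof):
    """Recompute the tree root from one leaf hash and its proof."""
    current = leaf
    for position, sibling in proof:
        if position == 0:
            current = sha256(current + sibling)
        else:
            current = sha256(sibling + current)
    return current


def close_round(leaves, previous_round_hash):
    """Build the tree and link its root to the previous round (assumed scheme)."""
    levels = build_levels(leaves)
    csi = sha256(previous_round_hash + levels[-1][0])
    return levels, csi
```

For example, with six file hashes gathered in one round, `make_proof(levels, 2)` yields three sibling hashes, and `root_from_proof` applied to the third file hash reproduces `levels[-1][0]`. An auditor therefore needs only the file's hash, its proof, and the round value, not the other five hashes.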

Page 5: ACE: A Software Tool to Ensure the Integrity of Digital Archives

Token Sample

<token>
  <token-class>SHA-256-0</token-class>
  <digest-service>SHA-256</digest-service>
  <name>/SRB3_2_1f.tar</name>
  <round-id>1223</round-id>
  <time-stamp>2008-07-22T11:03:45.059-0400</time-stamp>
  <proof>
    <element index="0">
      <hash>2e869e2ce41ede3ceb3af50f8aa2705067b3e67055b5b3d2787e2c294a95a869</hash>
    </element>
    <element index="0">
      <hash>6a925501991d7b4ff660d499416fd45a20dde161eb68e59fedc0f58208ad86cf</hash>
    </element>
    <element index="0">
      <hash>134432a6a6527162d24e99435e817511eeb89ddc03afbc6a30f23e404847cc06</hash>
    </element>
    <element index="1">
      <hash>1aeaf2d76976cf9759b0d63bc7acdf9c6df68875bfc9bcc0e22c19401aab0133</hash>
    </element>
  </proof>
</token>
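As a rough illustration of how such a token might be consumed, the sketch below parses the sample with Python's standard xml.etree.ElementTree module and folds the proof into a single value. The slide does not say how the index attribute is used, so the folding rule here is an assumption: index 0 is taken to mean the running hash goes before the listed sibling hash, and index 1 after it. The function names are hypothetical.

```python
import hashlib
import xml.etree.ElementTree as ET

TOKEN_XML = """
<token>
  <token-class>SHA-256-0</token-class>
  <digest-service>SHA-256</digest-service>
  <name>/SRB3_2_1f.tar</name>
  <round-id>1223</round-id>
  <time-stamp>2008-07-22T11:03:45.059-0400</time-stamp>
  <proof>
    <element index="0">
      <hash>2e869e2ce41ede3ceb3af50f8aa2705067b3e67055b5b3d2787e2c294a95a869</hash>
    </element>
    <element index="0">
      <hash>6a925501991d7b4ff660d499416fd45a20dde161eb68e59fedc0f58208ad86cf</hash>
    </element>
    <element index="0">
      <hash>134432a6a6527162d24e99435e817511eeb89ddc03afbc6a30f23e404847cc06</hash>
    </element>
    <element index="1">
      <hash>1aeaf2d76976cf9759b0d63bc7acdf9c6df68875bfc9bcc0e22c19401aab0133</hash>
    </element>
  </proof>
</token>
"""


def parse_token(xml_text):
    """Extract the fields needed for an audit from a token document."""
    root = ET.fromstring(xml_text)
    proof = [
        (int(element.get("index")), bytes.fromhex(element.findtext("hash")))
        for element in root.find("proof")
    ]
    return {
        "name": root.findtext("name"),
        "round_id": int(root.findtext("round-id")),
        "digest_service": root.findtext("digest-service"),
        "proof": proof,
    }


def fold_proof(file_digest, proof):
    """Combine a file digest with each proof element in turn.

    Assumption (not stated on the slide): index 0 places the running hash
    before the sibling hash, index 1 places it after.
    """
    current = file_digest
    for index, sibling in proof:
        pair = current + sibling if index == 0 else sibling + current
        current = hashlib.sha256(pair).digest()
    return current
```

Under that assumption, an auditor would compute the SHA-256 digest of /SRB3_2_1f.tar, pass it to `fold_proof` with the parsed proof, and compare the result with the summary value recorded for round 1223.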

Page 6: ACE: A Software Tool to Ensure the Integrity of Digital Archives

How to scale?

• Two layers of Merkle trees

1. Short rounds (seconds) that generate Cryptographic Summary Information (CSI).

2. Each successive round includes the previous CSI.

3. A second layer of daily rounds, composed of all CSIs from the previous day.

• The daily tree root, called a witness, can validate all CSIs for a day.
  – Only 365 per year are generated. Very manageable!

• Two components were developed: an Integrity Management Service (IMS) and an Audit Manager (AM).
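The two-layer scheme can be sketched as follows. This is a simplified model under assumptions, not the IMS code: each round's CSI is taken to be SHA-256(previous CSI + Merkle root of the round's hashes), the first round is chained to 32 zero bytes, and the witness is the Merkle root over the day's CSIs. The names `run_rounds` and `daily_witness` are hypothetical.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(hashes):
    """Root of a binary Merkle tree; an unpaired node is hashed with itself."""
    if not hashes:
        raise ValueError("at least one hash is required")
    level = list(hashes)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def run_rounds(rounds, initial_csi=bytes(32)):
    """First layer: one CSI per short round, each chained to the previous CSI."""
    csis = []
    previous = initial_csi
    for round_hashes in rounds:
        previous = sha256(previous + merkle_root(round_hashes))
        csis.append(previous)
    return csis


def daily_witness(csis):
    """Second layer: a single value covering every CSI issued in one day."""
    return merkle_root(csis)
```

Because each CSI includes its predecessor, altering any file hash in an early round changes that round's CSI, every later CSI, and the day's witness. That is why storing one witness per day offsite is enough to audit everything beneath it.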

Page 7: ACE: A Software Tool to Ensure the Integrity of Digital Archives

Components

• Integrity Management Service (remote)
  – Runs all hash aggregation, round generation, and witness publication
  – Stores CSI values
  – Generates proofs from CSI to witness
  – ims.umiacs.umd.edu

• Audit Manager (local)
  – Monitors local files
  – Determines audit policy
  – One or more per archive
  – Locally stores hashes and tokens

Page 8: ACE: A Software Tool to Ensure the Integrity of Digital Archives

ACE – System Architecture

[Figure: system architecture diagram. It shows two archiving nodes (each with hdd, cd-rom, and tape drive storage), each paired with an ACE Audit Manager, a token registry, and an audit policy. The Audit Managers exchange requests and replies with a third-party Integrity Management System, which is labeled with Cryptographic Summary Information and witnesses.]

Page 9: ACE: A Software Tool to Ensure the Integrity of Digital Archives

ACE Audit

• Audit Local Files: The Audit Manager periodically scans all files and compares stored digests with computed digests. Assumes the hashes in the database are valid.

• Audit Local Manager: The manager computes the round summary for each digest using that digest and its token, then compares it with the value stored on the IMS. Assumes the IMS returns valid summary information; does not trust the hashes in the database.

• IMS Audit: Round summaries are used to compute witness values, which are compared with offsite witness values. Does not trust the IMS; forces the IMS to prove that its CSIs link to a witness.
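The first audit level, comparing stored digests with freshly computed ones, can be sketched as below. This is a minimal illustration, not the Audit Manager's code: the dictionary stands in for the Audit Manager's database of registered hashes, SHA-256 is assumed as the digest, and the function names and status labels are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit_files(stored_digests):
    """Compare each registered digest with the file's current digest.

    stored_digests maps a file path to the hex digest recorded at
    registration (a stand-in for the database of stored hashes).
    """
    report = {}
    for path, expected in stored_digests.items():
        if not Path(path).is_file():
            report[path] = "missing"
        elif file_digest(path) == expected:
            report[path] = "intact"
        else:
            report[path] = "corrupt"
    return report
```

This level trusts the stored digests themselves. The second and third levels exist to remove that trust, by checking each digest against its token and the IMS, and the IMS against offsite witnesses.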

Page 10: ACE: A Software Tool to Ensure the Integrity of Digital Archives

Audit Manager

• Downloadable, one or more per archive

• Monitors local files

• Simple requirements
  – Java 1.6+
  – Tomcat
  – MySQL

• Managed by archivist/librarian after install

• Monitors multiple collections on different architectures

• Hides all the complexity you just saw!

Page 11: ACE: A Software Tool to Ensure the Integrity of Digital Archives

Performance

• Audit Manager (1.1beta3)

• 1.25 million false hashes (no bytes read)
  – Registration: 3h 6m (112 files/s)
  – Audit: 1h 15m (277 files/s)

• 1.25 million false data files (1.25 TB of data)
  – Registration: 5h 7m (67.8 files/s, 67.8 MB/s)
  – Audit: 4h 30m (77.2 files/s, 77.2 MB/s)

• In practice, the bottleneck tends to occur at the archival resource, not the AM.

• Chronopolis
  – 5.5 million files, over 20 TB in size

Page 12: ACE: A Software Tool to Ensure the Integrity of Digital Archives

Future Directions

• Statistical sampling
  – Low-rate auditing, probability of error detection

• Cloud auditing
  – Data transfer costs $$$
  – Is the cloud lying?

• Additional storage support
  – Web, ftp, smb
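To make the "probability of error detection" idea under statistical sampling concrete, one textbook model (an assumption here, not something the slides specify) is to audit a uniform random sample of files without replacement. The chance that the sample contains at least one corrupted file is then 1 - C(N - k, n) / C(N, n), where N is the collection size, k the number of corrupted files, and n the sample size. The function name is hypothetical.

```python
from math import comb


def detection_probability(total_files, corrupted_files, sample_size):
    """Chance that a random sample includes at least one corrupted file.

    Assumes a uniform random sample drawn without replacement
    (hypergeometric model); this is an illustrative choice.
    """
    if not 0 <= corrupted_files <= total_files:
        raise ValueError("corrupted_files must be between 0 and total_files")
    if not 0 <= sample_size <= total_files:
        raise ValueError("sample_size must be between 0 and total_files")
    clean_files = total_files - corrupted_files
    # Probability that every sampled file comes from the clean set.
    all_clean = comb(clean_files, sample_size) / comb(total_files, sample_size)
    return 1 - all_clean
```

For instance, `detection_probability(4, 1, 2)` is 0.5: with one bad file among four, auditing two of them catches it half the time. The same function shows the trade-off between audit rate and detection confidence for larger collections.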

Page 13: ACE: A Software Tool to Ensure the Integrity of Digital Archives

ACE Summary

• Third-party auditable

• Cryptographically rigorous yet cost-effective

• Scalable, high performance

• Current efforts
  – Provide public IMS
  – Create simple audit manager for local use

• http://adapt.umiacs.umd.edu/ace