Archiving Snapshots or Transactions: Extracting the right data at the right time from temporal databases

Niklaus Bütikofer, Swiss Federal Archives, 10 April 2003


Page 1: Archiving Snapshots or Transactions

Extracting the right data at the right time from temporal databases

Niklaus Bütikofer, Swiss Federal Archives, 10 April 2003

Page 2: Types of Databases

• Database with additions only
• Database with amendments („dynamic databases“)
  o Snapshot database
  o Temporal database
    - Valid-time database
    - Transaction-time database
    - Bitemporal database
  o Mixed snapshot and temporal database

Page 3: Snapshot and valid-time Databases

Snapshot Database

PERS_ID  ADDRESS
215      Sonnegg 11
309      Thunstrasse 1

Valid-time Database

PERS_ID  ADDRESS         FROM_DATE   TO_DATE
215      Rathausgasse 6  1990-08-01  1995-09-30
215      Sonnegg 11      1995-10-01  „now“
309      Kramgasse 9     1990-08-01  1994-12-31
309      Belpstrasse 23  1995-01-01  1998-04-30
309      Thunstrasse 1   1998-05-01  „now“
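The relation between the two tables can be sketched in code: a snapshot is the time-slice of the valid-time table at one particular date. The following is a minimal sketch, not part of the original slides. The table name `address_history`, the helper `time_slice`, and the use of NULL to stand in for „now“ are assumptions made for illustration.

```python
import sqlite3

# Rows from the valid-time table on the slide; None (NULL) stands in for "now".
ROWS = [
    (215, "Rathausgasse 6", "1990-08-01", "1995-09-30"),
    (215, "Sonnegg 11", "1995-10-01", None),
    (309, "Kramgasse 9", "1990-08-01", "1994-12-31"),
    (309, "Belpstrasse 23", "1995-01-01", "1998-04-30"),
    (309, "Thunstrasse 1", "1998-05-01", None),
]


def time_slice(conn, day):
    """Return the (PERS_ID, ADDRESS) pairs that were valid on the ISO date `day`."""
    cur = conn.execute(
        "SELECT pers_id, address FROM address_history "
        "WHERE from_date <= ? AND (to_date IS NULL OR to_date >= ?) "
        "ORDER BY pers_id",
        (day, day),
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE address_history ("
    "pers_id INTEGER, address TEXT, from_date TEXT, to_date TEXT)"
)
conn.executemany("INSERT INTO address_history VALUES (?, ?, ?, ?)", ROWS)

print(time_slice(conn, "2003-04-10"))  # the current state, i.e. the snapshot table
print(time_slice(conn, "1996-01-01"))  # a past state, recoverable only from valid time
```

ISO dates stored as text compare correctly as strings, which is why plain `<=` and `>=` are enough here.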

Page 4: Archiving snapshot databases

• Snapshots do not tell us when changes occurred
• Certain facts can completely disappear

[Figure: time axis with Snapshot 1 taken at t1 and Snapshot 2 taken at t2; ADDRESS values per PERS_ID over time (215: Rathausgasse 6, Sonnegg 11; 309: Kramgasse 9, Belpstrasse 23, Thunstrasse 1)]
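How a fact can disappear between two snapshots can be illustrated with the example addresses. This is a sketch under stated assumptions: the slide does not give dates for t1 and t2, so the values of `T1` and `T2` below are hypothetical, as is the helper name `snapshot`.

```python
# Valid-time rows from the example; None stands in for "now".
HISTORY = [
    (215, "Rathausgasse 6", "1990-08-01", "1995-09-30"),
    (215, "Sonnegg 11", "1995-10-01", None),
    (309, "Kramgasse 9", "1990-08-01", "1994-12-31"),
    (309, "Belpstrasse 23", "1995-01-01", "1998-04-30"),
    (309, "Thunstrasse 1", "1998-05-01", None),
]


def snapshot(history, day):
    """State of a snapshot database on `day`: one current address per person."""
    return {
        pers_id: address
        for pers_id, address, start, end in history
        if start <= day and (end is None or day <= end)
    }


T1, T2 = "1994-06-01", "1999-01-01"  # hypothetical snapshot dates
snap1 = snapshot(HISTORY, T1)
snap2 = snapshot(HISTORY, T2)

# Addresses that existed at some point but appear in neither archived snapshot.
archived = set(snap1.values()) | set(snap2.values())
lost = sorted({address for _, address, _, _ in HISTORY} - archived)
print(lost)
```

With these dates, an address that was valid only between the two snapshots is in neither of them, and the snapshots alone also do not say when the other changes took place.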

Page 5: Archiving logfiles?

• DBMS record all transactions in system logs (journals)
• Purpose: recovery and auditing

[Figure: time axis with Snapshot 1 at t1 and Snapshot 2 at t2; Logfile 1, Logfile 2, L 3 along the axis; arrows "Roll forward" and "Roll backward"]
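The idea of rolling forward and backward between snapshots can be sketched as follows. This is a simplified illustration, not how any particular DBMS implements its journal: the log format (timestamp, PERS_ID, old ADDRESS, new ADDRESS) and the function names are assumptions.

```python
def roll_forward(snapshot, log):
    """Replay logged updates, oldest first, on a copy of the snapshot."""
    state = dict(snapshot)
    for _timestamp, pers_id, _old, new in log:
        state[pers_id] = new
    return state


def roll_backward(snapshot, log):
    """Undo logged updates, newest first, on a copy of the snapshot."""
    state = dict(snapshot)
    for _timestamp, pers_id, old, _new in reversed(log):
        state[pers_id] = old
    return state


snapshot_1 = {215: "Rathausgasse 6", 309: "Kramgasse 9"}

# Hypothetical journal entries: (timestamp, PERS_ID, old ADDRESS, new ADDRESS).
log = [
    ("1995-01-01", 309, "Kramgasse 9", "Belpstrasse 23"),
    ("1995-10-01", 215, "Rathausgasse 6", "Sonnegg 11"),
    ("1998-05-01", 309, "Belpstrasse 23", "Thunstrasse 1"),
]

snapshot_2 = roll_forward(snapshot_1, log)
print(snapshot_2)
print(roll_backward(snapshot_2, log) == snapshot_1)
```

Unlike two snapshots on their own, snapshot plus log preserves both the intermediate address and the time of each change.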

Page 6: Using logfiles for archives

• Even if standard logfiles are not binary, but SQL statements written in ASCII, they depend on how SQL is implemented in a given system.
• Standard logfiles can only be used for automatic roll back or roll forward in their original system.
• Standard logfiles in archives are only usable for „manual“ verification of single facts.
• How well would standard SQL logfiles work?

Page 7: Archiving temporal databases

• Temporal databases contain the complete history of valid states and/or transactions
• For purposes of performance and/or compliance (e.g. with privacy regulations), data must be periodically deleted or archived.
• Solutions:
  1. Archive all rows (tuples) that are non-current at a given time (all rows/tuples with TO_DATE before YYYY-MM-DD) and delete them from the database afterwards.
  2. Archive snapshots combined with a delete procedure.
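Solution 1 can be sketched with an in-memory SQLite database. The table names `address_history` and `archive`, the function name, and the cutoff date are assumptions for illustration; NULL stands in for „now“, and the row for PERS_ID 547 is taken from the later slides.

```python
import sqlite3

ROWS = [
    (215, "Rathausgasse 6", "1990-08-01", "1995-09-30"),
    (215, "Sonnegg 11", "1995-10-01", None),
    (309, "Kramgasse 9", "1990-08-01", "1994-12-31"),
    (309, "Belpstrasse 23", "1995-01-01", "1998-04-30"),
    (309, "Thunstrasse 1", "1998-05-01", None),
    (547, "Archivstrasse 24", "1990-08-01", None),
]


def archive_non_current(conn, cutoff):
    """Copy rows with TO_DATE before `cutoff` to the archive, then delete them."""
    with conn:  # one transaction: copy and delete are committed together
        conn.execute(
            "INSERT INTO archive SELECT * FROM address_history "
            "WHERE to_date IS NOT NULL AND to_date < ?",
            (cutoff,),
        )
        cur = conn.execute(
            "DELETE FROM address_history "
            "WHERE to_date IS NOT NULL AND to_date < ?",
            (cutoff,),
        )
    return cur.rowcount


conn = sqlite3.connect(":memory:")
for table in ("address_history", "archive"):
    conn.execute(
        f"CREATE TABLE {table} ("
        "pers_id INTEGER, address TEXT, from_date TEXT, to_date TEXT)"
    )
conn.executemany("INSERT INTO address_history VALUES (?, ?, ?, ?)", ROWS)

moved = archive_non_current(conn, "1996-01-01")  # hypothetical cutoff date
archived = conn.execute(
    "SELECT pers_id, address FROM archive ORDER BY pers_id"
).fetchall()
print(moved, archived)
```

The archived package holds only closed rows. A person such as 547, whose single row is still current, does not appear in it at all, so a complete time-slice cannot be built from the package alone.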

Page 8: Archiving temporal databases (1)

Archive all rows (tuples) with TO_DATE before YYYY-MM-DD

[Figure: time axis with t1 marked; ADDRESS values per PERS_ID over time (215: Rathausgasse 6, Sonnegg 11; 309: Kramgasse 9, Belpstrasse 23, Thunstrasse 1; 547: Archivstrasse 24)]

• In the archived package no complete time-slice is possible

Page 9: Archiving temporal databases (2)

[Figure: time axis with Snapshot 1 taken at t1 and Snapshot 2 taken at t2; ADDRESS values per PERS_ID over time (215: Rathausgasse 6, Sonnegg 11; 309: Kramgasse 9, Belpstrasse 23, Thunstrasse 1; 547: Archivstrasse 24)]

Page 10: Archiving snapshots from temporal databases (II)

Valid-time Database

PERS_ID  ADDRESS           FROM_DATE   TO_DATE
215      Rathausgasse 6    1990-08-01  1995-09-30
215      Sonnegg 11        1995-10-01  „now“
309      Kramgasse 9       1990-08-01  1994-12-31
309      Belpstrasse 23    1995-01-01  1998-04-30
309      Thunstrasse 1     1998-05-01  „now“
547      Archivstrasse 24  1990-08-01  „now“

Row annotations on the slide, in order: S1 / del, S1, S1 / del, S1, S1

S1 = Snapshot 1
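One possible reading of the annotations, offered as an assumption rather than as the author's stated procedure: Snapshot 1 copies every row of the temporal table, and rows that are already closed at snapshot time are deleted afterwards ("S1 / del"), while the others stay in the database ("S1"). The sketch below follows that reading; the snapshot date `T1` and the function names are hypothetical.

```python
# Valid-time rows from the slide; None stands in for "now".
TABLE = [
    (215, "Rathausgasse 6", "1990-08-01", "1995-09-30"),
    (215, "Sonnegg 11", "1995-10-01", None),
    (309, "Kramgasse 9", "1990-08-01", "1994-12-31"),
    (309, "Belpstrasse 23", "1995-01-01", "1998-04-30"),
    (309, "Thunstrasse 1", "1998-05-01", None),
    (547, "Archivstrasse 24", "1990-08-01", None),
]


def snapshot_and_delete(table, t1):
    """Return (snapshot, remaining): a full copy, then the rows kept after deletion."""
    snapshot = list(table)  # assumed: the snapshot contains every row, closed or not
    remaining = [row for row in table if row[3] is None or row[3] >= t1]
    return snapshot, remaining


def time_slice(rows, day):
    """(PERS_ID, ADDRESS) pairs valid on `day` according to `rows`."""
    return sorted(
        (pers_id, address)
        for pers_id, address, start, end in rows
        if start <= day and (end is None or day <= end)
    )


T1 = "1996-01-01"  # hypothetical snapshot date
snapshot_1, remaining = snapshot_and_delete(TABLE, T1)
print(len(snapshot_1), len(remaining))
print(time_slice(snapshot_1, "1992-01-01"))
```

Under this reading the archived snapshot supports a complete time-slice for any date up to the snapshot date, at the price of archiving still-current rows again in the next snapshot.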

Page 11: When should snapshots be extracted?

• Snapshot databases:
  o The frequency of snapshots depends on the frequency of data modifications and deletions
  o When legal or business requirements necessitate major deletions
• Temporal databases:
  o When an appropriate archival „package“ has accumulated (size, time period covered)
  o Before major schema changes
  o When legal or business requirements necessitate major deletions

Page 12: Mixed snapshot and temporal database

• Often archiving and deletion time must be compliant with legal and business requirements

[Figure: Master-Data (e.g. PERSON) held in a snapshot database; business transactions held in a temporal or pseudo-temporal database]

Business transaction 1  1995-03-05  1996-07-23
Business transaction 2  1996-01-10  1998-12-03
Business transaction 3  1997-05-21  „now“

Page 13: Archiving mixed snapshot and temporal databases

[Figure: Master-Data (snapshot database) with the labels "Snapshot archived 2001-01-01" and "Snapshot archived 2003-01-01"; business transactions (temporal or pseudo-temporal database) with the labels "Completely archived 2001-01-01" and "Completely archived 2003-01-01"]

Business transaction 1  1995-03-05  1996-07-23
Business transaction 2  1996-01-10  1998-12-03

Page 14: Archiving mixed snapshot and temporal databases (1)

[Figure: time axis with t1 marked; Action 1, Action 2 and Action 3, each with a marker A]

A = archiving/deletion time

• In the archived package no complete time-slice is possible
• Schema changes may prevent backward assembly of archived elements

Page 15: Archiving mixed snapshot and temporal databases (2)

[Figure: time axis with Snapshot 1 at t1; Business transaction 1, Business transaction 2 and Business transaction 3, each with a marker D]

D = required deletion time

• Snapshots allow synchronous research only for the point in time when the snapshot was taken.
• Synchronous and diachronous research require snapshots and current archiving of action data „packages“.

Page 16: Conclusions (1)

• Temporal databases are best suited for archiving. Archived snapshots allow synchronous and diachronous research, but queries may become complex.
• For other databases, neither snapshots nor current archiving of (database or business) transactions can fully satisfy all use requirements.

Page 17: Conclusions (2)

• Archivists or preservers need to involve themselves in the design process of databases in order to get the archival function appropriately implemented. Good implementations may be:
  o Build fully temporal databases.
  o Build in triggers that write all modifications of the database to an archival store that mirrors the current database as a kind of temporal database.
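The trigger approach can be sketched with SQLite triggers driven from Python. This is an illustration under assumptions, not a design from the slides: the table names `person` and `person_archive`, the trigger names, and the column names `from_ts` and `to_ts` are hypothetical. Note that triggers of this kind record when the database was modified (transaction time), not when a fact became true in the real world (valid time).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (pers_id INTEGER PRIMARY KEY, address TEXT);
CREATE TABLE person_archive (
    pers_id INTEGER, address TEXT, from_ts TEXT, to_ts TEXT);

-- A new row opens a version in the archival store.
CREATE TRIGGER person_ins AFTER INSERT ON person BEGIN
    INSERT INTO person_archive
    VALUES (NEW.pers_id, NEW.address, datetime('now'), NULL);
END;

-- An update closes the open version and opens a new one.
CREATE TRIGGER person_upd AFTER UPDATE ON person BEGIN
    UPDATE person_archive SET to_ts = datetime('now')
     WHERE pers_id = OLD.pers_id AND to_ts IS NULL;
    INSERT INTO person_archive
    VALUES (NEW.pers_id, NEW.address, datetime('now'), NULL);
END;

-- A delete only closes the open version; the history itself is kept.
CREATE TRIGGER person_del AFTER DELETE ON person BEGIN
    UPDATE person_archive SET to_ts = datetime('now')
     WHERE pers_id = OLD.pers_id AND to_ts IS NULL;
END;
""")

conn.execute("INSERT INTO person VALUES (309, 'Kramgasse 9')")
conn.execute("UPDATE person SET address = 'Belpstrasse 23' WHERE pers_id = 309")
conn.execute("DELETE FROM person WHERE pers_id = 309")

# Each version with a flag telling whether it is still open (1) or closed (0).
history = conn.execute(
    "SELECT address, to_ts IS NULL FROM person_archive ORDER BY rowid"
).fetchall()
live_rows = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
print(history, live_rows)
```

The operational table stays a plain snapshot database, while the archival store accumulates every version, including those of rows that have since been deleted.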

Page 18: Open questions

• How to deal with schema changes?
• How to deal with partial snapshots? Or, how to deal with referenced data which is not in the snapshot or in the archival „package“?