Not All Mementos Are Created Equal: Measuring The Impact Of Missing Mementos

Not All Mementos are Created Equal: Measuring the Impact of Missing Resources. Justin F. Brunelle, Mat Kelly, Hany SalahEldeen, Michele C. Weigle, Michael L. Nelson. Old Dominion University. {jbrunelle, mkelly, hany, mweigle, mln}@cs.odu.edu


Slides presented by Justin F. Brunelle at Digital Preservation 2014 in London.

Transcript of Not All Mementos Are Created Equal: Measuring The Impact Of Missing Mementos

Page 1: Not All Mementos Are Created Equal: Measuring The Impact Of Missing Mementos

Not All Mementos are Created Equal: Measuring the Impact of Missing Resources

Justin F. Brunelle, Mat Kelly, Hany SalahEldeen, Michele C. Weigle, Michael L. Nelson

Old Dominion University

{jbrunelle, mkelly, hany, mweigle, mln}@cs.odu.edu

1

Page 2: Not All Mementos Are Created Equal: Measuring The Impact Of Missing Mementos

Goal: Automatically measure the quality of the archives

2

20% missing

Page 3: Not All Mementos Are Created Equal: Measuring The Impact Of Missing Mementos

Goal: Automatically measure the quality of the archives

3

14% missing

Page 4: Not All Mementos Are Created Equal: Measuring The Impact Of Missing Mementos

Goal: Automatically measure the quality of the archives

4

28% missing

Page 5: Not All Mementos Are Created Equal: Measuring The Impact Of Missing Mementos

Goal: Automatically measure the quality of the archives

5

7% missing

Page 6: Not All Mementos Are Created Equal: Measuring The Impact Of Missing Mementos

“Live” XKCD

• Missing 17% of embedded resources

• Looks complete

6

Page 7: Not All Mementos Are Created Equal: Measuring The Impact Of Missing Mementos

“Live” XKCD

• Take three resources:

• Logo

• Main Comic

• Navigation Strip

• Relative importance?

• All present in “Live” XKCD

7

Page 8: Not All Mementos Are Created Equal: Measuring The Impact Of Missing Mementos

Damaging XKCD

• Created a local memento

• Removed the logo and navigation strip

• Now missing 29% of embedded resources

• Human assessment: looks OK

8

Page 9: Not All Mementos Are Created Equal: Measuring The Impact Of Missing Mementos

Damaging XKCD

• From our local memento

• Removed the Main Comic

• Now missing 24% of embedded resources

• Human assessment: Not a usable memento

9

Page 10: Not All Mementos Are Created Equal: Measuring The Impact Of Missing Mementos

Damaging XKCD

• From our local memento

• Removed the Main Comic

• Now missing 24% of embedded resources

• Human assessment: Not a usable memento

• Percent of missing embedded resources is not a suitable metric for memento quality

10

Page 11: Not All Mementos Are Created Equal: Measuring The Impact Of Missing Mementos

Image Importance

• Size (as percentage of all pixels)

11

Page 12: Not All Mementos Are Created Equal: Measuring The Impact Of Missing Mementos

Image Importance

• Size

• Position (in viewport?)

12

Page 13: Not All Mementos Are Created Equal: Measuring The Impact Of Missing Mementos

Image Importance

• Size

• Position

• Centrality (in the vertical or horizontal center?)
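The three factors above (size, position, centrality) can be combined into a single score. The following is only a minimal sketch: the `Box` type, the `image_importance` function, and the three weights are hypothetical and chosen for illustration, since the slides do not state how the factors are weighted.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Rendered bounding box of an image, in page pixels (hypothetical type)."""
    x: float
    y: float
    width: float
    height: float


def image_importance(box, page_w, page_h, viewport_h,
                     w_size=1.0, w_viewport=0.5, w_center=0.5):
    """Toy importance score; the three weights are illustrative assumptions."""
    # Size: fraction of all page pixels covered by the image.
    size = (box.width * box.height) / (page_w * page_h)
    # Position: does the image start inside the initial viewport?
    in_viewport = 1.0 if box.y < viewport_h else 0.0
    # Centrality: does the image span the horizontal or vertical center line?
    spans_h_center = box.x <= page_w / 2 <= box.x + box.width
    spans_v_center = box.y <= page_h / 2 <= box.y + box.height
    central = 1.0 if (spans_h_center or spans_v_center) else 0.0
    return w_size * size + w_viewport * in_viewport + w_center * central


# A large, centered comic should outweigh a small logo in the corner.
comic = Box(x=100, y=200, width=600, height=400)
logo = Box(x=10, y=10, width=100, height=50)
comic_score = image_importance(comic, page_w=800, page_h=1000, viewport_h=600)
logo_score = image_importance(logo, page_w=800, page_h=1000, viewport_h=600)
```

Under these assumed weights the comic scores higher than the logo, which matches the human assessment of the damaged XKCD mementos.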

13

Page 14: Not All Mementos Are Created Equal: Measuring The Impact Of Missing Mementos

Missing CSS

• Damage not limited to images

• When missing CSS, content shifts left

14

Page 15: Not All Mementos Are Created Equal: Measuring The Impact Of Missing Mementos

Missing CSS

• Partitioned snapshot into thirds

• Background color determined

• Pixel-by-pixel comparison

15

Page 16: Not All Mementos Are Created Equal: Measuring The Impact Of Missing Mementos

Missing CSS

• Calculated the amount of content in each vertical third

• If >=80% of the content is in the left third and CSS is missing, the CSS is considered important

• Only performed if stylesheets are missing
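The check described above can be sketched as follows. This is an illustration rather than the authors' implementation: the snapshot is modeled as a plain list of pixel rows, the background is assumed to be the most common pixel value, and the function names are hypothetical.

```python
from collections import Counter


def content_by_third(pixels):
    """Count non-background pixels in each vertical third of a snapshot.

    pixels: list of rows, each a list of hashable color values.
    """
    width = len(pixels[0])
    # Assumption: the most common color in the snapshot is the background.
    background = Counter(p for row in pixels for p in row).most_common(1)[0][0]
    counts = [0, 0, 0]
    for row in pixels:
        for x, p in enumerate(row):
            # Pixel-by-pixel comparison against the background color.
            if p != background:
                counts[3 * x // width] += 1
    return counts


def css_is_important(pixels, stylesheets_missing, threshold=0.8):
    """True if stylesheets are missing and content piles up in the left third."""
    if not stylesheets_missing:
        return False  # the check is only performed when CSS is missing
    counts = content_by_third(pixels)
    total = sum(counts)
    return total > 0 and counts[0] / total >= threshold


# Content shifted to the left third versus content kept in the middle third.
shifted = [[1, 1, 1, 0, 0, 0, 0, 0, 0] for _ in range(4)]
centered = [[0, 0, 0, 1, 1, 1, 0, 0, 0] for _ in range(4)]
```

Here `css_is_important(shifted, True)` is true, while the centered layout, or any memento whose stylesheets are present, is not flagged.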

16

Page 17: Not All Mementos Are Created Equal: Measuring The Impact Of Missing Mementos

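Percent Missing vs. Weighted Damage

• $M_M$ = Percent of embedded resources missing

\[ M_M = \frac{\text{Embedded Resources Missing}}{\text{Total Embedded Resources}} \]

• $D_M$ = Damage rating of missing embedded resources

\[ D_M = \frac{D_{M,\mathrm{Actual}}}{D_{M,\mathrm{Potential}}} \]

\[ D_{M,\mathrm{Potential}} = \frac{\sum_{i=1}^{n_{[I|MM]}} D_{[I|MM]}(i)}{n_{[I|MM]}} + \frac{\sum_{i=1}^{n_{[C]}} D_{[C]}(i)}{n_{[C]}} \]

17

$I$ = Image

$MM$ = MultiMedia

$C$ = CSS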
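To make the difference between the two measures concrete, here is a small sketch that computes both for a list of embedded resources. It rests on an assumption the slide does not spell out: the actual damage is taken to be the same expression as the potential damage, with the sums restricted to the missing resources and the denominators left unchanged. The tuple layout and function names are hypothetical.

```python
def percent_missing(resources):
    """M_M: fraction of embedded resources that are missing."""
    if not resources:
        return 0.0
    return sum(1 for _, _, missing in resources if missing) / len(resources)


def _normalized_sum(weights, n):
    """Sum of importance weights divided by the group size n."""
    return sum(weights) / n if n else 0.0


def damage_rating(resources):
    """D_M = actual damage / potential damage.

    resources: list of (kind, importance, is_missing) tuples, where kind is
    "image", "multimedia", or "css" (hypothetical layout).
    """
    media = [r for r in resources if r[0] in ("image", "multimedia")]
    css = [r for r in resources if r[0] == "css"]
    potential = (_normalized_sum([w for _, w, _ in media], len(media))
                 + _normalized_sum([w for _, w, _ in css], len(css)))
    # Assumption: same expression, summed over missing resources only.
    actual = (_normalized_sum([w for _, w, m in media if m], len(media))
              + _normalized_sum([w for _, w, m in css if m], len(css)))
    return actual / potential if potential else 0.0


# Same percent missing, very different damage (importance values are made up).
main_image_lost = [("image", 0.6, True), ("image", 0.2, False), ("image", 0.2, False)]
minor_image_lost = [("image", 0.6, False), ("image", 0.2, True), ("image", 0.2, False)]
```

Both example pages are missing one of three resources, yet the page that lost its most important image receives a damage rating three times higher.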

Page 18: Not All Mementos Are Created Equal: Measuring The Impact Of Missing Mementos

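Calculated Damage

• $M_M$ = Percent of embedded resources missing

\[ M_M = \frac{\text{Embedded Resources Missing}}{\text{Total Embedded Resources}} \]

• $D_M$ = Damage rating of missing embedded resources

\[ D_M = \frac{D_{M,\mathrm{Actual}}}{D_{M,\mathrm{Potential}}} \]

\[ D_{M,\mathrm{Potential}} = \frac{\sum_{i=1}^{n_{[I|MM]}} D_{[I|MM]}(i)}{n_{[I|MM]}} + \frac{\sum_{i=1}^{n_{[C]}} D_{[C]}(i)}{n_{[C]}} \]

18

• Logo and navigation strip removed: $M_M = 0.29$, $D_M = 0.36$

• Main comic removed: $M_M = 0.24$, $D_M = 0.41$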

Page 19: Not All Mementos Are Created Equal: Measuring The Impact Of Missing Mementos

What do Web users think?

19

Page 20: Not All Mementos Are Created Equal: Measuring The Impact Of Missing Mementos

Setting up the Turk Test

• Amazon Mechanical Turk workers ("turkers") represent real web users

• Two legs of the experiment:

• Manually damaged memento vs. Live resource

• 10 manually damaged mementos and resources

• Real Memento vs. Real Memento

• 100 URI-Rs, one memento per year

20

Page 21: Not All Mementos Are Created Equal: Measuring The Impact Of Missing Mementos

21

Page 22: Not All Mementos Are Created Equal: Measuring The Impact Of Missing Mementos

22

Page 23: Not All Mementos Are Created Equal: Measuring The Impact Of Missing Mementos

23

Page 24: Not All Mementos Are Created Equal: Measuring The Impact Of Missing Mementos

Quantifying Turker Response

• 5 turkers for each comparison

• Assume $D_A < D_B$ (i.e., A is less damaged)

• Measure turker agreement:

Image A Image B Split

Turker 1 Y

Turker 2 Y

Turker 3 Y

Turker 4 Y

Turker 5 Y

Result 5 0 5-0

24

Page 25: Not All Mementos Are Created Equal: Measuring The Impact Of Missing Mementos

Quantifying Turker Response

• 5 turkers for each comparison

• Assume $D_A < D_B$ (i.e., A is less damaged)

• Measure turker agreement:

Image A Image B Split

Turker 1 Y

Turker 2 Y

Turker 3 Y

Turker 4 Y

Turker 5 Y

Result 4 1 4-1

25

Page 26: Not All Mementos Are Created Equal: Measuring The Impact Of Missing Mementos

Quantifying Turker Response

• 5 turkers for each comparison

• Assume $D_A < D_B$ (i.e., A is less damaged)

• Measure turker agreement:

Image A Image B Split

Turker 1 Y

Turker 2 Y

Turker 3 Y

Turker 4 Y

Turker 5 Y

Result 0 5 0-5

26

Page 27: Not All Mementos Are Created Equal: Measuring The Impact Of Missing Mementos

Quantifying Turker Response

• 5 turkers for each comparison

• Assume $D_A < D_B$ (i.e., A is less damaged)

• Measure turker agreement:

Image A Image B Split

Turker 1 Y

Turker 2 Y

Turker 3 Y

Turker 4 Y

Turker 5 Y

Result 0 5 0-5

27

No agreement!

Page 28: Not All Mementos Are Created Equal: Measuring The Impact Of Missing Mementos

Quantifying Turker Response

• 5 turkers for each comparison

• Assume $D_A < D_B$ (i.e., A is less damaged)

• Measure turker agreement:

Image A Image B Split

Turker 1 Y

Turker 2 Y

Turker 3 Y

Turker 4 Y

Turker 5 Y

Result 3 2 3-2

28

Page 29: Not All Mementos Are Created Equal: Measuring The Impact Of Missing Mementos

Quantifying Turker Response

• 5 turkers for each comparison

• Assume $D_A < D_B$ (i.e., A is less damaged)

• Measure turker agreement: defined only by 4-1 and 5-0 splits

Image A Image B Split

Turker 1 Y

Turker 2 Y

Turker 3 Y

Turker 4 Y

Turker 5 Y

Result 3 2 3-2

29

Split decision No agreement!
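The tallying rule on these slides can be expressed in a few lines. This sketch assumes each vote names the image a turker judged to be less damaged; the function name and return shape are hypothetical.

```python
def turker_outcome(votes, expected="A", min_majority=4):
    """Summarize five turker votes for one A/B comparison.

    votes: sequence of "A" or "B", the image each turker preferred
           (assumed meaning of the "Y" marks in the table).
    expected: the image the damage rating says is less damaged (D_A < D_B -> "A").
    Returns (split, agrees), where agrees is True only for 4-1 and 5-0
    splits in favor of the expected image.
    """
    a = sum(1 for v in votes if v == "A")
    b = len(votes) - a
    votes_for_expected = a if expected == "A" else b
    return f"{a}-{b}", votes_for_expected >= min_majority


unanimous = turker_outcome(["A", "A", "A", "A", "A"])
strong = turker_outcome(["A", "A", "B", "A", "A"])
reversed_vote = turker_outcome(["B", "B", "B", "B", "B"])
split = turker_outcome(["A", "B", "A", "B", "A"])
```

Only the first two outcomes count as agreement with the damage rating; a 0-5 result and a 3-2 split decision both count as no agreement.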

Page 30: Not All Mementos Are Created Equal: Measuring The Impact Of Missing Mementos

Turk Results

• Compared damage ($D_M$) and percent missing ($M_M$)

• M0: Manually damaged mementos

• D: Internet Archive Mementos

• M: Percent missing in Internet Archive Mementos

• $D_M$ vs. Live: 78.9% true positives

• $M_M$ vs. Live: 47.2% true positives

• Worse than a 50/50 chance!

• $D_M$ vs. $D_M$: 58.4% true positives

30

Page 31: Not All Mementos Are Created Equal: Measuring The Impact Of Missing Mementos

Damage in the Internet Archive

• 1,000 URI-Rs from Bitly

• 1,000 URI-Rs from Archive-it

• Remove non-HTML representations

• 1,861 URI-Rs remaining

• Sample 1 memento per year from Internet Archive

• Measure damage
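The sampling step can be sketched as below. The slides do not say which memento is chosen within a year, so this example assumes the earliest capture of each year is kept; the function name is hypothetical and the input is simply a list of capture datetimes for one URI-R.

```python
from datetime import datetime


def one_memento_per_year(memento_datetimes):
    """Keep one capture per calendar year (assumption: the earliest one)."""
    chosen = {}
    for dt in sorted(memento_datetimes):
        # setdefault keeps the first (earliest) capture seen for each year.
        chosen.setdefault(dt.year, dt)
    return [chosen[year] for year in sorted(chosen)]


captures = [
    datetime(2010, 5, 1),
    datetime(2010, 1, 3),
    datetime(2012, 7, 7),
]
sample = one_memento_per_year(captures)
```

Years with no capture are simply absent from the sample, so the number of mementos measured per URI-R can vary.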

31

Page 32: Not All Mementos Are Created Equal: Measuring The Impact Of Missing Mementos

Damage in the Internet Archive

• Measured Internet Archive mementos

• Damage generally decreases over time (quality improves)

• Despite missing more resources over time

32

Page 33: Not All Mementos Are Created Equal: Measuring The Impact Of Missing Mementos

Conclusions

• $D_M$ is a better measure of memento quality than $M_M$

• On average, the Internet Archive is improving its quality over time

• Internet Archive is also missing more embedded resources over time

• Improved damage weighting (58.4% correct can be improved)

• Measure cumulative temporal damage ratings

• E.g., a logo that never changes for 10 years and is used by 100 mementos is more important than the one used in a single memento.

33