Bendavid unpacking archival_silences_guest_lecture_18022013

Post on 25-Jan-2015


Unpacking Archival Silences

A short history of Web archives research

Anat Ben-David, University of Amsterdam, February 2013

Image: Luc Viatour / www.Lucnix.be

Monday, February 18, 13

What are Web Archives For?

1. Preservation of (national) digital cultural heritage

- "...web resources which are collected with the aim of their long-term preservation." (Czech Web archive)

- "The Archive's mission is gathering and long-term preservation of Internet publications as part of the Croatian national heritage" (Croatian Web archive)

- "...these websites were carefully selected to be part of the nation's documentary heritage." (Singapore Web Archive)

2. Responding to a preservation risk

- "...the present generation may be considered as a forgotten dark age by future generations if we neglect to select and preserve digital resources at country level" (South Korea Web archive)

- "...These days, documents are increasingly being published only digitally. If we do not preserve the information, part of our heritage will be lost forever" (Swedish Web archive)

- "...Responding to the challenge of a potential ‘digital black hole’ the UK Web Archive is there to safeguard as many of these websites as practical." (UK Web Archive)

3. Viewing past versions of a Website

- "...You can see archived websites in their original version. Our service will help you search efficiently and quickly for an important publication in the flood of information on the Internet" (Japan Web archive)

- "...The collection also provides a visual history of how websites change over time" (New Zealand Web archive)

- "...Warning - The current version of the site may no longer be available" (Latvian Web Archive)

4. And... also for research

- "...This makes the web an important source for future researchers, not only for studies of the development of the web but certainly for research on society today" (Dutch Web archive)

- "...All materials are archived and available for use by researchers and others who need them in their studies - now and in the future." (Finland Web archive)

- "...Web history can provide a tremendous base for time-based analysis of the content, the topology including emerging communities and topics, trends analysis etc. as well as an invaluable source of information for the future" (European Archive)

“Archival Silences” (?)

- “Web archives will be the digital equivalent of the dusty archive, often well-curated and maintained, but hardly used” -- (Meyer et al., 2011, p. 7)

- “One must ask, in the world of Internet research, why do Web archives appear to be second class citizens?” -- (Meyer et al., 2011, p. 9)

- “Web archiving infrastructure receives scholarly and non-scholarly attention; the archived materials – the primary source material – gain less notice” -- (Rogers 2013, p. 85)

- “There is a growing gulf in web archiving between the researchers who want to use web artifacts to study in their field and the information professional who serve information needs” -- (Dougherty & Heuvel 2010, p. 6)

Image source: http://static.guim.co.uk/sys-images/Books/Pix/pictures/2009/10/16/1255686935351/Dusty-bookshelf-001.jpg

A short history of Web archives

- 1996-1998 Web archive as a Web index

- 1999- Web archives as special collections

- 2000- The national turn in Web archiving

- 2005- Emerging Web archiving theory

1. Web Archive as a Web Index

- 1996- the Internet Archive and the Wayback Machine

- Crawlers as the ultimate collection-makers of the Web

- Navigational tool - together with the Alexa Toolbar, providing a solution for accessing broken links

- Organizational tool - borrowing from Library Science and Scientometrics

- Web archive as a digital library

Image: http://www.wired.com/images_blogs/threatlevel/images/2008/05/07/brewster_kahle_630x.jpg

Alexa Toolbar

Internet Archive Wayback Machine
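
The navigational role of the Wayback Machine (reaching an archived copy when a live link is broken) can be sketched in Python. The sketch assumes the Wayback Machine's public replay URL pattern, https://web.archive.org/web/<timestamp>/<original URL>, which resolves to the capture closest to the given timestamp. The function name `wayback_url` and the example address are illustrative assumptions, not part of the lecture.

```python
from datetime import date

# Public replay prefix of the Internet Archive Wayback Machine.
WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(original_url: str, when: date) -> str:
    """Build a replay URL asking for the capture closest to `when`.

    Only the URL string is built here; no network request is made,
    and nothing guarantees that a capture actually exists.
    """
    timestamp = when.strftime("%Y%m%d")  # the archive accepts YYYYMMDD prefixes
    return f"{WAYBACK_PREFIX}/{timestamp}/{original_url}"


if __name__ == "__main__":
    # Hypothetical example: a front page as captured around 16 October 2006.
    print(wayback_url("http://www.nytimes.com/", date(2006, 10, 16)))
```

Opening the resulting address in a browser redirects to the nearest available capture, if one exists.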

2. Web Archives as Special Collections

• Foot and Schneider 1999 - “Web Sphere Analysis”

• Collections of elections, natural disasters and “transitions” continue to dominate the field

• Content and hyperlink analysis

3. The national turn in Web archiving

Web archiving at a national scale raises new questions and challenges:

- What is a national Web? How to define national cultural heritage on the Web?

- Scale: full domain harvest (e-depot) or curation?

- Selection criteria and policy

- Infrastructure, formats, accessibility

- How is a web archive different from other digital collections maintained by national libraries? Web archives as institutions

http://timeline.webarchivists.org/

4. Emerging Web Archiving Theory

Some distinctions:

- Web archives as tools for research / as an object of study

- Web History / Digital History

- Website / Website in its archived environment

- Digitized objects / Digital Objects / “Re-born digital objects” (Brügger 2012)

Types of Web Historiography enabled

Rogers (2013):

- Single site historiography

- Collection making

- Link analysis, while attempting to figure out what is missing

- Evolution of digital objects (such as source code, cookies or tracking devices)

Single website history - Capture the history of a website, and play it back as a screencast documentary (time-lapse photography)

"Google and the politics of tabs" by Govcom.org, Amsterdam, 2008.

Collection making. Build collections from the archive (e.g., Dutch extremist sites by NRC Handelsblad)

Historical link analysis over time, Ben-David (2011)

Weltevrede & Helmond 2012

Ghostery detecting trackers on an archived frontpage of the New York Times from 16 October 2006 in the Internet Archive.

Number of trackers per year on the New York Times frontpage. Green: ad, orange: tracker, blue: analytics, pink: widget. Categorization provided by Ghostery.

 Helmond (2013)
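
A per-year tracker count like the one in the figure above could be approximated along the following lines. This is only a sketch under stated assumptions, not the method used by Helmond or by Ghostery: the tracker list `KNOWN_TRACKERS` is a tiny hypothetical stand-in for Ghostery's categorized database, the function names are invented for illustration, and the snapshot HTML is assumed to still carry the original resource URLs rather than addresses rewritten by an archive's replay interface.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical stand-in for a categorized tracker database (assumption).
KNOWN_TRACKERS = {
    "doubleclick.net": "ad",
    "google-analytics.com": "analytics",
}


class ResourceHostParser(HTMLParser):
    """Collect the host names of embedded scripts, images and iframes."""

    def __init__(self):
        super().__init__()
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "img", "iframe"):
            src = dict(attrs).get("src")
            if src:
                host = urlparse(src).hostname  # None for relative URLs
                if host:
                    self.hosts.add(host)


def trackers_in(html):
    """Return {tracker domain: category} for trackers referenced in the HTML."""
    parser = ResourceHostParser()
    parser.feed(html)
    found = {}
    for host in parser.hosts:
        for domain, category in KNOWN_TRACKERS.items():
            if host == domain or host.endswith("." + domain):
                found[domain] = category
    return found


def trackers_per_year(snapshots):
    """snapshots maps a year to the HTML of one archived front page."""
    return {year: len(trackers_in(html)) for year, html in snapshots.items()}
```

Counting distinct tracker domains per snapshot, rather than individual tags, avoids inflating the count when one tracker is embedded several times on the same page.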

Types of Web Historiography precluded

- (Most) Web archives are not searchable

- (Most) Web archives are not accessible online

- Cross-collection comparison is difficult

- Wayback Machine “jump cuts through time” (Rogers, 2013)

WebART project: Web Archive Retrieval Tools

Jaap Kamps, Richard Rogers, Arjen de Vries, Paul Doorenbosch, René Voorburg, Victor-Jan Vos

Anat Ben-David, Hugo Huurdeman, Thaer Sammar

http://webarchiving.nl

THE INTERFACE

http://178.228.147.61:8080/

“DICTATORS” FREQUENCY OVER TIME

Chart: new articles about “dictators” over time. Vertical axis ticks: 0 to 800. Horizontal axis: 17/05/2011 to 16/04/2013. Series: Mubarek, Assad, Putin, Kim Jung Il, Fidel Castro, Raul Castro.
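
The slides do not say how this chart was computed. As a hedged illustration, a frequency-over-time count of this kind could be produced by grouping dated articles by month and counting how many mention each name. The function name, the input shape (pairs of publication date and text) and the simple substring matching are assumptions made for this sketch.

```python
from collections import Counter
from datetime import date


def mentions_per_month(articles, names):
    """Count, per name and per month, the articles that mention the name.

    articles: iterable of (publication_date, text) pairs (assumed shape).
    names: names to look for, matched case-insensitively as substrings.
    Returns {name: Counter({"YYYY-MM": number_of_articles})}.
    """
    counts = {name: Counter() for name in names}
    for published, text in articles:
        month = published.strftime("%Y-%m")
        lowered = text.lower()
        for name in names:
            if name.lower() in lowered:
                counts[name][month] += 1  # one count per article, not per occurrence
    return counts


if __name__ == "__main__":
    # Invented sample articles, for illustration only.
    sample = [
        (date(2011, 5, 17), "Assad addresses parliament"),
        (date(2011, 5, 20), "Putin and Assad meet"),
        (date(2012, 3, 12), "Putin wins election"),
    ]
    for name, per_month in mentions_per_month(sample, ["Assad", "Putin"]).items():
        print(name, dict(per_month))
```

Substring matching is crude: it misses spelling variants of transliterated names, so a real analysis would likely need a list of variants per person.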

https://www.google.com/fusiontables/DataSource?docid=1uK740ETdt-Vva9lLd63h3_2hAKguAyjCS6n1-wE#map:id=3

WIRE “FORENSICS”

IMAGE SEARCH RESULTS

IMAGE TIMELINE

http://labs.timelessfuture.com/timeline/

Questions?

Thank you

a.ben-david@uva.nl

Image: Luc Viatour / www.Lucnix.be
