WARP: Web Archiving Project
JSIST Class 2007
Kenichiro Shimada
Gordon W. Prange Collection
University of Maryland Libraries
NCC 2008 Open Meeting
Hyatt Regency Hotel, Atlanta, April 3, 2008
WARP: NDL's Research Tools for a Type of Web Resources
One-stop search for free databases and archived web sites
Alternative web research tools for
- Obsolete web pages
WARP: Web Archiving Project
http://warp.ndl.go.jp/
2,100 websites and 1,500 e-journal titles (Jan. 2008)
Fed. Gov., Prefectures, Designated Cities, Municipal merger, Foundations, Organizations, Universities, Events, E-Journals
Web Archiving Project (Cont.)
Harvesting Web Resources by Robot
HTML (front page) and linked pages
- Not possible to collect deep web pages, e.g. pages implemented with CGI (Common Gateway Interface)
http://warp.ndl.go.jp/ft_WARP_Mechanism.pdf
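The front-page-plus-linked-pages harvesting described above can be illustrated with a minimal sketch. This is not WARP's actual robot: the class and function names, the sample HTML, the example.org URL, and the rule used to guess which links are "deep web" (a query string, a cgi-bin path, or a .cgi suffix) are all assumptions made for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags on a single page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def looks_dynamic(url):
    """Heuristic (an assumption, not WARP's rule): treat query strings,
    cgi-bin paths and .cgi suffixes as pages a robot cannot harvest."""
    parts = urlparse(url)
    path = parts.path.lower()
    return bool(parts.query) or "/cgi-bin/" in path or path.endswith(".cgi")


def split_links(front_page_url, html):
    """Return (static, dynamic) lists of absolute URLs linked from the page."""
    collector = LinkCollector()
    collector.feed(html)
    static, dynamic = [], []
    for href in collector.hrefs:
        absolute = urljoin(front_page_url, href)
        (dynamic if looks_dynamic(absolute) else static).append(absolute)
    return static, dynamic


# Hypothetical front page used only to exercise the sketch.
SAMPLE_HTML = """
<html><body>
<a href="about.html">About</a>
<a href="/news/2002/index.html">News</a>
<a href="/cgi-bin/search.cgi?q=stadium">Search</a>
</body></html>
"""

static_links, dynamic_links = split_links(
    "http://www.example.org/index.html", SAMPLE_HTML
)
print("harvestable:", static_links)
print("skipped (deep web):", dynamic_links)
```

A real harvester would also fetch each static link and repeat the process to a configured depth; the sketch stops at one page to keep the static/dynamic distinction visible.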
Permission-based
Stakeholders can specify (limit) the extent of web harvesting and access
Advantages: WARP
WARP
- Can retrieve obsolete web pages (currently not available)
e.g. Website of "Japanese Organizing Committee for the 2002 FIFA World Cup Korea/Japan"
Original URL: http://www.jawoc.or.jp/index.html
Example of Obsolete Website available through WARP
Notes: WARP
WARP
- Not all government bodies have given consent
e.g. Prefectures (33 out of 47)
Ordinance-designated cities (13 out of 17)
- Stakeholders can specify the extent of harvesting and access
- Check the date of data harvesting
Example of Harvesting Specification
Example of Access Restriction by Stakeholder 1
Example of Access Restriction by Stakeholder 2