Reconstructing the past with MediaWiki:
Programmatic Issues and Solutions
Shawn M. Jones
[email protected]
Old Dominion University
Reconstructing the Past with the Internet Archive
HTML
Images
JavaScript
CSS
Our goal: Temporal Coherence
Make the page look as it looked at the time it was archived.
Some Results from the Internet Archive Are Lacking
Images change between the time the Archive crawls the main page and the time it gets to the images
Sometimes embedded images are missing when the Archive gets to them
Sometimes the page is designed with a specific browser in mind
Image from “A Framework for Evaluation of Composite Memento Temporal Coherence” by S. Ainsworth, M. L. Nelson, H. Van de Sompel. http://arxiv.org/abs/1402.0928
MediaWiki Shouldn’t Have This Problem
HTML
Images
JavaScript
CSS
What we’re not doing
Interest in Reconstructing the Past With MediaWiki
Simplified Memento Overview
Rules for Reconstructing the Past With MediaWiki
Do not modify any existing MediaWiki code!
Conform to MediaWiki coding standards
And…
Reconstructing the Past
Articles
Templates
Embedded Images
Embedded JavaScript
Embedded CSS
Accessing Old Article Text
The oldid argument references a revision of a page within MediaWiki's database
Merely visiting the URI with the oldid will give you the text content of the page as it existed at that revision
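As a rough illustration of what such a URI looks like, the sketch below assembles one in Python. The base URI, page title, and revision number come from the sample TimeGate response in the backup slides; the helper name `oldid_uri` is hypothetical and not part of MediaWiki.

```python
from urllib.parse import urlencode


def oldid_uri(index_php: str, title: str, oldid: int) -> str:
    """Build a URI that asks MediaWiki for one specific revision of a page."""
    # `title` names the page; `oldid` selects the revision stored in the database.
    query = urlencode({"title": title, "oldid": oldid})
    return f"{index_php}?{query}"


uri = oldid_uri(
    "http://ws-dl-05.cs.odu.edu/demo/index.php", "Daenerys_Targaryen", 1499
)
print(uri)
```

Fetching the printed URI would return the article text for that revision only; templates, images, JavaScript, and CSS are separate problems.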
Reconstructing the Past
Articles: Handled by Memento MediaWiki Extension
Templates
Embedded Images
Embedded JavaScript
Embedded CSS
Including the Right Template
This gives us:
$title - the Title object for the given page
$parser - the Parser object for the given page
$id - the revision ID (oldid) for the Template page
Using $parser and $title, we can change the $id and fetch an old revision of the Template
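The core decision here is which Template revision to substitute: the newest one saved at or before the time of the article revision being viewed. The Python sketch below shows only that selection logic, not the extension's actual PHP code. The function name, the revision IDs, and the dates are hypothetical.

```python
from bisect import bisect_right
from datetime import datetime, timezone
from typing import Optional, Sequence, Tuple

Revision = Tuple[datetime, int]  # (timestamp, revision ID)


def revision_in_effect(revisions: Sequence[Revision], when: datetime) -> Optional[int]:
    """Return the ID of the newest revision saved at or before `when`.

    `revisions` must be sorted by timestamp, oldest first.
    Returns None if the template did not exist yet at `when`.
    """
    timestamps = [ts for ts, _ in revisions]
    idx = bisect_right(timestamps, when)
    return revisions[idx - 1][1] if idx else None


# Hypothetical template history, oldest first.
template_history = [
    (datetime(2012, 1, 10, tzinfo=timezone.utc), 101),
    (datetime(2013, 3, 5, tzinfo=timezone.utc), 245),
    (datetime(2014, 2, 20, tzinfo=timezone.utc), 390),
]
article_time = datetime(2013, 6, 6, tzinfo=timezone.utc)
print(revision_in_effect(template_history, article_time))
```

With this made-up history, an article revision from June 6, 2013 would be rendered with Template revision 245 rather than the current revision 390.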
Reconstructing the Past
Articles: Handled by Memento MediaWiki Extension
Templates: Handled by Memento MediaWiki Extension
Embedded Images
Embedded JavaScript
Embedded CSS
But What About Images?
This map is important to understanding the content of this article
This image is changed as the article is changed, to reflect its content
It’s the same map if we look at the June 6, 2013 revision now
Users can't view this embedded resource as it looked in June 2013 while reading the article from that time period
What should have happened
This is the map from June 2013 that should have been displayed
This is the current map
The content of the article won't match the data in this visual aid, possibly confusing a user who wanted historical information on this topic
We Tried To Solve This
Upon further inspection of the code in MediaWiki, the $time argument from this function is never used as detailed here
We Just Solved This
Upon further inspection of the code in MediaWiki, the $file argument’s getHistory() function can be used to acquire previous revisions of images
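Once an image's upload history is available, choosing the version to embed is a small search. The Python sketch below illustrates the idea under stated assumptions: the history is a list ordered newest first with the current version as its first entry, and the `ImageVersion` type, the function name, and the example.org URLs are all hypothetical. It does not reproduce the prototype's PHP code or the exact shape of what getHistory() returns.

```python
from datetime import datetime, timezone
from typing import List, NamedTuple, Optional


class ImageVersion(NamedTuple):
    uploaded: datetime  # when this version of the file was uploaded
    url: str            # where this version can be fetched


def image_as_of(versions: List[ImageVersion], when: datetime) -> Optional[ImageVersion]:
    """Pick the image version a reader would have seen at `when`.

    `versions` is assumed to be ordered newest first, so the first entry
    uploaded at or before `when` is the one that was in effect.
    """
    return next((v for v in versions if v.uploaded <= when), None)


# Hypothetical upload history for the map, newest first.
map_versions = [
    ImageVersion(datetime(2014, 3, 1, tzinfo=timezone.utc), "http://example.org/images/map-current.png"),
    ImageVersion(datetime(2013, 5, 20, tzinfo=timezone.utc), "http://example.org/images/map-2013.png"),
]
chosen = image_as_of(map_versions, datetime(2013, 6, 6, tzinfo=timezone.utc))
print(chosen.url if chosen else "no version existed yet")
```

In this example, the June 6, 2013 article revision would get the May 2013 map instead of the current one.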
Reconstructing the Past
Articles: Handled by Memento MediaWiki Extension
Templates: Handled by Memento MediaWiki Extension
Embedded Images: Prototyped for future version of Memento MediaWiki Extension
Embedded JavaScript
Embedded CSS
What about CSS/JavaScript?
The present CSS of this page conflicts with the past Template.
We Couldn’t Solve This
The data is present, but we could not find any way for an extension to access or render it.
Recap on Reconstructing the Past
Articles: Handled by Memento MediaWiki Extension
Templates: Handled by Memento MediaWiki Extension
Embedded Images: Prototyped for future version of Memento MediaWiki Extension
Embedded JavaScript: Requires changes to MediaWiki
Embedded CSS: Requires changes to MediaWiki
Uniform solution
• RFC 7089, Memento, was designed to provide uniform access to past versions of all resources on the Web
• Memento provides a web standard to access these resources
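In RFC 7089, a client asks a TimeGate for a past version by sending the desired datetime in an Accept-Datetime request header, formatted as an HTTP date in GMT. The Python sketch below builds only that header and sends nothing over the network. The function name is hypothetical, and the datetime is the one shown in the sample URI-M response in the backup slides.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from typing import Dict


def accept_datetime_header(when: datetime) -> Dict[str, str]:
    """Build the request header used for Memento datetime negotiation."""
    # Convert to UTC first; format_datetime(usegmt=True) requires a UTC datetime
    # and produces the "Sun, 22 Apr 2007 15:01:20 GMT" style of HTTP date.
    http_date = format_datetime(when.astimezone(timezone.utc), usegmt=True)
    return {"Accept-Datetime": http_date}


headers = accept_datetime_header(datetime(2007, 4, 22, 15, 1, 20, tzinfo=timezone.utc))
print(headers)
```

A client would send this header with a GET or HEAD request to the TimeGate URI and follow the redirect it receives.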
Resources
• Memento Protocol: http://tools.ietf.org/html/rfc7089
• Memento Website: http://www.mementoweb.org/
• Memento MediaWiki Extension: http://www.mediawiki.org/wiki/Extension:Memento
• Memento Chrome Extension: http://bit.ly/memento-for-chrome
• More details: http://ws-dl.blogspot.com/2014/04/2014-04-01-yesterdays-wiki-page-todays.html
• Contact me: [email protected]
Backup Slides
Sample URI-R (Step 1) HTTP Response
HTTP/1.1 200 OK
Date: Sun, 25 May 2014 21:39:02 GMT
Server: Apache
X-Content-Type-Options: nosniff
Link: <http://ws-dl-05.cs.odu.edu/demo/index.php/Daenerys_Targaryen>; rel="original latest-version",
 <http://ws-dl-05.cs.odu.edu/demo/index.php/Special:TimeGate/Daenerys_Targaryen>; rel="timegate",
 <http://ws-dl-05.cs.odu.edu/demo/index.php/Special:TimeMap/Daenerys_Targaryen>; rel="timemap"; type="application/link-format"
Content-language: en
Vary: Accept-Encoding,Cookie
Cache-Control: s-maxage=18000, must-revalidate, max-age=0
Last-Modified: Sat, 17 May 2014 16:48:28 GMT
Connection: close
Content-Type: text/html; charset=UTF-8
Sample URI-G (Step 2) HTTP Response
HTTP/1.1 302 Found
Date: Sun, 25 May 2014 21:43:08 GMT
Server: Apache
X-Content-Type-Options: nosniff
Vary: Accept-Encoding, Accept-Datetime
Location: http://ws-dl-05.cs.odu.edu/demo/index.php?title=Daenerys_Targaryen&oldid=1499
Link: <http://ws-dl-05.cs.odu.edu/demo/index.php/Special:TimeMap/Daenerys_Targaryen>; rel="timemap"; type="application/link-format",
 <http://ws-dl-05.cs.odu.edu/demo/index.php/Daenerys_Targaryen>; rel="original latest-version"
Connection: close
Content-Type: text/html; charset=UTF-8
Sample URI-M (Step 3) HTTP Response
HTTP/1.1 200 OK
Date: Sun, 25 May 2014 21:46:12 GMT
Server: Apache
X-Content-Type-Options: nosniff
Memento-Datetime: Sun, 22 Apr 2007 15:01:20 GMT
Link: <http://ws-dl-05.cs.odu.edu/demo/index.php/Daenerys_Targaryen>; rel="original latest-version",
 <http://ws-dl-05.cs.odu.edu/demo/index.php/Special:TimeGate/Daenerys_Targaryen>; rel="timegate",
 <http://ws-dl-05.cs.odu.edu/demo/index.php/Special:TimeMap/Daenerys_Targaryen>; rel="timemap"; type="application/link-format"
Content-language: en
Vary: Accept-Encoding,Cookie
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Cache-Control: private, must-revalidate, max-age=0
Connection: close
Content-Type: text/html; charset=UTF-8
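A client moves between these three steps by reading the Link header. The Python sketch below shows one simplified way to map relation types to URIs. It assumes every link value has the form `<URI>; rel="..."` with a quoted rel as the first parameter, which holds for the samples above but not for every valid Link header. The function name is hypothetical.

```python
import re
from typing import Dict

# Matches one link value: a URI in angle brackets followed by a quoted rel.
LINK_VALUE = re.compile(r'<([^>]*)>\s*;\s*rel="([^"]*)"')


def links_by_relation(link_header: str) -> Dict[str, str]:
    """Map each relation type in a Link header to its target URI."""
    relations: Dict[str, str] = {}
    for uri, rel in LINK_VALUE.findall(link_header):
        # A rel value may list several relation types separated by spaces,
        # e.g. "original latest-version".
        for name in rel.split():
            relations[name] = uri
    return relations


link_header = (
    '<http://ws-dl-05.cs.odu.edu/demo/index.php/Daenerys_Targaryen>; rel="original latest-version", '
    '<http://ws-dl-05.cs.odu.edu/demo/index.php/Special:TimeGate/Daenerys_Targaryen>; rel="timegate", '
    '<http://ws-dl-05.cs.odu.edu/demo/index.php/Special:TimeMap/Daenerys_Targaryen>; rel="timemap"; '
    'type="application/link-format"'
)
links = links_by_relation(link_header)
print(links["timegate"])
```

If the same relation type appears more than once, this sketch keeps only the last URI; a fuller client would collect a list per relation.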