Banal Because Format Checking is So Trite


description

Banal Because Format Checking is So Trite. Geoffrey M. Voelker, University of California, San Diego. Workshop on Organizing Workshops, Conferences, and Symposia for Computer Systems (WOWCS’08). This Talk is Not Very Interesting. Banal is a format checker for PDF documents.

Transcript of Banal Because Format Checking is So Trite

Page 1: Banal Because Format Checking  is So Trite

Banal

Because Format Checking is So Trite

Geoffrey M. Voelker
University of California, San Diego

Workshop on Organizing Workshops, Conferences, and Symposia for Computer Systems (WOWCS’08)

Page 2: Banal Because Format Checking  is So Trite

This Talk is Not Very Interesting

Banal is a format checker for PDF documents
  Deduces how a document was formatted
  Optionally compares it with a specification

Intended for conference management systems
  Now being used in HotCRP and EDAS
  Seemed timely to document its genesis and implementation
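
The "deduce, then optionally compare" split described on this slide can be illustrated with a minimal sketch. The slides do not show Banal's specification format, so everything here is an assumption made for illustration: the class names `FormatSpec` and `DeducedFormat`, their fields, and the function `check_format` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FormatSpec:
    """Hypothetical submission rules (field names are illustrative only)."""
    max_pages: int
    min_margin_in: float     # smallest allowed margin, in inches
    min_body_font_pt: float  # smallest allowed body font, in points


@dataclass
class DeducedFormat:
    """Hypothetical result of analysing one PDF."""
    pages: int
    smallest_margin_in: float
    body_font_pt: float


def check_format(doc: DeducedFormat, spec: FormatSpec) -> list[str]:
    """Return a human-readable list of violations; empty means compliant."""
    problems = []
    if doc.pages > spec.max_pages:
        problems.append(f"too many pages: {doc.pages} > {spec.max_pages}")
    if doc.smallest_margin_in < spec.min_margin_in:
        problems.append(
            f"margin too small: {doc.smallest_margin_in}in < {spec.min_margin_in}in"
        )
    if doc.body_font_pt < spec.min_body_font_pt:
        problems.append(
            f"body font too small: {doc.body_font_pt}pt < {spec.min_body_font_pt}pt"
        )
    return problems


spec = FormatSpec(max_pages=14, min_margin_in=1.0, min_body_font_pt=10.0)
squeezed = DeducedFormat(pages=15, smallest_margin_in=0.75, body_font_pt=9.0)
compliant = DeducedFormat(pages=14, smallest_margin_in=1.0, body_font_pt=10.0)

for problem in check_format(squeezed, spec):
    print(problem)
```

Keeping the deduced format separate from the specification means the same analysis can be reported on its own or checked against any conference's rules.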

April 15, 2008 WOWCS’08 2

Page 3: Banal Because Format Checking  is So Trite

Why?

Preserving reviewer anonymity
  Acrobat JavaScript that calls home when the PDF is loaded

Assisting conference management tasks
  Ensuring anonymity rules
  Possibly helping do initial assignments by mining the bibliography

Fairness
  Everyone else obeyed the rules…

Time
  Already enough time spent on reviewing
  Frustrated that abuse meant taking even more of my time


Page 4: Banal Because Format Checking  is So Trite

How?

Convert PDF
  To XML (with pdftohtml)

Track the locations of all segments of text, essentially form bounding boxes

Compute margins, columns, body font, etc.
  Heuristics for page #s, headers, footers, etc.
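
The steps on this slide can be sketched roughly as follows. This is not Banal's code. It is a minimal illustration that assumes the XML has the shape produced by `pdftohtml -xml`: `<page>` elements with `width` and `height`, `<fontspec>` elements with `id` and `size`, and `<text>` elements with `top`, `left`, `width`, `height`, and `font`. The sample document is hand-written, the function name `deduce_layout` and the `column_tolerance` parameter are hypothetical, and the column heuristic is a deliberately crude stand-in.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hand-written sample in the shape of `pdftohtml -xml` output (not a real paper).
# Real output also carries an XML declaration and DOCTYPE, omitted here.
SAMPLE_XML = """<pdf2xml>
<page number="1" top="0" left="0" height="1188" width="918">
<fontspec id="0" size="15" family="Times" color="#000000"/>
<fontspec id="1" size="24" family="Times" color="#000000"/>
<text top="108" left="108" width="702" height="28" font="1">A Title</text>
<text top="160" left="108" width="340" height="17" font="0">Body text in the left column of the page.</text>
<text top="160" left="470" width="340" height="17" font="0">Body text in the right column of the page.</text>
<text top="1060" left="108" width="340" height="17" font="0">Last line of the left column.</text>
</page>
</pdf2xml>"""


def deduce_layout(xml_text, column_tolerance=20):
    """Deduce margins, body font size, and column count for the first page."""
    root = ET.fromstring(xml_text)
    # Fonts are declared in <fontspec> elements and referenced by id.
    font_size = {f.get("id"): int(f.get("size")) for f in root.iter("fontspec")}
    page = root.find("page")  # first page only, to keep the sketch short
    page_w, page_h = int(page.get("width")), int(page.get("height"))

    boxes = []                  # (left, top, right, bottom, font size)
    chars_per_size = Counter()  # amount of text set in each font size
    for t in page.iter("text"):
        left, top = int(t.get("left")), int(t.get("top"))
        width, height = int(t.get("width")), int(t.get("height"))
        size = font_size[t.get("font")]
        boxes.append((left, top, left + width, top + height, size))
        # itertext() also collects text nested inside <b> or <i> children.
        chars_per_size[size] += len("".join(t.itertext()))

    # Body font: the size used for the most characters.
    body_size = chars_per_size.most_common(1)[0][0]

    # Margins: distance from each page edge to the union of all text boxes.
    margins = {
        "left": min(b[0] for b in boxes),
        "top": min(b[1] for b in boxes),
        "right": page_w - max(b[2] for b in boxes),
        "bottom": page_h - max(b[3] for b in boxes),
    }

    # Columns: count clearly separated left edges among body-font segments.
    column_starts = []
    for left in sorted(b[0] for b in boxes if b[4] == body_size):
        if not column_starts or left - column_starts[-1] > column_tolerance:
            column_starts.append(left)

    return {"margins": margins, "body_font_size": body_size,
            "columns": len(column_starts)}


layout = deduce_layout(SAMPLE_XML)
print(layout)
```

All values are in the page units of the XML rather than inches or points. A real checker would also need the page number, header, and footer heuristics the slide mentions, so that such text does not distort the margin estimate.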


Page 5: Banal Because Format Checking  is So Trite

Where?

A handful of SIGOPS/SIGCOMM conferences
  OSDI’06, SIGCOMM’07, SIGCOMM’08

Eddie Kohler has integrated it into HotCRP

Henning Schulzrinne also integrated banal with EDAS
  Since 2006, used for over 800 events


Page 6: Banal Because Format Checking  is So Trite

So?

What are our community goals for having formatting requirements?
  Evil: Annoying trifles that negatively impact our ability to communicate our results and ideas?
  Helpful: Reflect practicalities of publishing costs and community time?

Not surprisingly, I’m in the practical camp
