Simeon Warner - 21 April 2003
The arXiv eprint archive
CS520 guest lecture
Simeon Warner (simeon@cs.cornell.edu)
What is arXiv?
• Eprint - lots of meanings, here: pre-print, post-print, no-print, refereed or un-refereed, openly accessible via internet
• 230,000 papers (35k/year, ~150/day)
• Mostly physics, some math and cs
• Mostly TeX/LaTeX source
• Automated PS/PDF production
• 17 mirror sites world-wide, nightly update
• Secondary submission site in Lyon, France
Full-text downloads/week
History
• 1992 - hep-th created (email reflector, messages were saved). ~200 users.
• 1992 - ftp interface. hep-ph… added
• 1993 - web interface
• 1994 - data on remote sites moved to main site, they become mirrors
• 1995 - automatic PS from TeX
• 1996 - PDF generation, mirror network grows
• 2003 - ~70,000 users
Not often cited as HCI exemplar
Submission form
Submission methods
Submissions to different areas
Why is cs archive failing?
• Why are physics/math working?
  - Strong pre-print culture in physics
  - Active promotion from inside communities
• CS less TeX based
• CS has stronger homepage culture?
• CS has Citeseer (aka ResearchIndex)
• CS researchers hate arXiv interface?
• CS has more conference publication?
LESSON: Can’t apply ‘one size fits all’
Different habits in CS (1)
Different habits in CS (2)
Current problems at arXiv
• Too much admin time
  - 150 new/day, 30 replacements/day
• Inappropriate submissions
  - Encouraged by arXiv’s popularity
  - Should be “peer reviewable” quality
  - Must be appropriate to subject areas
• Copyright PS/PDF
  - Don’t (can’t?) automatically detect
• Submission size
  - (otherwise) good tools produce bloated figures
Discipline-Based Repositories, Institutional Repositories, Open Access…
• Driven by cost, ideology, self-promotion…
• arXiv is exemplar archive
  - Largest, best-known, discipline based
  - Need open archiving
• Current interest in institutional repositories
  - Slice the other way to discipline based
  - Coexist?
• Metadata sharing - OAI
  - Essential infrastructure for more dispersed model than arXiv
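Metadata sharing via OAI works by sending simple HTTP requests to a repository. As a rough illustration only, the sketch below builds an OAI-PMH `ListRecords` request asking for Dublin Core (`oai_dc`) records. The base URL is a hypothetical placeholder, not arXiv's actual endpoint, and the helper name `build_oai_request` is made up for this example; nothing is fetched over the network.

```python
from urllib.parse import urlencode


def build_oai_request(base_url, verb, **params):
    """Return an OAI-PMH request URL for the given verb and arguments.

    Hypothetical helper for illustration; it only assembles the query string.
    """
    query = urlencode({"verb": verb, **params})
    return f"{base_url}?{query}"


# Assumption: placeholder repository address, not a real service.
BASE_URL = "http://example.org/oai"

# ListRecords with metadataPrefix=oai_dc asks for records in unqualified
# Dublin Core, the metadata format every OAI-PMH repository must support.
url = build_oai_request(BASE_URL, "ListRecords", metadataPrefix="oai_dc")
print(url)
```

A harvester would issue such requests against each participating repository and merge the returned metadata, which is what allows a dispersed set of archives to be searched as one.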
Publishers and Peer-Review
• Peer-reviewed publication necessary for academic rewards (promotion, tenure, prestige)
  - System slow to change
• Physics publishers accept/live-with arXiv
  - Can even submit to journals via arXiv
  - Not true in chemistry/biology
• Peer-review costs ~$500/paper
  - Where would that come from without subscriptions?
Resources
• arXiv: http://arXiv.org/
• Front (interface to math arXiv): http://front.math.ucdavis.edu/
• Algebraic & Geometric Topology (arXiv overlay journal): http://www.maths.warwick.ac.uk/agt/
• OAI: http://www.openarchives.org/
• BOAI: http://www.soros.org/openaccess