Post on 25-Dec-2015
Thesis
• Project has great potential to make good information available to more scholars.
• The future of libraries is digital.
Main Points
• Why?
• Genesis: Leaders, Partners
• Realities: Collections and logistics
• Dreams: Comments about the vision
Why?
“Libraries are very unevenly distributed across the world and within countries. Even in the U.S. there are enormous differences. Now technology makes possible a universal world library in which every person has access to anything written.”
“In the end, this will be Vannevar Bush’s Memex.”
- Michael Lesk, Internet Archive
Bush, “As We May Think” Atlantic Monthly (July 1945)
http://www.theatlantic.com/unbound/flashbks/computer/bushf.htm
Leaders
• Co-Directors: Raj Reddy and Gloriana St. Clair
• Michael Shamos: intellectual property; e-commerce
• Jaime Carbonell: machine translation; information mining; auto-summarization
Leaders
• Gabrielle Michalek: pioneered five digitization projects
• Erika Linke: collections; intellectual property
• Denise Troll Covey: digital libraries; user studies
Partners
• National Science Foundation
2001 $665,600
2002 $1,000,000
2003 $1,000,000
2004 $1,000,000
for equipment and travel
Partners
• Carnegie Mellon University
• Carnegie Library of Pittsburgh
• Indiana University
• National Agricultural Library
• OCLC
• Penn State University
• Stanford University
• University of California, Berkeley
• University of Washington
Partners
• China: Beijing University • Chinese Academy of Science • Fudan University • Ministry of Education of China • Nanjing University • State Planning Commission of China • Tsinghua University • Zhejiang University
• India: Arulmigu Kalasalingam College of Engineering • Goa University • Indian Institute of Science • Indian Institute of Information Technology–Allahabad • International Institute of Information Technology–Hyderabad • Maharashtra Industrial Development Corporation • Tirumala Tirupati Devasthanams • Shanmugha Arts, Science, Technology and Research Academy • University of Pune
Collections
• BCL: 60,000 “best” for college libraries
• U.S. government documents
• British Parliamentary Papers
• Partners’ unique cultural treasures
• University press negotiations
• Copyright clearance projects for targeted subject areas
Constraints
• The collection must be composed of many sub-collections.
• Librarians will be consulted to ensure solid selection criteria.
• Copyright is a serious barrier to an effective effort.
Research Initiatives
• Machine translation
• Massive distributed database
• Storage formats
• Use of digital libraries
• Distribution and sustainability
• Security
• Search engines
• Image processing
• Optical Character Recognition (OCR)
• Language processing
• Copyright laws
Distribution and Sustainability
• Library of Congress, Digital Preservation
• OCLC
• RLG
• STOR family
• University presses
• Commercial vendors
Newest Developments
• Over 1 million pages scanned at one center in India
• Famous Hyderabad library to be scanned
• Million Book Project FAQ http://www.library.cmu.edu/Libraries/MBP_FAQ.html
Thank You and Q & A
Gloriana St. Clair
Carnegie Mellon University
gstclair@andrew.cmu.edu