dd 2012 02 29 - Archive

Digital Humanities At Scale: Hathi Trust Research Center

Beth Plale, Co-Director, HathiTrust Research Center

Professor, School of Informatics and Computing, Indiana University

New questions require computational access to large corpus

Investigate the way in which concepts of philosophy are used in physics through

– Extracting argumentative structure from large dataset using mixture of automated and social computing techniques

– Capture evidence for conjecture that availability of such analyses will enable innovative interdisciplinary research

– Digging into Data 2012 award, Colin Allen, IU

New questions, cont.

Document, through software text analysis techniques, the appearance, frequency and context of terms, concepts and usages related to human rights in a selection of English-language novels.

– Ronnie Lipschutz of UCSC is currently doing this analysis on one of Jane Austen’s books. He’d like to extend the work to encompass a far larger corpus.
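The kind of analysis described on this slide (appearance, frequency and context of terms) can be sketched roughly as below. This is only an illustrative sketch, not the method used in the project: the sample sentence, the term list and the helper names (tokenize, term_frequencies, keyword_in_context) are all hypothetical, and a real study would need proper tokenization and OCR cleanup.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def term_frequencies(tokens, terms):
    """Count how often each term of interest appears in the tokens."""
    counts = Counter(tokens)
    return {term: counts[term] for term in terms}


def keyword_in_context(tokens, term, window=3):
    """Return (left context, term, right context) for every occurrence."""
    hits = []
    for i, tok in enumerate(tokens):
        if tok == term:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            hits.append((" ".join(left), tok, " ".join(right)))
    return hits


# Made-up sample text (not taken from any novel) and a hypothetical term list.
sample = ("Every person has a right to liberty. "
          "Liberty without justice is no right at all.")
tokens = tokenize(sample)
freqs = term_frequencies(tokens, ["right", "liberty", "justice"])
contexts = keyword_in_context(tokens, "liberty", window=2)
```

Here freqs holds the per-term counts and contexts holds a small window of words around each occurrence of "liberty"; scaling this from one book to a large corpus is where computational access to the collection becomes necessary.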

New questions, cont.

Identify all 18th century published books in HathiTrust corpus, and apply topic modeling to create consistent overall subject metadata

GOOGLE DIGITAL HUMANITIES AWARDS RECIPIENT

INTERVIEWS REPORT PREPARED FOR THE HATHITRUST RESEARCH CENTER

VIRGIL E. VARVEL JR. ANDREA THOMER

CENTER FOR INFORMATICS RESEARCH IN SCIENCE AND SCHOLARSHIP

UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN

Fall 2011

The study

• Dr. John Unsworth, a representative of HTRC, distributed invitations to participate in this study via email to 22 researchers given Google Digital Humanities Research Awards.

• Interviews were conducted via telephone, Skype®, or face-to-face, and all were audio recorded. All participants agreed to IRB permission statement via email.

• A semi-structured interview protocol was developed with input from HTRC to elicit responses from participants on primary goals of project.

Select findings

• Optical Character Recognition

– Steps should be taken to improve OCR quality if and when possible

– Scalability of scanned image viewing is necessary for OCR reference and correction

– Metadata should expose the quality of OCR

Select Findings

• “Would like better metadata about text languages, particularly in multi-text documents and on language by sections within text. Automatic language identification functions would be helpful, but human-created metadata is preferred, particularly for documents with low OCR quality.”

• “primary issue was retrieving the bibliographic records in usable form, unparsed by Google. […] process took 10 months to design the queries and get the data.”

HathiTrust Research Center: dedicated to provision of computational access to comprehensive body of published works for scholarship and education

• HathiTrust is a large corpus providing opportunity for new forms of computational investigation.

• The bigger the data, the less able we are to move it to a researcher’s desktop machine.

• Future research on large collections will require that computation move to the data, not vice versa.

Goal of HTRC

• HTRC will provide a persistent and sustainable structure to enable original and cutting edge research.

• Stimulate the development in community of new functionality and tools to enable new discoveries that would not be possible without the HTRC.

Goal, cont.

• Leverage data storage and computational infrastructure at Indiana U and UIUC.

• Provision secure computational and data environment for scholars to perform research using HathiTrust Digital Library.

• Center will break new ground, allowing scholars to fully utilize content of HathiTrust Library while preventing intellectual property misuse within confines of current U.S. copyright law.

02/29/12

Description of Application Space in HTRC

Prepared by Jiaan

1. The Whole Diagram
