The Sketch Engine as Infrastructure for Large Scale Text Collections for Humanities Research Adam...
The Sketch Engine as Infrastructure for Large Scale Text Collections for Humanities Research
Adam Kilgarriff
Lexical Computing Ltd. & Univ of Leeds, UK
Requirements
Fast
Stable and reliable
Handle collections of any size, even billions of words
Support complex markup
Wide range of query types and reports
Live on the web, with access management
Requirements
One infrastructure, many resources
Ten-year-plus timescale, with long-term:
Support and maintenance
Ongoing development
Engagement with resource development
University research projects not designed that way
Commercial: advantages
Everything or just text
or
You can’t please all the people all the time
Everything or just text
Everything: vast; indexing: how? what search terms?; "solve the world"
Just text: small; indexing easy; "divide and rule"
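To illustrate why indexing is easy in the text-only case, here is a minimal sketch of a positional word index with a keyword-in-context lookup. It is not the Sketch Engine's own indexing. The function names (`tokenize`, `build_index`, `concordance`), the naive tokenizer and the sample sentence are all assumptions made for the example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens (deliberately naive)."""
    return re.findall(r"\w+", text.lower())


def build_index(tokens):
    """Map each word to the list of token positions where it occurs."""
    index = defaultdict(list)
    for position, word in enumerate(tokens):
        index[word].append(position)
    return index


def concordance(tokens, index, word, width=3):
    """Return keyword-in-context lines for every occurrence of `word`."""
    lines = []
    for pos in index.get(word.lower(), []):
        left = " ".join(tokens[max(0, pos - width):pos])
        right = " ".join(tokens[pos + 1:pos + 1 + width])
        lines.append(f"{left} [{tokens[pos]}] {right}".strip())
    return lines


# Hypothetical sample text, used only for illustration.
sample = "The cat sat on the mat. The dog chased the cat."
tokens = tokenize(sample)
index = build_index(tokens)

for line in concordance(tokens, index, "cat", width=2):
    print(line)
```

With plain text the only search terms are words and their positions, so one pass over the tokens is enough to build the index. A collection of "everything" (images, audio, objects) has no such obvious unit to index.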
Sketch Engine
Text only
Meets all criteria
Ten years
Users
Dictionary-making: Oxford Univ Press, Cambridge Univ Press, Collins, Macmillan, le Robert, Cornelsen
INL and eight other national research institutes
Universities: research, teaching, language teaching
Linguistics
Text database = corpus (pl: corpora)
Languages
Around sixty
Main world languages: “tenten” corpora, on the order of 10 billion words
Web scale
Where now
Core technology: in place
Front end for linguists: in place
Front end for other humanities scholars: good prospect
Links to other resources
Preliminary work with British Library
Proposals welcome