Populating the infrastructure: the case of the Netherlands
![Page 1: Populating the infrastructure the case of the Netherlands](https://reader036.fdocuments.in/reader036/viewer/2022062423/5681458c550346895db2742d/html5/thumbnails/1.jpg)
Populating the infrastructure: the case of the Netherlands
Hans Bennis, executive board of CLARIN-NL
Meertens Institute (KNAW)
CLARIN COORDINATORS
BUDAPEST, June 29-30
1
![Page 2: Populating the infrastructure the case of the Netherlands](https://reader036.fdocuments.in/reader036/viewer/2022062423/5681458c550346895db2742d/html5/thumbnails/2.jpg)
the start in 2009
• 9 million euro for CLARIN-NL for the period 2009-2015 (requested amount: 25 million euro)
• concentration on text (language data for humanities research)
• audio and video are left out, in contrast to the original proposal
• social sciences are not included, in contrast to the original proposal
• organizational structure: director, executive board, board, advisory panels (national and international)
• a substantial part of the money will be spent in programmatic form through Calls
• important goal / ambition: create broad support for CLARIN in humanities research in the Netherlands
2
![Page 3: Populating the infrastructure the case of the Netherlands](https://reader036.fdocuments.in/reader036/viewer/2022062423/5681458c550346895db2742d/html5/thumbnails/3.jpg)
Projects 2009
• technical projects (centers, metadata, web services, workflow, etc.)
• centers: Max Planck Institute for Psycholinguistics (MPI, Nijmegen), Meertens Institute (Amsterdam), DANS (Den Haag) and Institute for Dutch Lexicology (INL, Leiden)
• user survey
• Call-1 (Demonstrator Projects or Resource Curation projects)
• 12 projects (approx. € 60,000 each)
  – demonstrator projects
  – data curation projects
3
![Page 4: Populating the infrastructure the case of the Netherlands](https://reader036.fdocuments.in/reader036/viewer/2022062423/5681458c550346895db2742d/html5/thumbnails/4.jpg)
Call-1 Projects
1) AAM-LR [UNijmegen/MPI] – Automatic annotation of language resources
2) Adelheid [UNijmegen/MPI] – Lemmatizer for Historical Dutch
3) Adept [UGroningen/Meertens] – Dialect Analysis
4) Duelme-LMF [UUtrecht/INL] – Multi-word expressions
5) INTER-VIEWS [UNijmegen/DANS] – Life-history interviews with veterans
6) MIMORE [UUtrecht/Meertens] – Dialect morphosyntax
7) SignLinC [UNijmegen/MPI] – Sign Language
4
![Page 5: Populating the infrastructure the case of the Netherlands](https://reader036.fdocuments.in/reader036/viewer/2022062423/5681458c550346895db2742d/html5/thumbnails/5.jpg)
Call-1 (more)
8) TDS Curator [UUtrecht/DANS] – Typological Database
9) TICCLops [UTilburg/INL] – Text Clean-up
10) TQE [UNijmegen/MPI] – Transcription evaluation
11) WFT-GTB [Fryske Akademy/INL] – Integration of Dutch and Frisian dictionaries
12) CKCC [UUtrecht, Huygens Institute, DANS] – Correspondence of scholars in the 17th century
5
![Page 6: Populating the infrastructure the case of the Netherlands](https://reader036.fdocuments.in/reader036/viewer/2022062423/5681458c550346895db2742d/html5/thumbnails/6.jpg)
Demonstration of the Microcomparative Morphosyntactic Research Tool
MIMORE
Sjef Barbiers, Matthijs Brouwer, Jan Pieter Kunst, Folkert de Vriend
Meertens Instituut, 2011
6
![Page 7: Populating the infrastructure the case of the Netherlands](https://reader036.fdocuments.in/reader036/viewer/2022062423/5681458c550346895db2742d/html5/thumbnails/7.jpg)
Opening screen MIMORE
7
![Page 8: Populating the infrastructure the case of the Netherlands](https://reader036.fdocuments.in/reader036/viewer/2022062423/5681458c550346895db2742d/html5/thumbnails/8.jpg)
Research question
• The Standard Dutch [non-neuter] relative pronoun and distal demonstrative has the form ‘die’ (that, those).
• We know that there are dialects that have ‘dien’ as a relative pronoun and/or as a distal demonstrative.
• We would like to know if there is a correlation between ‘dien’ as a relative pronoun, ‘dien’ as a demonstrative preceding a noun, and ‘dien’ as a demonstrative in elliptical constructions.
• The linguistic question behind this search is what the ‘-n’ on ‘die’ is: case, phonologically determined, etc.?
8
![Page 9: Populating the infrastructure the case of the Netherlands](https://reader036.fdocuments.in/reader036/viewer/2022062423/5681458c550346895db2742d/html5/thumbnails/9.jpg)
Optional restrictions on the search
9
![Page 10: Populating the infrastructure the case of the Netherlands](https://reader036.fdocuments.in/reader036/viewer/2022062423/5681458c550346895db2742d/html5/thumbnails/10.jpg)
Search 1: DynaSAND with text string and tag constructor: ‘dien’ as relative pronoun
10
![Page 11: Populating the infrastructure the case of the Netherlands](https://reader036.fdocuments.in/reader036/viewer/2022062423/5681458c550346895db2742d/html5/thumbnails/11.jpg)
Elements of search result
11
![Page 12: Populating the infrastructure the case of the Netherlands](https://reader036.fdocuments.in/reader036/viewer/2022062423/5681458c550346895db2742d/html5/thumbnails/12.jpg)
Specification of data resource
12
![Page 13: Populating the infrastructure the case of the Netherlands](https://reader036.fdocuments.in/reader036/viewer/2022062423/5681458c550346895db2742d/html5/thumbnails/13.jpg)
Corresponding sound fragment
13
![Page 14: Populating the infrastructure the case of the Netherlands](https://reader036.fdocuments.in/reader036/viewer/2022062423/5681458c550346895db2742d/html5/thumbnails/14.jpg)
Search 2: GTRP with demonstrative + N in test item
14
![Page 15: Populating the infrastructure the case of the Netherlands](https://reader036.fdocuments.in/reader036/viewer/2022062423/5681458c550346895db2742d/html5/thumbnails/15.jpg)
Elements of search result
15
![Page 16: Populating the infrastructure the case of the Netherlands](https://reader036.fdocuments.in/reader036/viewer/2022062423/5681458c550346895db2742d/html5/thumbnails/16.jpg)
Result of search 3: demonstrative ‘dien’ in elliptical nominal groups in DIDDD
16
![Page 17: Populating the infrastructure the case of the Netherlands](https://reader036.fdocuments.in/reader036/viewer/2022062423/5681458c550346895db2742d/html5/thumbnails/17.jpg)
Available operations on search results
17
![Page 18: Populating the infrastructure the case of the Netherlands](https://reader036.fdocuments.in/reader036/viewer/2022062423/5681458c550346895db2742d/html5/thumbnails/18.jpg)
Map combining three search results
18
![Page 19: Populating the infrastructure the case of the Netherlands](https://reader036.fdocuments.in/reader036/viewer/2022062423/5681458c550346895db2742d/html5/thumbnails/19.jpg)
Map combining two search results
19
![Page 20: Populating the infrastructure the case of the Netherlands](https://reader036.fdocuments.in/reader036/viewer/2022062423/5681458c550346895db2742d/html5/thumbnails/20.jpg)
Frequency maps
20
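The slide does not say how the frequencies are computed. As a rough sketch, assuming a frequency map plots the number of hits per location, the counts could be derived as below. The place names, sentence ids and the `hits_per_location` helper are made up for illustration; they are not MIMORE data or part of its interface.

```python
from collections import Counter

# Hypothetical search hits: one (location, sentence id) pair per hit.
hits = [
    ("Place A", "s01"),
    ("Place A", "s07"),
    ("Place B", "s01"),
    ("Place C", "s03"),
    ("Place A", "s12"),
]

def hits_per_location(hits):
    """Count how many hits each location contributes."""
    return Counter(location for location, _ in hits)

frequencies = hits_per_location(hits)
for location, count in frequencies.most_common():
    print(f"{location}: {count}")
```

Each count could then be mapped to a symbol size or colour at the location's coordinates.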
![Page 21: Populating the infrastructure the case of the Netherlands](https://reader036.fdocuments.in/reader036/viewer/2022062423/5681458c550346895db2742d/html5/thumbnails/21.jpg)
Creating the intersection of two sets of search results
21
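Conceptually, intersecting two sets of search results keeps only the locations that occur in both. A minimal sketch of that idea, assuming each result set can be reduced to a set of location names (the names and the `intersect_results` helper are hypothetical, not taken from MIMORE):

```python
# Hypothetical result sets: locations returned by two separate searches.
relative_dien = {"Place A", "Place B", "Place D"}
demonstrative_dien = {"Place B", "Place C", "Place D"}

def intersect_results(first, second):
    """Return the locations that occur in both result sets, sorted by name."""
    return sorted(first & second)

both = intersect_results(relative_dien, demonstrative_dien)
print(both)  # ['Place B', 'Place D']
```

For the research question above, a large intersection relative to the individual result sets would suggest that the two uses of ‘dien’ tend to occur in the same dialects.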
![Page 22: Populating the infrastructure the case of the Netherlands](https://reader036.fdocuments.in/reader036/viewer/2022062423/5681458c550346895db2742d/html5/thumbnails/22.jpg)
Export as Excel-file
22
![Page 23: Populating the infrastructure the case of the Netherlands](https://reader036.fdocuments.in/reader036/viewer/2022062423/5681458c550346895db2742d/html5/thumbnails/23.jpg)
Data exported
23
![Page 24: Populating the infrastructure the case of the Netherlands](https://reader036.fdocuments.in/reader036/viewer/2022062423/5681458c550346895db2742d/html5/thumbnails/24.jpg)
Complex search: more than one database, string of tags
24
![Page 25: Populating the infrastructure the case of the Netherlands](https://reader036.fdocuments.in/reader036/viewer/2022062423/5681458c550346895db2742d/html5/thumbnails/25.jpg)
CALL-2 (2011)
1) Arthurian Fiction [UUtrecht] – Curation of two databases for literary research
2) C-DSD [UUtrecht/Meertens] – Curation of Folksong Database
3) COAVA [Meertens] – Bringing together five linguistic databases (language variation/acquisition)
4) INPOLDER [UNijmegen/Meertens] – Syntactic analysis of historical Dutch
5) IPROSLA [UNijmegen/UAmsterdam/MPI] – Sign language databases
25
![Page 26: Populating the infrastructure the case of the Netherlands](https://reader036.fdocuments.in/reader036/viewer/2022062423/5681458c550346895db2742d/html5/thumbnails/26.jpg)
CALL-2 (more)
6) NEHOL [UNijmegen] – Curation of Negerhollands database
7) VU-DNC [VU-Amsterdam] – corpus of Dutch newspapers
8) WAHSP [UUtrecht] – Text mining in large historical databases
9) WIP [NIOD] – Data curation of Dutch Second World War database
26
![Page 27: Populating the infrastructure the case of the Netherlands](https://reader036.fdocuments.in/reader036/viewer/2022062423/5681458c550346895db2742d/html5/thumbnails/27.jpg)
developments
• collaboration with the CATCH programme (a programme to finance projects for teams of ICT developers, humanities scholars and cultural heritage institutions)
  – CLAVAS – vocabularies
  – Persistent Identifiers
• Data Curation Service (>2011)
• Call 3 (call open now; projects in 2012)
• Agreement with the Dutch Science Foundation (NWO) and the Royal Netherlands Academy of Arts and Sciences (KNAW) with respect to a CLARIN norm for databases/tools in the humanities
• CLARIN-NL + DARIAH-NL => CLARIAH – Dutch Roadmap
27