Replicating Linguistic Resources
B2SAFE: MPI-TLA CLARIN Center
Willem Elbers (MPI-TLA)
2nd EUDAT Conference
Date: 29 October 2013
The Language Archive
• Data on languages:
– about 60 terabytes of well-described resources
– about 20,000 hours of digitized audio/video recordings
– about 73,000 metadata-described sessions
– about 4.5 million annotated segments
– data on more than 200 languages
– among these, data from about 60 DOBES teams
– acquisition, speech, multimodal, multilingual, language and cognition, brain imaging, ethnological and other data
• Mission:
– Maintaining access to all stored resources for the current generation of researchers, language communities and the interested public.
– Preserving the valuable cultural heritage for current and future generations.
B2SAFE
• Goals
– Replication of data
• B2SAFE!
– Replication of services
• RZG providing Language Archive Technology (LAT) services at the replica side
• B2SAFE Community extensions:
– Replication based on the logical structure defined in the IMDI/CMDI metadata
– Integrated with underlying SAM-FS
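The idea of replicating along the logical structure of the metadata, rather than along directory layout, can be illustrated with a minimal sketch. It is not the B2SAFE or LAT implementation: `MetadataNode` and `collect_replication_set` are hypothetical names, and the sketch assumes the IMDI/CMDI hierarchy has already been parsed into a tree in which each record lists the resources it describes and its child records.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class MetadataNode:
    """Hypothetical stand-in for one parsed IMDI/CMDI metadata record."""
    path: str                                                  # the metadata file itself
    resources: list[str] = field(default_factory=list)         # files the record describes
    children: list[MetadataNode] = field(default_factory=list)  # nested records


def collect_replication_set(root: MetadataNode) -> list[str]:
    """Return each metadata file and resource reachable from root, once, depth-first."""
    ordered: list[str] = []
    seen: set[str] = set()
    stack = [root]
    while stack:
        node = stack.pop()
        for item in [node.path, *node.resources]:
            if item not in seen:  # a resource may be linked from several records
                seen.add(item)
                ordered.append(item)
        # Push children reversed so they are visited in their listed order.
        stack.extend(reversed(node.children))
    return ordered


if __name__ == "__main__":
    session = MetadataNode(
        "corpus/session1.cmdi",
        resources=["corpus/session1.wav", "corpus/session1.eaf"],
    )
    corpus = MetadataNode("corpus/corpus.cmdi", children=[session])
    for item in collect_replication_set(corpus):
        print(item)
```

Walking the metadata tree means a replica receives a self-consistent sub-collection (records plus the resources they point to), whatever the on-disk layout happens to be.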
[Figure: diagram with two labels, each reading "Approx: 3TB, #objects"]
Summary
“Cultural Heritage Data replicated for the future”
• Data replication running in production
• LAT Software stack running @ RZG (beta)
• Replication of authorization records running (beta)