The FIDA & MULTEXT-East language resources
Tomaž Erjavec
Department of Knowledge Technologies
Jožef Stefan Institute, Ljubljana
[email protected], http://nl.ijs.si/et/
Gralis 2006
Institut für Slawistik der Universität Graz
2006-05-09
Gralis 2006-05-09
Tomaž Erjavec, Dept. of Knowledge Technologies, Jožef Stefan Institute
Overview
1. Background
2. FIDA: a reference corpus of Slovene
3. MULTEXT-East: morphosyntactic resources for Central and East-European languages
4. Other language resources for Slovene
Language Resources
LR comprise three layers of data:
– corpora: mono- or multilingual, reference or specialised, … /variously annotated/
– lexica: vocabularies, morphosyntactic, syntactic, semantic, (ontologies)
– standards: linguistic and technical encoding
LRs, esp. corpora, are used for empirical language research:
– linguistic studies: (annotated) corpus + (sophisticated) search engine
– human language technology R&D: testing and training dataset
Part I. The FIDA corpus
Slovene reference corpus for linguistic studies
FIDA http://www.fida.net/
Joint project (1997-2000) of:
– Filozofska fakulteta: Vojko Gorjanc, Marko Stabej, Špela Vintar
– Institut Jožef Stefan: Tomaž Erjavec
– DZS: Simon Krek
– Amebis: Peter Holozan, Miro Romih
Financed by industry partners
Characteristics of FIDA
monolingual
synchronic
written language
reference
– representative
– balanced
annotated
Sizes
Total: 103,513,072 words, 29,177 texts
Avg. text length: 3,548 words
Largest texts: Leksikon DZS: 508,370 words; 69 texts > 100,000 words
Smallest texts: 2,648 texts < 100 words; 2 x <w>rezgrtshdrghgth4</w>
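The average on this slide follows directly from the two totals. A minimal sketch of the arithmetic, using only the figures quoted above; the function names are illustrative and not part of any FIDA tooling:

```python
def average_text_length(total_words: int, total_texts: int) -> float:
    """Mean number of words per text."""
    return total_words / total_texts


def share_percent(part: int, whole: int) -> float:
    """Share of `part` in `whole`, as a percentage."""
    return 100.0 * part / whole


if __name__ == "__main__":
    # Figures taken from the slide.
    words, texts, short_texts = 103_513_072, 29_177, 2_648
    print(f"average length: {average_text_length(words, texts):.0f} words")
    print(f"texts under 100 words: {share_percent(short_texts, texts):.2f}%")
```

The second figure is derived here rather than quoted from the slide: roughly one text in eleven is shorter than 100 words, which helps explain why the mean sits far below the size of the largest texts.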
Time Composition
Oldest/most recent text: 1989/2000
Average date: 1997-02
Texts/words with unknown date: 3.94%/8.28%
FIDA taxonomy: publication types
…
Ft.P.P.O (published)             95.72%
Ft.P.P.O.K (books)               22.71%
Ft.P.P.O.P (periodicals)         70.50%
Ft.P.P.O.P.C (newspaper)         46.59%
Ft.P.P.O.P.C.D (daily)           32.67%
Ft.P.P.O.P.C.T (weekly)          66.18%
Ft.P.P.O.P.C.V (multi-weekly)    17.74%
…
FIDA taxonomy: text types
Ft.Z (text type)                 99.47%
Ft.Z.N (non-fiction)             93.57%
Ft.Z.N.N (non-professional)      75.14%
Ft.Z.N.S (professional)          18.37%
Ft.Z.N.S.H (hum. & soc. sci.)    10.57%
Ft.Z.N.S.N (nat. & tech. sci.)    6.04%
Ft.Z.U (fiction)                  5.90%
Ft.Z.U.D (drama)                  0.10%
Ft.Z.U.P (poetry)                 0.17%
Ft.Z.U.R (prose)                  5.12%
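The dotted taxonomy codes form a hierarchy: each code extends its parent by one component. A small sketch of navigating that hierarchy, assuming only the text-type codes and percentages shown on the slide. The helper names are hypothetical, and the slide does not say what the percentages are measured over, so the sketch only sums them and makes no claim about how they should add up:

```python
# Text-type codes and percentages as shown on the slide.
TEXT_TYPES = {
    "Ft.Z": 99.47,
    "Ft.Z.N": 93.57,
    "Ft.Z.N.N": 75.14,
    "Ft.Z.N.S": 18.37,
    "Ft.Z.N.S.H": 10.57,
    "Ft.Z.N.S.N": 6.04,
    "Ft.Z.U": 5.90,
    "Ft.Z.U.D": 0.10,
    "Ft.Z.U.P": 0.17,
    "Ft.Z.U.R": 5.12,
}


def parent(code: str) -> str:
    """Drop the last dotted component: 'Ft.Z.N.S' -> 'Ft.Z.N'."""
    head, _, _ = code.rpartition(".")
    return head


def children(code: str, table: dict) -> list:
    """Direct children of `code` that are present in `table`."""
    return sorted(c for c in table if parent(c) == code)


def children_total(code: str, table: dict) -> float:
    """Sum of the percentages of the direct children of `code`."""
    return round(sum(table[c] for c in children(code, table)), 2)


if __name__ == "__main__":
    for code in ("Ft.Z", "Ft.Z.N", "Ft.Z.U"):
        print(code, children(code, TEXT_TYPES), children_total(code, TEXT_TYPES))
```

Comparing a category with the total of its children is a quick way to spot texts that were classified only at the coarser level.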
Markup of FIDA
corpus elements annotated with meta-data (bibliographic, taxonomy)
text linguistically annotated
encoded according to international standards and recommendations
– technical: SGML, TEI P3
– linguistic: MULTEXT-East (MULTEXT, EAGLES)
Linguistic annotation
Accessibility
Exploitation by partners:
– DZS: new dictionaries
– Amebis: development of HLT
– Arts faculty: teaching
– IJS: research on HLT
Availability to the public:
– access via concordance engine by Amebis
– free access, but displays only a few hits
– possibility of academic licences
FIDA (web site) no longer maintained!
FIDA+ http://www.fidaplus.net/
FIDA Plus project:
– Filozofska fakulteta, Fakulteta za družbene vede, Institut Jožef Stefan
– DZS, Amebis
Financed by the ministry + ind. partners
Extend the corpus with
– Web materials
– spoken component
Better linguistic markup
Free concordances: up to 100 lines
Also possibility of licences
Concordancer
Output
Extended searches
Corpus “Nova Beseda” http://bos.zrc-sazu.si/
being developed at Institute for Slovene language, ZRC SAZU (Primož Jakopin)
Web concordancer with no hit limit
now larger than FIDA but much less varied: fiction, Delo, DZ
not linguistically annotated
Part II. MULTEXT-East
multilingual morphosyntactic resources for HLT development
MULTEXT-East resources
MULTEXT-East: Copernicus Joint Project COP 106 (1995-1997), Multilingual Texts and Corpora for Eastern and Central European Languages
Based on the results of EU MULTEXT (~West)
To produce a harmonised BLARK (Basic Language Resource Kit) for six languages:
– corpus encoding standardisation (TEI / CES)
– multilingual parallel, comparable, speech corpora
– morphosyntactic specifications (EAGLES / MULTEXT)
– (inflectional) lexicon
– annotated corpus
– language processing tools
History of MULTEXT-East resources
First release 1998 on TELRI CD-ROM Vol II: already extended with new languages
Resources since 1998 available on the Web: http://nl.ijs.si/ME/
Second release 2002 in scope of EU CONCEDE: re-encoding in XML/TEI, harmonisation
Third release 2004: merge of first two releases, further languages
Work (indirectly) supported by: TELRI, CONCEDE, NSF grant, bi-lateral projects
The Languages of MULTEXT-East
Germanic: English
Romance: Romanian
Baltic:
– Latvian
– Lithuanian
Finno-Ugric:
– Estonian
– Hungarian
Slavic:
– Russian (East Slavic)
– Czech (West Slavic)
– Slovene (South West Slavic)
– Resian (Slovene dialect)
– Croatian (South West Slavic)
– Serbian (South West Slavic)
– Bulgarian (South East Slavic)
In progress: Macedonian, Persian
Version 3
Available on http://nl.ijs.si/ME/V3/
Some parts completely free, others free for research (Web licence)
Web pages give:
– extensive documentation
– bibliography list
– web licence form
– resource download
The MULTEXT morphosyntactic trinity
1. MULTEXT-East morphosyntactic specifications
2. MULTEXT-East morphosyntactic lexica
3. MULTEXT-East morphosyntactically annotated "1984" corpus
1. Morphosyntactic specifications
Based on EAGLES / MULTEXT
Define PoS, their attributes and values
The specs are a document containing:
– introduction
– common tables
– language-specific sections
Written in LaTeX, with PDF & HTML versions
Derived XML/TEI encoding as feature structures
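To make "PoS, their attributes and values" concrete: an MSD such as Ncfsn (from the Slovene lexicon example in this talk) is a positional code, in which the first character gives the part of speech and each following position gives the value of one attribute. The sketch below expands a noun MSD into an attribute-value mapping, a flat analogue of a feature structure. The table is an abridged assumption covering only the first four noun positions; it is not the full specification, and the function name is hypothetical.

```python
# Abridged, illustrative table for nouns only (an assumption for this
# sketch): one (attribute, {code: value}) pair per MSD position after "N".
NOUN_POSITIONS = [
    ("Type", {"c": "common", "p": "proper"}),
    ("Gender", {"m": "masculine", "f": "feminine", "n": "neuter"}),
    ("Number", {"s": "singular", "d": "dual", "p": "plural"}),
    ("Case", {"n": "nominative", "g": "genitive", "d": "dative",
              "a": "accusative", "l": "locative", "i": "instrumental"}),
]


def decode_noun_msd(msd: str) -> dict:
    """Expand a noun MSD into an attribute-value mapping.

    Positions beyond the four covered by NOUN_POSITIONS are ignored,
    and "-" marks an attribute that does not apply.
    """
    if not msd.startswith("N"):
        raise ValueError(f"not a noun MSD: {msd!r}")
    features = {"PoS": "Noun"}
    for (attribute, values), code in zip(NOUN_POSITIONS, msd[1:]):
        if code == "-":
            continue
        features[attribute] = values[code]
    return features


if __name__ == "__main__":
    print(decode_noun_msd("Ncfsn"))
    print(decode_noun_msd("Ncfdl"))
```

Keeping the table as data rather than code mirrors the way the specifications themselves are organised: supporting another part of speech or language means adding a table, not changing the decoder.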
Example common table
Example language-specific section
table (shows only categories actually used)
notes
combinations
lexicon
for Slovene (FIDA): localisation of category names
Morphosyntactic Complexity
2. The lexica
Medium size morphosyntactic lexica
Languages: English, Romanian, Slovene, Czech, Bulgarian, Estonian, Hungarian, Serbian.
~ all word-forms of ca. 15,000 lemmas
Lexical entry is composed of three fields:
– the word-form: the inflected form of the word
– the lemma: the base-form of the word
– the morphosyntactic description (MSD)
Example: Slovene lexicon
abeced     abeceda  Ncfdg
abeced     abeceda  Ncfpg
abeceda    =        Ncfsn
abecedah   abeceda  Ncfdl
abecedah   abeceda  Ncfpl
abecedam   abeceda  Ncfpd
abecedama  abeceda  Ncfdd
abecedama  abeceda  Ncfdi
abecedami  abeceda  Ncfpi
abecede    abeceda  Ncfpa
abecede    abeceda  Ncfpn
abecede    abeceda  Ncfsg
abecedi    abeceda  Ncfda
abecedi    abeceda  Ncfdn
…
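The three-field entries above are easy to load into a lookup table. The following is a sketch only: it assumes whitespace-separated fields and reads "=" in the lemma field as "lemma identical to the word-form", which is how the abeceda line appears to be used. The function names are hypothetical.

```python
from collections import defaultdict

# A few entries copied from the example above.
SAMPLE = """\
abeced abeceda Ncfdg
abeced abeceda Ncfpg
abeceda = Ncfsn
abecedah abeceda Ncfdl
abecedah abeceda Ncfpl
"""


def parse_lexicon(text: str) -> list:
    """Return (word_form, lemma, msd) triples, expanding '=' lemmas."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        word_form, lemma, msd = line.split()
        if lemma == "=":
            lemma = word_form  # assumed shorthand: lemma equals word-form
        entries.append((word_form, lemma, msd))
    return entries


def analyses_by_form(entries: list) -> dict:
    """Group the possible (lemma, msd) analyses under each word-form."""
    index = defaultdict(list)
    for word_form, lemma, msd in entries:
        index[word_form].append((lemma, msd))
    return dict(index)


if __name__ == "__main__":
    index = analyses_by_form(parse_lexicon(SAMPLE))
    for word_form, analyses in index.items():
        print(word_form, analyses)
```

Grouping by word-form makes the ambiguity visible: abeced alone has two analyses (dual and plural genitive), which is the kind of ambiguity a tagger has to resolve in context.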
Lexicon sizes
3. The “1984” corpus
Languages: En, Ro, Sl, Cs, Et, Hu, Sr, (Bg, Ru, (Mk, Hr, Tr,…))
Structurally annotated
Sentence aligned with English
Words annotated with lemma and MSD
Encoded in TEI P4 (XML)
Example linguistic encoding
<text id="Osl." lang="sl">
 <body>
  <div type="part" id="Osl.1">
   <div type="chapter" id="Osl.1.2">
    <p id="Osl.1.2.2">
     <s id="Osl.1.2.2.1">
      <w lemma="biti" ana="Vcps-sma">Bil</w>
      <w lemma="biti" ana="Vcip3s--n">je</w>
      <w lemma="jasen" ana="Afpmsnn">jasen</w>
      <c>,</c>
      <w lemma="mrzel" ana="Afpmsnn">mrzel</w>
      <w lemma="aprilski" ana="Aopmsn">aprilski</w>
      <w lemma="dan" ana="Ncmsn">dan</w>
      <w lemma="in" ana="Ccs">in</w>
      <w lemma="ura" ana="Ncfpn">ure</w>
      <w lemma="biti" ana="Vcip3p--n">so</w>
      <w lemma="biti" ana="Vmps-pfa">bile</w>
      <w lemma="trinajst" ana="Mcnpnl">trinajst</w>
      <c>.</c>
     </s>
     …
Sentence alignment & context disambiguated lemmas and MSDs
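Because the annotation lives in plain XML attributes, a standard XML parser is enough to read it back. The sketch below parses one sentence element with Python's built-in xml.etree.ElementTree and returns the sentence id together with (token, lemma, MSD) triples. The snippet is a shortened copy of the example, closed so that it is well-formed; the function name is hypothetical and nothing here is the project's own tooling.

```python
import xml.etree.ElementTree as ET

# Shortened, well-formed copy of the sentence from the example.
SNIPPET = """
<s id="Osl.1.2.2.1">
  <w lemma="biti" ana="Vcps-sma">Bil</w>
  <w lemma="biti" ana="Vcip3s--n">je</w>
  <w lemma="jasen" ana="Afpmsnn">jasen</w>
  <c>,</c>
  <w lemma="mrzel" ana="Afpmsnn">mrzel</w>
</s>
"""


def read_sentence(sentence_xml: str):
    """Return (sentence_id, tokens) for one <s> element.

    Each token is a (text, lemma, msd) triple; punctuation (<c>)
    carries no lemma or MSD, so those slots are None.
    """
    sentence = ET.fromstring(sentence_xml.strip())
    tokens = []
    for element in sentence:
        if element.tag == "w":
            tokens.append((element.text, element.get("lemma"), element.get("ana")))
        elif element.tag == "c":
            tokens.append((element.text, None, None))
    return sentence.get("id"), tokens


if __name__ == "__main__":
    sentence_id, tokens = read_sentence(SNIPPET)
    print(sentence_id)
    for token in tokens:
        print(token)
```

The sentence id is what an alignment with the English text would refer to, so returning it alongside the tokens keeps the two annotation layers connected.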
Quantifying the corpus
Utility of MULTEXT-East LRs
Specifications became, for some, the “national” standard
Training/testing dataset for HLT development: PoS taggers, lemmatizers, lexicon extractors, ILP
A base dataset for further annotation and experiments:
– Word-sense disambiguation
– WordNet development and evaluation
– Syntactic parser induction
Teaching aid in HLT courses
~ 100 registered users
As a BLARK “best practice” for new languages: Resian, Croatian, Macedonian, Persian
LRs @ JSI http://nl.ijs.si/nl.html#Resource
Also ours: VAYNA, GORE, sloWNet
Contributors to: FIDA, DSI, FDV, JRC-ACQUIS
Overview of Slovene LRs and services @ Slovenian Language Technologies Society http://nl.ijs.si/sdjt/
Thank you!