LT4eL - Language Technology for eLearning - Paola Monachesi, Lothar Lemnitzer, Kiril Simov, Dan Cristea, Alex Killing, Diane Evans, Cristina Vertan
Date posted: 20-Dec-2015

Page 1:

LT4eL - Language Technology for eLearning

Paola Monachesi, Lothar Lemnitzer, Kiril Simov, Dan Cristea, Alex Killing, Diane Evans, Cristina Vertan

Page 2:

LT4eL - Language Technology for eLearning

• Start date: 1 December 2005
• Duration: 30 months
• Partners: 12
• EU financing: 1.5 million Euro
• Project type: STREP IST-4

Page 3:

LT4eL - Partners

• Utrecht University, The Netherlands (coordinator)
• University of Hamburg, Germany
• University “Al.I.Cuza” of Iasi, Romania
• University of Lisbon, Portugal
• Charles University Prague, Czech Republic
• IPP, Bulgarian Academy of Sciences, Bulgaria
• University of Tübingen, Germany
• ICS, Polish Academy of Sciences, Poland
• Zürich University of Applied Sciences Winterthur, Switzerland
• University of Malta, Malta
• Eidgenössische Hochschule Zürich
• Open University, United Kingdom

Page 4:

LT4eL - Aims

• Improve retrieval of learning material
• Facilitate construction of user specific courses
• Improve creation of personalized content
• Support decentralization of content management
• Allow for multilingual retrieval of content

Page 5:

LT4eL - Languages

• Bulgarian
• Czech
• Dutch
• German
• Maltese
• Polish
• Portuguese
• Romanian
• English

Page 6:

LT4eL - Objectives -1-

• Scientific and Technological Objectives
  – Creation of an archive of learning objects and linguistic resources
  – Integration of language technology resources and tools in eLearning
  – Integration of semantic knowledge in eLearning
  – Integration of functionalities in open source LMS
  – Validation of enhanced LMS

Page 7:

LT4eL - Objectives -2-

• Political objectives
  – Support multilinguality
  – Knowledge transfer
  – Awareness raising
  – Exploitation of resources
  – Facilitate access to education

Page 8:

Tasks

• Creation of an archive of learning objects
• Semi-automatic metadata generation driven by Language Technology resources and NLP tools
• Enhancing eLearning with semantic knowledge
• Integration of the new functionalities in the ILIAS Learning Management System
• Validation of new functionalities in the ILIAS Learning Management System
• Address multilinguality

Page 9:

[Figure: LT4eL architecture diagram. Labelled components: user documents (PDF, DOC, HTML, SCORM, XML); CONVERTOR 1; Basic XML; LING. PROCESSOR (lemmatizer, POS, partial parser); Ling. Annot XML; CONVERTOR 2; Documents HTML; Documents SCORM (pseudo-struct.); Metadata (keywords); Glossary; Ontology; lexicons for CZ, PT, RO, PL, GE, MT, BG, DT and EN; CROSSLINGUAL RETRIEVAL; LMS user profile; REPOSITORY.]

Page 10:

Creation of a learning objects archive

• Collection of the learning material (uploads & updates at http://consilr.info.uaic.ro/uploads_lt4el/ - password protected)
• IST domains for the LOs:
  1. Use of computers in education, with sub-domains:
     • 1.1 Teaching academic skills, with sub-domains:
       • 1.1.1 Academic skills
       • 1.1.2 Relevant computer skills for the above tasks (MS Word, Excel, PowerPoint, LaTeX, Web pages, XML)
       • 1.1.3 Basic computer skills (use of computer for beginners) (chats, e-mail, Internet)
     • 1.2 e-Learning, e-Marketing
     • 1.3 The I*Teach document (Leonardo project, http://i-teach.fmi.uni-sofia.bg/)
     • 1.4 Impact of use of computers in society
     • 1.5 Studies about use of computers in schools / high schools
     • 1.6 Impact of e-Learning on education
  2. Calimera documents (parallel corpus developed in the Calimera FP5 project, http://www.calimera.org/)

Page 11:

Collection of learning materials and linguistic tools

• Normalization of the learning material
• Convertors from html/txt to basic XML format
• Inventory and classification of existing tools (http://consilr.info.uaic.ro/uploads_lt4el/tools/all.php?) relevant to:
  – the integration of language technology resources in eLearning
  – the integration of semantic knowledge
• Inventory and classification of existing language resources
  • corpora and frequency lists: http://consilr.info.uaic.ro/uploads_lt4el/menu/all.php
  • lexica: http://www.let.uu.nl/lt4el/wiki/index.php/Lexica_Joint_Table

Page 12:

[Figure: LT4eL architecture diagram, repeated from Page 9.]

Page 13:

Semi-automatic metadata generation with LT and NLP

Aims:
• supporting authors in the generation of metadata for LOs
• improving keyword-driven search for LOs
• supporting the development of glossaries for learning material

Page 14:

Metadata

• metadata are essential to make LOs visible for larger groups of users
• authors are reluctant or not experienced enough to supply them
• NLP tools will help them in that task
• the project uses the LOM metadata schema as a blueprint

Page 15:

Subtask 1: Identification of keywords

• Good keywords have a typical, non-random distribution in and across LOs
• Keywords tend to appear more often at certain places in texts (headings etc.)
• Keywords are often highlighted / emphasised by authors

Page 16:

Modelling Keywordiness

• Residual inverse document frequency is used to model the inter-text distribution of keywords
• Term burstiness is used to model the intra-text distribution of keywords
• Knowledge of text structure is used to identify salient regions (e.g. headings)
• Layout features of texts are used to identify emphasised words and weight them higher
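
The slides do not give formulas for the two distribution measures. The following is a minimal sketch, assuming one common formulation: residual IDF as the gap between the observed IDF and the IDF expected if occurrences were scattered by a Poisson process, and burstiness as the mean number of occurrences per document that contains the term. The function name and the toy input are illustrative, not part of the project.

```python
import math
from collections import Counter


def keywordiness_scores(docs):
    """Return {term: (residual_idf, burstiness)} for a list of tokenised documents."""
    n_docs = len(docs)
    doc_freq = Counter()   # number of documents that contain the term
    coll_freq = Counter()  # total number of occurrences in the collection
    for tokens in docs:
        counts = Counter(tokens)
        coll_freq.update(counts)
        doc_freq.update(counts.keys())

    scores = {}
    for term, df in doc_freq.items():
        cf = coll_freq[term]
        observed_idf = -math.log2(df / n_docs)
        # IDF expected under a Poisson model with rate cf / n_docs per document.
        expected_idf = -math.log2(1.0 - math.exp(-cf / n_docs))
        residual_idf = observed_idf - expected_idf
        # Mean occurrences per document that contains the term.
        burstiness = cf / df
        scores[term] = (residual_idf, burstiness)
    return scores


docs = [
    ["ontology", "ontology", "ontology", "the"],
    ["the", "course"],
    ["the", "learning"],
    ["the", "learning"],
]
scores = keywordiness_scores(docs)
```

In this toy collection a term concentrated in one document ("ontology") receives a higher residual IDF and burstiness than a term spread evenly over all documents ("the").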

Page 17:

Challenges

• Treating multi-word keywords (suffix arrays will be used to identify n-grams of arbitrary length)
• Assigning a combined weight which takes into account all the aforementioned factors
• Multilinguality
• Evaluation
  – manually assigned keywords will be used to measure precision and recall of the keyword extractor
  – human annotators will judge the results from the extractor and rate them
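
Two of the points above can be sketched briefly. This is an illustration under stated assumptions, not the project's implementation: `repeated_ngrams` builds a naive suffix array over tokens (sorting whole suffixes, which is fine for small inputs only) and reads repeated n-grams off the common prefixes of neighbouring suffixes; `precision_recall` compares extracted keywords with a manually assigned set. Both function names and the sample data are hypothetical.

```python
def repeated_ngrams(tokens, min_len=2):
    """Return token n-grams of length >= min_len that occur at least twice."""
    # Naive suffix array: suffix start positions sorted by the suffix itself.
    suffixes = sorted(range(len(tokens)), key=lambda i: tokens[i:])
    found = set()
    for a, b in zip(suffixes, suffixes[1:]):
        # Longest common prefix of two neighbouring suffixes.
        lcp = 0
        while (a + lcp < len(tokens) and b + lcp < len(tokens)
               and tokens[a + lcp] == tokens[b + lcp]):
            lcp += 1
        # Every prefix of the shared part is a repeated n-gram.
        for n in range(min_len, lcp + 1):
            found.add(tuple(tokens[a:a + n]))
    return found


def precision_recall(extracted, gold):
    """Precision and recall of extracted keywords against manual keywords."""
    extracted, gold = set(extracted), set(gold)
    hits = len(extracted & gold)
    precision = hits / len(extracted) if extracted else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall


tokens = "learning management system and another learning management system".split()
ngrams = repeated_ngrams(tokens)
p, r = precision_recall(
    {"ontology", "metadata", "the"},
    {"ontology", "metadata", "glossary", "keyword"},
)
```

A production extractor would use a linear-time suffix array construction; the sorting shortcut here only keeps the example short.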

Page 18:

Subtask 2: Identification of definitory contexts

• Empirical approach based on the linguistic annotation of LOs
• Identification of definitory contexts is language specific
• Workflow
  – Definitory contexts are searched for and marked in LOs (manually)
  – Local grammars are drafted on the basis of these examples
  – The linguistic annotation is used for these grammars
  – The grammars are applied to new LOs
  – Extraction of definitory contexts is performed by Lxtransduce (University of Edinburgh - LTG)
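
To illustrate what a local grammar over linguistic annotation can look like, here is a minimal Python stand-in. It is not Lxtransduce grammar syntax, and the one-letter tag set (D determiner, A adjective, N noun, C copula, O other) and the single "X is a Y" pattern are assumptions made for the example.

```python
import re

# One character per token, so positions in the tag string equal token indices.
NP = r"D?A*N+"  # optional determiner, any adjectives, one or more nouns
DEFINITION = re.compile(rf"(?P<term>{NP})C(?P<genus>{NP})")


def find_definitions(tagged):
    """tagged: list of (word, tag) pairs. Return (term, genus) phrase pairs."""
    tags = "".join(tag for _, tag in tagged)
    results = []
    for m in DEFINITION.finditer(tags):
        term = " ".join(w for w, _ in tagged[m.start("term"):m.end("term")])
        genus = " ".join(w for w, _ in tagged[m.start("genus"):m.end("genus")])
        results.append((term, genus))
    return results


sentence = [
    ("An", "D"), ("ontology", "N"), ("is", "C"),
    ("a", "D"), ("formal", "A"), ("specification", "N"),
    ("of", "O"), ("concepts", "N"),
]
definitions = find_definitions(sentence)
```

Because such patterns depend on word order and on the tag set of each language, a separate grammar per language would be needed, which matches the slide's point that the task is language specific.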

Page 19:

[Figure: LT4eL architecture diagram, repeated from Page 9.]

Page 20:

Ontology-based cross-lingual retrieval

• Metadata can also be represented by ontologies
• Creation of a domain ontology in the domain of LOs
• For consistency, we also employ an upper ontology (DOLCE)
• Lexical material in all 9 languages is mapped onto the ontology and onto the upper ontology
• The ontology will allow for multilingual retrieval of LOs
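
The retrieval idea can be sketched as follows, assuming that each lexicon maps (language, lemma) pairs to concept identifiers and that learning objects are annotated with concepts. All identifiers, lexicon entries and learning-object names below are hypothetical.

```python
# Hypothetical lexicon: (language, lemma) -> concept identifier in the ontology.
LEXICON = {
    ("en", "computer"): "Computer",
    ("de", "Rechner"): "Computer",
    ("nl", "computer"): "Computer",
    ("pt", "computador"): "Computer",
    ("en", "browser"): "WebBrowser",
    ("de", "Browser"): "WebBrowser",
}

# Hypothetical annotation of learning objects with ontology concepts.
LO_CONCEPTS = {
    "lo-en-01": {"Computer", "WebBrowser"},
    "lo-de-07": {"Computer"},
    "lo-pt-03": {"WebBrowser"},
}


def retrieve(language, query_term):
    """Return learning objects in any language that share the query's concept."""
    concept = LEXICON.get((language, query_term))
    if concept is None:
        return []  # the term is not lexicalised in the ontology
    return sorted(lo for lo, concepts in LO_CONCEPTS.items() if concept in concepts)


results = retrieve("de", "Rechner")
```

A German query term thus reaches English learning objects because both are linked to the same concept, not because the strings match.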

Page 21:

Domain Ontology creation

lexicon (vocabulary with natural language definitions);
simple taxonomy;
thesaurus (taxonomy plus related-terms);
relational model (unconstrained use of arbitrary relations);
fully axiomatized theory.

Page 22:

Domain Ontology

• terminological dictionary in the chosen domain
  - term in English
  - a short definition in English
  - translation of the term
• formalize the definitions to reflect relations like is-a, part-of, used-for
• definitions translated into OWL-DL
• the result is not a fully axiomatized theory, but a relational model of the domain
• connection to the upper ontology will enforce the inheritance of the axiomatization of the upper ontology by the concepts in the domain ontology
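
A relational model of this kind can be pictured as a set of typed links between concepts. The sketch below assumes a plain list of triples and shows how following is-a links upward connects a domain concept to whatever upper-ontology class it is attached to. The concept names are hypothetical, and `UpperOntologyClass` is a placeholder, not an actual DOLCE category.

```python
# Hypothetical relational model: (subject, relation, object) triples.
RELATIONS = [
    ("WebBrowser", "is-a", "Program"),
    ("Program", "is-a", "Software"),
    ("Software", "is-a", "UpperOntologyClass"),  # placeholder upper-ontology link
    ("Keyboard", "part-of", "Computer"),
    ("WebBrowser", "used-for", "Browsing"),
]


def ancestors(concept, relations=RELATIONS):
    """Return all concepts reachable from `concept` by following is-a links."""
    parents = {}
    for subj, rel, obj in relations:
        if rel == "is-a":
            parents.setdefault(subj, set()).add(obj)
    seen, stack = set(), [concept]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


browser_ancestors = ancestors("WebBrowser")
```

Anything stated about a class in the upper ontology then applies to every domain concept that has it among its ancestors.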

Page 23:

Upper Ontology: DOLCE

• the ontology should be constructed on a rigorous basis;
• it should be easy to represent in an ontology language such as RDF or OWL;
• there are domain ontologies constructed with respect to it;
• it can be related to lexicons, either by definition or by an already existing mapping to some lexical resource

Page 24:

Mapping multilingual resources on the domain ontology -1-

• Trivial for words having an exact correspondent in the ontology
• Problems appear when:
  1. One word in a language subsumes two or more concepts in the ontology
  2. One word in a language subsumes two or more concepts in the ontology, but only in relation to some other concepts
  3. One word has a much more restrictive meaning not present in the ontology

Page 25:

Mapping multilingual resources on the domain ontology -2-

• Solution to 1:
  – Express the lexical items as OWL-DL expressions: disjunctions and conjunctions of classes (give example)
• Solution to 2:
  – Express the lexical items in OWL-DL, using relations between the involved concepts together with operations on classes
• Solution to 3:
  – Insert a new concept in the ontology
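
The slides leave the example open. As a hypothetical illustration of solutions 1 and 2 (not the authors' own example), class expressions can be held as nested tuples and printed in a Manchester-like notation using the keywords `and`, `or` and `some`. The class and property names are invented for the sketch.

```python
def render(expr):
    """Render a nested class expression in a Manchester-like notation."""
    if isinstance(expr, str):
        return expr  # a named class
    op = expr[0]
    if op in ("and", "or"):
        return "(" + f" {op} ".join(render(arg) for arg in expr[1:]) + ")"
    if op == "some":  # ("some", property, filler): existential restriction
        return f"({expr[1]} some {render(expr[2])})"
    raise ValueError(f"unknown operator: {op}")


# Solution 1: a word covering two ontology concepts maps to their disjunction.
case_1 = render(("or", "Laptop", "DesktopComputer"))

# Solution 2: the mapping holds only in relation to another concept.
case_2 = render(("and", "Program", ("some", "usedFor", "Browsing")))
```

Here `case_1` is `(Laptop or DesktopComputer)` and `case_2` is `(Program and (usedFor some Browsing))`.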

Page 26:

Ontology enrichment

• If a word cannot be mapped directly onto the ontology, check whether a similar meaning can be found in some other languages.
• If this does not seem to be an isolated case, insert the new concept in the ontology.
• In any case, assign to each concept a label indicating the languages in which this concept is lexicalised.
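
The enrichment steps can be sketched as two small helpers. The slides do not say when a case stops being "isolated", so the threshold `min_languages=2` is an assumption, as are the function names, the glosses and the sample entries.

```python
def lexicalised_in(concept, lexicon):
    """Languages that have at least one word mapped to `concept`."""
    return sorted({lang for (lang, _word), c in lexicon.items() if c == concept})


def should_add_concept(unmapped, meaning, min_languages=2):
    """unmapped: {(language, word): gloss} for words without an ontology match.

    Return True if `meaning` is attested in at least `min_languages` languages
    (assumed threshold for "not an isolated case").
    """
    langs = {lang for (lang, _word), gloss in unmapped.items() if gloss == meaning}
    return len(langs) >= min_languages


# Hypothetical data.
lexicon = {("en", "computer"): "Computer", ("de", "Rechner"): "Computer"}
unmapped = {
    ("bg", "word-a"): "portable storage device",
    ("pl", "word-b"): "portable storage device",
    ("mt", "word-c"): "some other meaning",
}
label = lexicalised_in("Computer", lexicon)
add_it = should_add_concept(unmapped, "portable storage device")
```

In practice, judging that two words share a "similar meaning" would need human input or a similarity measure; exact gloss equality is used here only to keep the sketch short.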

Page 27:

[Figure: LT4eL architecture diagram, repeated from Page 9.]

Page 28:

Integration in ILIAS

• Integration of LT4eL functionalities for semi-automated metadata generation, definitory context extraction and ontology supported extended data retrieval into a learning management system (prototype based on ILIAS LMS)

• Developing and providing documentation for a standard-technology-based interface between the language technology tools and learning management systems

Page 29:

Integration of functionalities

[Figure: integration architecture diagram. Labelled components: ILIAS server; Java web server (Tomcat); application logic; user interface; KW/DC/Onto Java classes and data; web services (Axis, nuSOAP); servlets/JSP; development server (CVS) with KW/DC code, ontology code and data; ILIAS content portal; LOs; migration tool; third-party tools. Labelled connections: use functionalities through SOAP; evaluate functionalities directly; evaluate functionalities in ILIAS; nightly updates.]

Page 30:

Validation of enhanced LMS

• The challenge is to answer these questions:
  – How does this compare with what can already be done with existing systems?
  – What added value is there?
  – What is the educational / pedagogic value of these functionalities?
• The problem is to evaluate the functionality separately from issues of usability or unfamiliarity with the LMS platform. How can we expect users to identify any benefit?

Page 31:

How can we expect users to identify any benefit?

• Present them with tasks to complete using the LMS
  • With no project functionality
  • With project functionality
    – Partial
    – Full
• Identify potential users
  – Course Creators
  – Content Authors or Providers
  – Teachers
  – Students
    • studying in their own language
    • studying in a second language

Page 32:

Create outline User Scenarios

• We define scenarios, in this context, as
  – a story focused on a user or group of users which provides information on
    • the nature of the users,
    • the goals they wish to achieve and
    • the context in which the activities will take place.
  – They are written in ordinary language, and are therefore understandable to various stakeholders, including users.
  – They may also contain different degrees of detail.

Page 33:

Project Plan

• Preparatory work in place (May 2006)
• Development of functionalities complete (November 2006)
• Integration of functionalities in LMS complete (May 2007)
• First cycle of integration of functionalities in LMS and their validation complete (November 2007)
• Second cycle of integration of functionalities in LMS and their validation complete (May 2008)

Page 34:

Contact

• www.lt4el.eu
• Contact for information: [email protected]