Giuseppe Riccardi, Marco Ronchetti

Page 1: Giuseppe Riccardi, Marco Ronchetti

Giuseppe Riccardi, Marco Ronchetti
University of Trento

Page 2: Giuseppe Riccardi, Marco Ronchetti

Outline

• Searching Information
• Next Generation Search Interfaces
• Needle
• E-learning Application
• Multimedia Docs Indexing, Search and Presentation
• Demo
• Conclusion

Page 3: Giuseppe Riccardi, Marco Ronchetti

Searching Information (Past)

• Query model
  - State-of-the-art -> bag of words
  - Loyal Users
• Indexing
  - Web pages, pdf, doc...
  - State-of-the-art -> index data structures
• Evaluation
  - $$$
  - Increasing the quality of the retrieved docs through
    - Ranking
    - Crawling
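The bag-of-words query model and the index data structures named above can be illustrated with a minimal sketch. It assumes whitespace tokenization and a conjunctive (AND) match; the toy documents and the names `tokenize`, `build_index` and `search` are hypothetical and not taken from any particular engine.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase and split on whitespace; word order is discarded (bag of words)."""
    return text.lower().split()


def build_index(docs):
    """Map each term to the set of document ids that contain it (inverted index)."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the sorted ids of documents that contain every query term."""
    terms = tokenize(query)
    if not terms:
        return []
    postings = [index.get(term, set()) for term in terms]
    return sorted(set.intersection(*postings))


# Hypothetical toy collection, for illustration only.
docs = {
    1: "where is Mississippi",
    2: "Mississippi river map",
    3: "where to find a map",
}
index = build_index(docs)
print(search(index, "Mississippi where"))
```

Because the query is treated as an unordered set of terms, "Mississippi where" and "where Mississippi" retrieve the same documents.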

Page 4: Giuseppe Riccardi, Marco Ronchetti

“Where is Mississippi” (11/30/2006)

Page 5: Giuseppe Riccardi, Marco Ronchetti

Outline

“Microsoft Reorg a bulwark against Google” (11/30/2006)

Page 6: Giuseppe Riccardi, Marco Ronchetti

Searching Information (Present)

• Meta-Engines
  - Docs are retrieved by de-facto standard search engines
• Query-Answer pairs extraction (e.g. ask.com)
• Docs are
  - Clustered (many-to-one) (e.g. vivisimo.com)
  - Visualized via multiple views (many-to-many)
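A meta-engine has to combine the ranked lists returned by the underlying search engines. The slide does not say how this is done; the sketch below uses reciprocal rank fusion as one common choice, which is an assumption here. The engine result lists and document names are hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Merge ranked lists: each document scores the sum of 1 / (k + rank)."""
    scores = defaultdict(float)
    for results in ranked_lists:
        for rank, doc in enumerate(results, start=1):
            scores[doc] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by name for determinism.
    return sorted(scores, key=lambda doc: (-scores[doc], doc))


# Hypothetical result lists from two underlying engines.
engine_a = ["doc1", "doc2", "doc3"]
engine_b = ["doc2", "doc4", "doc1"]
print(reciprocal_rank_fusion([engine_a, engine_b]))
```

Documents returned by several engines accumulate score, so they tend to rise above documents that only one engine returned.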

Page 7: Giuseppe Riccardi, Marco Ronchetti

Outline

“Microsoft Reorg a bulwark against Google” (11/30/2006)

Page 8: Giuseppe Riccardi, Marco Ronchetti

Companies

“Microsoft Reorg a bulwark against Google”

Page 9: Giuseppe Riccardi, Marco Ronchetti

Topics

“Microsoft Reorg a bulwark against Google”

Page 10: Giuseppe Riccardi, Marco Ronchetti

Stories

“Microsoft Reorg a bulwark against Google”

Page 11: Giuseppe Riccardi, Marco Ronchetti

Read/Center Story

“Microsoft Reorg a bulwark against Google”

Page 12: Giuseppe Riccardi, Marco Ronchetti

“Microsoft Reorg a bulwark against Google”

Page 13: Giuseppe Riccardi, Marco Ronchetti

What is next?

• Typical tasks of web users
  - Transactional
  - Navigational
  - Informational
• Task-Driven search
  - Information search is part of a given task
  - Business Intelligence (e.g. Decision Making)
  - E-Learning (e.g. Student E-Tutoring)
• Vertical Search Engines

Page 14: Giuseppe Riccardi, Marco Ronchetti

Next Generation Information Search Interfaces

• Search Multimedia Documents
  - Indexing, Ranking
  - Large Scale
  - Real-time
• Query
  - Multimodal Input (text, gesture, speech)
• Vertical Engine
  - Limited Domain (e.g. business, education)
  - Structured & Annotated Content (not free!)
  - Certification of the results
    - “What is the success rate of medication X?”
• Results
  - Multimedia Presentation
  - Bandwidth

Page 15: Giuseppe Riccardi, Marco Ronchetti

Needle Research Program

• Search Multimedia Content
  - Audio, Video, Metadata streams
• Indexing
  - Video
  - Automatic Speech Recognition (Unlimited-CSR)
  - Semantic Segmentation
  - Topic segmentation
  - Domain Ontology
• Input
  - Natural Language Query (Spoken or Text or Multimodal)
• Presentation
  - Multimedia Presentation
  - Usability

Page 16: Giuseppe Riccardi, Marco Ronchetti

Needle: E-Learning Domain

A large number of different kinds of educational resources:

• Video lectures
• Books
• Slides
• Interactive whiteboard streams

Goal: Information Search Interfaces for e-Learning

Page 17: Giuseppe Riccardi, Marco Ronchetti

Where is the content?
Domain: Education

• MSRI
  - Math-CS research/advanced topics (skewed)
  - Video/Audio lectures
  - Presentation vgs from video close shots
• MIT
  - Courseware (syllabus, lecture notes)
  - Video/Audio lectures
  - Wide range of topics
• University of Trento
  - Video/Audio lectures
  - Synchronized PowerPoint presentation-video-audio
  - Skewed topics (CS & other)

Page 18: Giuseppe Riccardi, Marco Ronchetti

System Components

• LODE
  - Content Creation
  - Video lecture acquisition and synchronization with the learning materials, and their reproduction in a web browser.
• Needle: Interface for searching through the multimedia content and generating the multimedia documents.

Page 19: Giuseppe Riccardi, Marco Ronchetti

LODE

LODE is software for low-cost acquisition of lectures, with no special requirements for the end user.

Streaming or Download
Off-line (DVD)

• Good quality audio and video +
• Images of the slides projected in class
• Tools for navigating the lecture (by section title, by other indexes or through a time slider)
• Annotating video lectures with documents
• One DVD for a 50-hour class (MP4)
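To give a feel for what fitting a 50-hour class on one DVD implies, the following back-of-the-envelope sketch estimates the average combined audio and video bitrate. The 4.7 GB single-layer capacity is an assumption; the slide does not state which DVD format is used.

```python
# Assumption: a single-layer DVD holding 4.7 GB (4.7e9 bytes).
DVD_BYTES = 4.7e9
CLASS_HOURS = 50  # from the slide: one DVD for a 50-hour class

seconds = CLASS_HOURS * 3600
# Average bitrate available for audio, video and slide images together.
bitrate_kbps = DVD_BYTES * 8 / seconds / 1000
print(f"{bitrate_kbps:.0f} kbit/s")
```

Under that assumption the whole lecture stream has to average roughly 200 kbit/s, which is consistent with the use of a compressed format such as MP4.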

Page 20: Giuseppe Riccardi, Marco Ronchetti

System Architecture

[Architecture diagram; labeled components: Transcripts, Slides, Interactive Whiteboard, Forums, Meeting Recordings, Audio, Video]

Page 21: Giuseppe Riccardi, Marco Ronchetti

DB Structure

5 main entities: Actor (the teacher), Event (a lecture), Series (a course), View (part of a document) and Document (an MS PowerPoint presentation).
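The five entities can be sketched as plain data classes. Only the entity names and their glosses come from the slide; the fields and the relationships between entities (a Series holds Events, an Event references Actors and Views, a View belongs to a Document) are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Actor:
    name: str  # e.g. the teacher


@dataclass
class Document:
    title: str  # e.g. an MS PowerPoint presentation


@dataclass
class View:
    document: Document
    label: str  # part of a document; one slide is an assumed granularity


@dataclass
class Event:
    title: str  # a lecture
    actors: List[Actor] = field(default_factory=list)
    views: List[View] = field(default_factory=list)


@dataclass
class Series:
    title: str  # a course
    events: List[Event] = field(default_factory=list)


# Hypothetical records, for illustration only.
teacher = Actor("Hypothetical Teacher")
deck = Document("Lecture 1 slides")
lecture = Event("Lecture 1", actors=[teacher], views=[View(deck, "slide 1")])
course = Series("Programmazione 2", events=[lecture])
print(len(course.events), course.events[0].views[0].document.title)
```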

Page 22: Giuseppe Riccardi, Marco Ronchetti

Multimedia Database (2003-Present)

Lecture Topics [pie chart]: Computer Science 86%, Meteorology 8%, Sociology 6%

Languages [pie chart]: English 24%, Italian 76%

Page 23: Giuseppe Riccardi, Marco Ronchetti

Database Statistics

Course [year]                             Italian           English
                                          hours  speakers   hours  speakers
Sociologia del Turismo [2003]                20         1
Programmazione 2 [2003]                      40         1
Lab. Programmazione 2 [2003]                 40         1
Lab. Sistemi Operativi [2004]                15         1
Lab. di Algoritmi e Strutture Dati [2004]    15         1
Ingegneria del Software [2004]               40         1
Architettura degli Elaboratori [2004]        40         1
Science Faculty Seminars [2005]               3         3
Programmazione 2 [2006]                      40         1
Corso Meteorologia [2005]                    24         8       6         2
Machine Learning [2006]                                        40         1
Distributed Systems - Design [2005]                            40         1
SSSW05 [2005]                                                            10
SSSW06 [2006]                                                            10
WeeNet Summer School [2006]                                              10
HTL06
Total                                       277        19      86        34

416

Page 24: Giuseppe Riccardi, Marco Ronchetti

Utterance Length Statistics

[Histogram: x-axis Words, y-axis Frequency]

Min: 1
Max: 78
Average: 19.9
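Statistics of this kind can be computed directly from the transcribed utterances. The sketch below assumes that length is measured in whitespace-separated words; the sample utterances and the function name `utterance_length_stats` are hypothetical and not taken from the lecture corpus.

```python
from statistics import mean


def utterance_length_stats(utterances):
    """Return (min, max, average) utterance length in whitespace-separated words."""
    lengths = [len(u.split()) for u in utterances]
    return min(lengths), max(lengths), mean(lengths)


# Hypothetical utterances, for illustration only.
sample = ["ok", "the new operator allocates an object", "any questions"]
print(utterance_length_stats(sample))
```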

Page 25: Giuseppe Riccardi, Marco Ronchetti

Multimedia Indexing (speech driven)

[Timeline diagram; query: “operatore new”; streams: Speech, Video, Slides; axis: time]
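One way to read speech-driven indexing is that the query is matched against time-stamped speech recognition output, and the matching time is then used to locate the video position and the slide on screen. The sketch below follows that reading; the data layout, the sample values and the function names are assumptions, not the Needle implementation.

```python
from bisect import bisect_right

# Hypothetical ASR output: (start time in seconds, recognized word).
asr_words = [(12.0, "oggi"), (12.4, "vediamo"), (95.2, "operatore"), (95.8, "new")]
# Hypothetical times (seconds) at which each slide was first shown, sorted.
slide_starts = [0.0, 60.0, 180.0]


def find_phrase(words, phrase):
    """Return start times where the phrase occurs as consecutive recognized words."""
    terms = phrase.lower().split()
    tokens = [w.lower() for _, w in words]
    n = len(terms)
    if n == 0:
        return []
    return [words[i][0] for i in range(len(tokens) - n + 1)
            if tokens[i:i + n] == terms]


def slide_at(time_s, starts):
    """Zero-based index of the slide on screen at time_s."""
    return bisect_right(starts, time_s) - 1


for t in find_phrase(asr_words, "operatore new"):
    print(f"seek video to {t}s, show slide {slide_at(t, slide_starts) + 1}")
```

The shared time axis is what ties the three streams together: a hit in the speech stream yields a timestamp, and the timestamp selects both the video offset and the slide.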

Page 26: Giuseppe Riccardi, Marco Ronchetti

Multimedia Indexing (Metadata driven)

[Timeline diagram; query: “operatore new”; streams: Slides, Video; axis: time]
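In the metadata-driven case the query can be matched against the slide text instead of the speech, and the interval during which a matching slide was projected gives the video segment to play. A minimal sketch under that assumption follows; the slide records and the function name `video_segments` are hypothetical.

```python
# Hypothetical slide metadata: slide text and the interval (seconds)
# during which the slide was projected.
slides = [
    {"text": "Introduzione al corso", "start": 0.0, "end": 60.0},
    {"text": "Operatore new e costruttori", "start": 60.0, "end": 180.0},
]


def video_segments(slides, query):
    """Return (start, end) intervals of slides whose text contains all query terms."""
    terms = query.lower().split()
    hits = []
    for slide in slides:
        words = set(slide["text"].lower().split())
        if terms and all(term in words for term in terms):
            hits.append((slide["start"], slide["end"]))
    return hits


print(video_segments(slides, "operatore new"))
```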

Page 27: Giuseppe Riccardi, Marco Ronchetti

Prototype

• Multimedia data streams (Audio, Video, ASR, Metadata)
• Indexing
• Multimedia docs search
• Present & Browse

Page 28: Giuseppe Riccardi, Marco Ronchetti

Demo

With Angela Fogarolli, Alessandro Bertacco (UNITN)

Page 29: Giuseppe Riccardi, Marco Ronchetti

E-learning evaluation

Kirkpatrick’s 4 levels

• Level 1 Reactions (Qualitative)
  - Did they like it? Was the material relevant to their work?
• Level 2 Learning (Quantitative)
  - Formal to informal testing to team assessment and self-assessment.
• Level 3 Behavior (Qualitative)
  - Are the newly acquired skills, knowledge, or attitude being used in the everyday environment of the learner?
• Level 4 Results (Quantitative)
  - Measures the success of the program in terms that managers and executives can understand

Kirkpatrick, D.L. (1994). Evaluating Training Programs: The Four Levels. San Francisco, CA: Berrett-Koehler.

Page 30: Giuseppe Riccardi, Marco Ronchetti

Future Research

• Multimedia database with resources of other kinds (interactive whiteboard recordings, discussions, recordings of real and virtual meetings).
• Ontology linking to offer knowledge-supported search.
• Training of Unlimited-ASR
  - Portable (domains)
• Spoken Language Understanding (Query/Doc)
• Semantic indexing
• Evaluation
  - E-learning domain
• Content Creation
  - Inter-University collaborative efforts

Page 31: Giuseppe Riccardi, Marco Ronchetti

Conclusion

• Information Search
  - Past & Present
• Next Generation Information Search
  - Needle
• Multimedia Documents
  - Indexing
  - Search
  - Presentation
• Content Creation
  - Inter-University collaborative efforts