Fusion Engines for Input Multimodal Interfaces: a Survey

Page 1: Fusion Engines for Input Multimodal Interfaces: a Survey

Special session on Multimodal Fusion

• A survey: Fusion Engines for Multimodal Input
• 5 papers

D. Lalanne (Switzerland), L. Nigay (France), P. Palanque (France), P. Robinson (UK), J. Vanderdonckt (Belgium)

Page 2: Fusion Engines for Input Multimodal Interfaces: a Survey

Multimodal fusion

• Multimodal fusion for
  • Perception
  • Interaction

• Focus on multimodal interaction
  • 4 papers on multimodal interaction
  • 1 paper on multimodal perception

(first one)

Page 3: Fusion Engines for Input Multimodal Interfaces: a Survey

Input Multimodal Interaction

Page 4: Fusion Engines for Input Multimodal Interfaces: a Survey

Input Fusion Engines

• Multimodal fusion
  • Combining and interpreting data from multiple input modalities

• Usage of input modalities

            | Sequential | Parallel
Combined    | Alternate  | Synergistic
Independent | Exclusive  | Concurrent
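
The four usages on this slide can be read as a two-by-two grid: whether inputs are combined or independent, and whether they are used sequentially or in parallel. A minimal Python sketch of that reading follows; the names `Usage` and `classify_usage` are hypothetical and only illustrate the grid, they are not taken from any fusion engine.

```python
from enum import Enum


class Usage(Enum):
    """The four usages of input modalities named on the slide."""
    ALTERNATE = "alternate"      # combined, sequential
    SYNERGISTIC = "synergistic"  # combined, parallel
    EXCLUSIVE = "exclusive"      # independent, sequential
    CONCURRENT = "concurrent"    # independent, parallel


def classify_usage(combined: bool, parallel: bool) -> Usage:
    """Map the two axes of the grid onto one of the four usages."""
    if combined:
        return Usage.SYNERGISTIC if parallel else Usage.ALTERNATE
    return Usage.CONCURRENT if parallel else Usage.EXCLUSIVE


# Speech and gesture interpreted together at the same time.
print(classify_usage(combined=True, parallel=True).value)
```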

Page 5: Fusion Engines for Input Multimodal Interfaces: a Survey

Input Fusion Engines

• Why combined usage (sequential, parallel)?
  • Natural interaction is inherently multimodal.
  • Combining input modalities increases the bandwidth of human-computer interaction.

Page 6: Fusion Engines for Input Multimodal Interfaces: a Survey

Fusion engines

• A very dynamic domain
• ~15 years of contributions: 1993-2008

Page 7: Fusion Engines for Input Multimodal Interfaces: a Survey

Input Fusion engines

• Some key features
  • Multiple and temporal combinations
  • Types of data and time synchronization
  • Probabilistic inputs
  • Non-deterministic inputs
  • Robustness
  • Error handling
  • Adaptation to context
  • Context = (user, environment, platform)
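
Two of these features, temporal combination and probabilistic inputs, can be illustrated together. The sketch below assumes each recognizer returns an N-best list of time-stamped hypotheses with a confidence score, and that two hypotheses may be fused when their timestamps fall within a fixed window. The `Hypothesis` type, the `fuse` function, the 1.5 s window and the product scoring rule are all illustrative assumptions, not the behaviour of any surveyed engine.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Hypothesis:
    modality: str      # e.g. "speech" or "gesture"
    value: str         # interpreted content
    confidence: float  # recognizer score in [0, 1]
    timestamp: float   # seconds


def fuse(speech, gesture, window=1.5):
    """Return the best (speech, gesture, score) triple within the window, or None."""
    candidates = [
        (s, g, s.confidence * g.confidence)  # assumed scoring: product of confidences
        for s, g in product(speech, gesture)
        if abs(s.timestamp - g.timestamp) <= window
    ]
    return max(candidates, key=lambda c: c[2], default=None)


# Hypothetical N-best lists for a "put that there" style command.
speech_nbest = [
    Hypothesis("speech", "put that there", 0.8, 10.0),
    Hypothesis("speech", "cut that hair", 0.2, 10.0),
]
gesture_nbest = [
    Hypothesis("gesture", "point(120, 45)", 0.9, 10.4),
    Hypothesis("gesture", "point(300, 10)", 0.6, 14.0),  # too late to be fused
]

best = fuse(speech_nbest, gesture_nbest)
print(best[0].value, best[1].value)
```

Returning `None` when no pair falls inside the window leaves the error-handling decision (wait, ask the user, or fall back to one modality) to the caller.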

Page 8: Fusion Engines for Input Multimodal Interfaces: a Survey

Classification: Fusion engines

1980: R. Bolt, “Put that there”

Page 9: Fusion Engines for Input Multimodal Interfaces: a Survey

Classification: Fusion engines

1980: R. Bolt, “Put that there”
1989: Cubricon
1995: CARE
1997: Quickset
2004: ICARE
2004: Petshop
2006: FAME

Page 10: Fusion Engines for Input Multimodal Interfaces: a Survey

Classification: Fusion engines

1980: R. Bolt, “Put that there”

Multiple (up to 255) Input API in Windows 7
Microsoft MultiPoint SDK

“Zoom in here”

UX beats Usability

A gap

Page 11: Fusion Engines for Input Multimodal Interfaces: a Survey

Theories and Contributions over Time

Page 12: Fusion Engines for Input Multimodal Interfaces: a Survey

Reference | Tool/language/program | Fusion notation | Fusion type | Fusion level | Input devices | Ambiguity resolution | Quantitative time | Qualitative time | Application types
Bolt [4] | Put that there system | None | None | Dialog | Speech, gesture | ? | N | ? | Map manipulation
Wahlster | XTRA | None | Unification | Dialog | Keyboard, mouse | | N | Y | Map manipulation
Neal [26] | Cubricon | Generalized Augmented Transition Network | Procedural | Dialog | Speech, mouse, keyboard | Proximity-based | N | Y | Map manipulation
Koons [19] | No name | Parse tree | Frame-based | Dialog | Speech, eye gaze, gesture | First solution | Y | Y | 3D world
Nigay [28] | Pac-Amodeus | Melting pot | Frame-based | Dialog + low level | Speech, keyboard, mouse | Context-based resolution | Y | N | Flight scheduling
Cohen [9] | Quickset | Feature structure | Unification | Dialog | Pen, voice | S / G & G / S & N best | Y | N | Simulation system training
Bellik [3] | MEDITOR | None | Frame-based | Dialog + low level | Speech, mouse | History buffer | Y | Y | Text editor
Martin [22] | TYCOON | Set of processes – Guided Propagation Networks | Procedural | Dialog | Speech, keyboard, mouse | Probability-based resolution | Y | Y | Editing of graphical user interfaces
Johnston [18] | FST | Finite state automata | Procedural | Dialog | Speech, pen | Possible (N best) | Y | Y | Corporate directory
Krahnstoever [20] | iMap | Stream stamped | Frame-based | Dialog | Speech, gesture | Not given | Y | N | Crisis management
Dumas [12] | HephaisTK | XML typed (SMUIML) | Frame-based | Dialog | Speech, mouse, Phidgets | First one | Y | Y | Meeting assistants
Holzapfel [17] | No name | Typed feature structure | Unification | Dialog | Speech, gesture | N best list | Y | N | Humanoid robot
Pfleger [33] | PATE | XML typed | Unification | Dialog | Speech, pen | N best list | Y | Y | Bathroom design tool
Milota [25] | No name | Multimodal parse tree | Unification | Dialog | Speech, mouse, keyboard, touchscreen | S / G & G / S | Y | N | Graphic design
Melichar [24] | WCI | Multimodal generic dialog node | Unification | Dialog | Speech, mouse, keyboard | First one | ? | ? | Multimedia DB
Sun [37] | PUMPP | Matrix | Unification | Dialog | Speech, gesture | S / G | N | Y | Traffic control
Bourguet [7] | Mengine | Finite state machine | Procedural | Low level | Speech, mouse | Not given | N | Y | No example
Latoschik [21] | No name | Temporal Augmented Transition Network | Procedural | Dialog | Speech, gesture | Fuzzy constraints | Y | Y | Virtual reality
Bouchet [5] [6], Mansoux [23] | ICARE (Input/Output) | Melting pot | Frame-based | Dialog + low level | Speech, Helmet visor HOTAS, tactile surface, GPS localization, magnetometer, mouse, keyboard | Context-based resolution | Y | N | Aircraft cockpit, authentication, mobile augmented reality systems (Game, Post-it), augmented surgery
Navarre [30] | Petshop | Petri nets | Procedural | Dialog + low level | Speech, mouse, keyboard, touchscreen | *** | Y | Y | Aircraft cockpit
Flippo [14] | No name | Semantic tree | Hybrid | Dialog | Speech, mouse, gaze, gesture | Feedback for missing data | Y | N | Collaborative map
Portillo [34] | MIMUS | Feature value structure (DTAC) | Hybrid | Dialog | Speech, mouse | Knowledgeable agent | Y | N |
Duarte [11] | FAME | Behavioral matrix | Hybrid | Dialog | Speech, mouse, keyboard | Not given | ? | ? | Digital talking book
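
Unification is one of the fusion types listed in the table. As a rough illustration of the general idea, the sketch below unifies two partial feature structures, represented here as nested Python dictionaries: compatible features are merged and a conflicting value makes unification fail. The `unify` function and the example structures are hypothetical and do not reproduce the data model of Quickset, PATE or any other engine in the table.

```python
def unify(a, b):
    """Unify two feature structures (nested dicts); return None on conflict."""
    result = dict(a)
    for key, b_val in b.items():
        if key not in result:
            result[key] = b_val  # feature only present in b: copy it over
            continue
        a_val = result[key]
        if isinstance(a_val, dict) and isinstance(b_val, dict):
            merged = unify(a_val, b_val)  # recurse into nested structures
            if merged is None:
                return None
            result[key] = merged
        elif a_val != b_val:
            return None  # atomic values disagree: unification fails
    return result


# Hypothetical partial interpretations from two modalities.
speech = {"action": "move", "object": {"type": "unit"}}
gesture = {"object": {"type": "unit", "id": 7}, "target": {"x": 120, "y": 45}}

command = unify(speech, gesture)
conflict = unify({"object": {"type": "unit"}}, {"object": {"type": "area"}})
print(command)
print(conflict)
```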

Page 14: Fusion Engines for Input Multimodal Interfaces: a Survey

Special session on Multimodal Fusion

• Content
  • A survey
  • 5 papers

• Schedule
  • 10 min introduction and survey outlook
  • 15 min per paper + 5 min questions
  • 10 min for questions on the session

D. Lalanne (Switzerland), L. Nigay (France), P. Palanque (France), P. Robinson (UK), J. Vanderdonckt (Belgium)

Page 15: Fusion Engines for Input Multimodal Interfaces: a Survey

Special session on Multimodal Fusion

• H. Mendonça: Agent-based fusion
• B. Dumas: An evaluation framework to benchmark fusion engines
• L. Nigay: CARE-based fusion
• J. Ladry & P. Palanque: Petri net based formal description and execution of fusion engines
• M. Sezgin: Fusion of speech and facial expression recognition

Page 16: Fusion Engines for Input Multimodal Interfaces: a Survey

QUESTIONS?

Page 17: Fusion Engines for Input Multimodal Interfaces: a Survey

Fusion engines: research agenda

• Performance evaluation
  • Testbeds, metrics
  • Identification of interpretation errors
  • Formal predictive evaluation

• Adaptation to context
  • Dynamic aspect of adaptation
  • Reconfigurations

• Engineering aspects
  • Difficult to develop (toolkit from manufacturers required)
  • Fusion engine tuning (tuning is the key for interaction techniques, e.g. drag & drop)
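
As one possible reading of the "testbeds, metrics" item, a testbed could be a list of labelled input sets, with the fraction of correct interpretations as the metric and the mismatches kept for error analysis. The sketch below assumes exactly that; `evaluate`, `toy_engine` and the test cases are hypothetical and stand in for a real engine and a real benchmark.

```python
def evaluate(engine, testbed):
    """Return (accuracy, errors) for a fusion engine over labelled test cases."""
    errors = []
    for inputs, expected in testbed:
        actual = engine(inputs)
        if actual != expected:
            errors.append((inputs, expected, actual))  # keep for error analysis
    accuracy = 1 - len(errors) / len(testbed) if testbed else 0.0
    return accuracy, errors


def toy_engine(inputs):
    """Stand-in engine: pairs the spoken command with the pointed-at target."""
    return (inputs.get("speech"), inputs.get("gesture"))


# Hypothetical testbed: (inputs, expected interpretation).
testbed = [
    ({"speech": "delete", "gesture": "file_3"}, ("delete", "file_3")),
    ({"speech": "open", "gesture": "doc_1"}, ("open", "doc_1")),
    ({"speech": "delete"}, ("delete", None)),
    ({"speech": "opn", "gesture": "doc_1"}, ("open", "doc_1")),  # misrecognized speech
]

accuracy, errors = evaluate(toy_engine, testbed)
print(accuracy, len(errors))
```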

Page 18: Fusion Engines for Input Multimodal Interfaces: a Survey

Fusion Principles

• Notation: Petri nets based (ICOs)
• Type: Procedural only
• Level: Dialogue and low level
• Input Devices: Speech, mice, keyboard, touch screen
• Ambiguity resolution: inside models
• Time representation (Quantitative – Qualitative): Both
• Application Type: Safety Critical, Aeronautics and Space
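
To give a feel for Petri-net-based fusion, the sketch below models a plain place/transition net in which a fusion transition fires only once both a speech token and a touch token are present. It is a deliberately simplified assumption: it is not the ICO notation, has no time representation, and the `PetriNet` class and place names are hypothetical.

```python
class PetriNet:
    """Minimal place/transition net; every arc has weight 1."""

    def __init__(self, marking):
        self.marking = dict(marking)  # place -> token count
        self.transitions = {}         # name -> (input places, output places)

    def add_transition(self, name, inputs, outputs):
        # Assumption: each input place is listed at most once per transition.
        self.transitions[name] = (tuple(inputs), tuple(outputs))

    def enabled(self, name):
        inputs, _ = self.transitions[name]
        return all(self.marking.get(p, 0) > 0 for p in inputs)

    def fire(self, name):
        """Fire the transition if enabled; return whether it fired."""
        if not self.enabled(name):
            return False
        inputs, outputs = self.transitions[name]
        for p in inputs:
            self.marking[p] -= 1
        for p in outputs:
            self.marking[p] = self.marking.get(p, 0) + 1
        return True


net = PetriNet({"speech_event": 0, "touch_event": 0, "fused_command": 0})
net.add_transition("fuse", ["speech_event", "touch_event"], ["fused_command"])

net.marking["speech_event"] += 1  # speech arrives first
fired_early = net.fire("fuse")    # touch is still missing
net.marking["touch_event"] += 1   # touch arrives
fired = net.fire("fuse")          # both inputs are now available
print(fired_early, fired, net.marking)
```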
