TeleMorph & TeleTuras: Bandwidth determined Mobile MultiModal Presentation
Student: Anthony J. Solon
Supervisors: Prof. Paul Mc Kevitt, Kevin Curran
School of Computing and Intelligent Systems
Faculty of Engineering
University of Ulster, Magee
Aims of Research
To develop an architecture, TeleMorph, that dynamically morphs between output modalities depending on available network bandwidth:
- Mobile device’s output presentation (unimodal/multimodal) depends on:
  - available network bandwidth
  - network latency and bit error rate
  - mobile device display, available output abilities, memory, CPU
  - user modality preferences, cost incurred
  - user’s cognitive load, determined by Cognitive Load Theory (CLT)
- Utilise Causal Probabilistic Networks (CPNs) for analysing the union of constraints, giving the optimal multimodal output presentation
- Implement TeleTuras, a tourist information guide for the city of Derry
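As a rough illustration of the morphing idea in the aims above, the Python sketch below picks the richest set of output modalities whose combined data rate fits the available bandwidth. The modality names, the per-modality rates and the greedy rule are all assumptions made for this example. They are not taken from the TeleMorph design, which proposes CPNs over a wider set of constraints.

```python
# Hypothetical per-modality streaming rates in kbps (illustrative values only).
MODALITY_KBPS = {
    "video": 300.0,
    "graphics": 40.0,
    "speech": 12.0,
    "text": 1.0,
}


def choose_modalities(available_kbps,
                      preferred_order=("video", "graphics", "speech", "text")):
    """Greedily add modalities, in preference order, while they fit the budget."""
    chosen = []
    remaining = available_kbps
    for modality in preferred_order:
        cost = MODALITY_KBPS[modality]
        if cost <= remaining:
            chosen.append(modality)
            remaining -= cost
    return chosen


def is_multimodal(modalities):
    """A presentation is multimodal when more than one modality is selected."""
    return len(modalities) > 1
```

Under these assumed rates, a very slow link yields a unimodal (text-only) presentation, while faster links admit progressively richer combinations.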
Objectives of Research
- Receive and interpret questions from user
- Map questions to multimodal semantic representation
- Match multimodal representation to knowledge base to retrieve answer
- Map answers to multimodal semantic representation
- Monitor user preference or client-side choice variations
- Query bandwidth status
- Detect client device constraints and limitations
- Combine effect of all constraints imposed on system using CPNs
- Generate optimal multimodal presentation based on bandwidth constraint data
Wireless Telecommunications
Generations of mobile networks:
- 1G: Analog voice service with no data services
- 2G: Circuit-based digital networks, capable of data transmission speeds averaging around 9.6 kbps
- 2.5G (GPRS): Technology upgrades to 2G, boosting data transmission speeds to around 56 kbps; allows packet-based “always on” connectivity
- 3G (UMTS): Digital multimedia; different infrastructure required; data transmission speeds from 144 kbps through 384 kbps to 2 Mbps
- 4G: IP-based mobile/wireless networks, Wireless Personal Area Networks (PANs), ‘anywhere and anytime’ ubiquitous services; speeds up to 100 Mbps
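To give a feel for what these nominal speeds mean for media delivery, the Python sketch below computes an idealised transfer time for a media item at each rate listed above. The rates come from the list; the function names and the example file size are assumptions, and the calculation ignores latency, protocol overhead and bit errors.

```python
# Nominal data rates from the list above, in kilobits per second.
NOMINAL_KBPS = {
    "2G": 9.6,
    "2.5G (GPRS)": 56.0,
    "3G (UMTS), lowest": 144.0,
    "3G (UMTS), highest": 2000.0,
    "4G": 100000.0,
}


def transfer_seconds(size_kilobytes, rate_kbps):
    """Idealised transfer time: size in kilobytes (1 KB = 8 kilobits) over rate."""
    return size_kilobytes * 8 / rate_kbps


def transfer_table(size_kilobytes):
    """Transfer time in seconds for one media item at every nominal rate."""
    return {name: transfer_seconds(size_kilobytes, rate)
            for name, rate in NOMINAL_KBPS.items()}


if __name__ == "__main__":
    # 60 KB is an arbitrary example size, e.g. a small map image.
    for name, seconds in transfer_table(60).items():
        print(f"{name}: {seconds:.2f} s")
```

Even this simplified arithmetic shows a spread of several orders of magnitude between generations, which is the motivation for adapting the presentation to the link.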
Network-adaptive multimedia models:
- Transcoding proxies
- End-to-end approach
- Combination approach
- Mobile/Nomadic computing
- Active networks
Mobile Intelligent MultiMedia Systems
SmartKom (Wahlster, 2003)
- Mobile, Public, Home/office
- Saarbrücken, Germany
- Combines speech, gesture and facial expressions on input & output
- Integrated trip planning, Internet access, communication applications, personal organising
VoiceLog (BBN, 2002)
- BBN Technologies in Cambridge, Massachusetts
- Views/diagrams of military vehicles and direct connection to support
- Damage identified & ordering of parts using diagrams
MUST (Almeida et al., 2002)
- MUltimodal multilingual information Services for small mobile Terminals
- EURESCOM, Heidelberg, Germany
- Future multimodal and multilingual services on mobile networks
(Screenshot text: “Please select a parking place from the Map”)
Intelligent MultiMedia Presentation
Flexibly generate various presentations to meet individual requirements of: 1) users, 2) situations, 3) domains
Intelligent MultiMedia Presentation can be divided into the following processes:
- determination of communicative intent
- content selection
- structuring and ordering
- allocation to particular media
- realisation in specific media
- coordination across media
- layout design
Key research problems:
- Semantic representation
- Fusion, integration & coordination
Semantic representation - represents meaning of media information
Frame-based representations:
- CHAMELEON
- REA
XML-based representations:
- M3L (SmartKom)
- MXML (MUST)
- SMIL
- MPEG-7
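The difference between the two families can be sketched in Python: a frame is a named set of slots, and the same content can be serialised as XML. The frame layout, the slot names and the `<frame>`/`<slot>` element names are hypothetical and chosen only for illustration; they do not follow the actual syntax of CHAMELEON, REA, M3L, MXML, SMIL or MPEG-7.

```python
import xml.etree.ElementTree as ET


def make_frame(intent, slots):
    """A frame here is just an intent label plus a dictionary of slot values."""
    return {"intent": intent, "slots": dict(slots)}


def frame_to_xml(frame):
    """Serialise the frame to a small XML string (hypothetical element names)."""
    root = ET.Element("frame", {"intent": frame["intent"]})
    # Sort slots so the output order does not depend on insertion order.
    for name, value in sorted(frame["slots"].items()):
        slot = ET.SubElement(root, "slot", {"name": name})
        slot.text = str(value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    frame = make_frame("locate", {"place": "Millennium Forum", "modality": "speech"})
    print(frame_to_xml(frame))
```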
Fusion, integration & coordination of modalities
- Integrating different media in a consistent and coherent manner
- Multimedia coordination leads to effective integrated multiple media in output
- Synchronising modalities
  - Time threshold between modalities
  - E.g. Input: “What building is this?”, Output: “This is the Millennium Forum”
  - Not synchronised => side effect can be contradiction
  - SMIL modality synchronisation and timing elements
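The two ideas above, a time threshold between modalities and SMIL timing elements, can be sketched in Python as follows. The first function checks whether two modality events start close enough together to be treated as one utterance. The second builds a minimal SMIL document in which a `<par>` block plays speech and an image in parallel. The function names, file names and threshold value are assumptions; only the SMIL element and attribute names (`smil`, `body`, `par`, `audio`, `img`, `src`, `begin`, `dur`) come from the SMIL standard.

```python
import xml.etree.ElementTree as ET


def within_threshold(first_start_s, second_start_s, threshold_s):
    """True if two modality events start within threshold_s seconds of each other."""
    return abs(first_start_s - second_start_s) <= threshold_s


def build_smil(speech_src, image_src, duration_s):
    """Build a SMIL string whose <par> block presents speech and an image together."""
    smil = ET.Element("smil")
    body = ET.SubElement(smil, "body")
    par = ET.SubElement(body, "par")  # children of <par> play in parallel
    ET.SubElement(par, "audio", {"src": speech_src, "begin": "0s"})
    ET.SubElement(par, "img", {"src": image_src, "begin": "0s",
                               "dur": f"{duration_s}s"})
    return ET.tostring(smil, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical media files for the "What building is this?" example.
    print(build_smil("forum_answer.wav", "forum_photo.png", 4))
```

Starting both media at `begin="0s"` inside one `<par>` is what keeps the spoken answer and the picture of the building from drifting apart and contradicting each other.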
Intelligent MultiMedia Presentation Systems
Automatically generate coordinated intelligent multimedia presentations
User-determined presentation:
COMET (Feiner & McKeown, 1991)
- COordinated Multimedia Explanation Testbed
- Generates instructions for maintenance and repair of military radio receiver-transmitters
- Coordinates text and 3D graphics of mechanical devices
WIP (Wahlster et al., 1992)
- Intelligent multimedia authoring system
- Presents instructions for assembling/using/maintaining/repairing devices (e.g. espresso machines, lawn mowers, modems)
IMPROVISE (Zhou & Feiner, 1998)
- Graphics generation system
- Constructive/parameterised graphics generation approaches
- Uses an extensible formalism to represent a visual lexicon for graphics generation
Intelligent MultiMedia Interfaces & Agents
Intelligent multimedia interfaces
- Parse integrated input and generate coordinated output
- XTRA
  - Interface to an expert system providing tax form assistance
  - Generates & interprets natural language text and pointing gestures automatically; relies on pre-stored graphics
  - Displays relevant tax form and natural language input/output panes
Intelligent multimedia agents
- Embodied Conversational Agents (e.g. MS Agent, REA)
- Natural human face-to-face communication: speech, facial expressions, hand gestures & body stance
- MS Agent
  - Set of programmable services for interactive presentation
  - Speech, gesture, audio & text output; speech & haptic input
Project Proposal
Research and implement a mobile intelligent multimedia presentation architecture called TeleMorph
Dynamically generates multimedia presentation determined by bandwidth available; also other constraints:
- Network latency, bit error rate
- Mobile device display, available output abilities, memory, CPU
- User modality preferences, cost incurred
- Cognitive Load Theory (CLT)
Causal Probabilistic Networks (CPNs) for analysing union of constraints giving optimal multimodal output presentation
Implement TeleTuras, a tourist information guide for city of Derry providing testbed for TeleMorph incorporating:
route planning, maps, spoken presentations, graphics of points of interest & animations
Output modalities used & effectiveness of communication
TeleTuras examples:
- “Where is the Millennium Forum?”
- “How do I get to the Guildhall?”
- “What buildings are of interest in this area?”
- “Is there a Chinese restaurant in this area?”
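The proposal names CPNs as the way to combine constraints. The Python sketch below shows the kind of reasoning involved, using a three-node network evaluated by plain enumeration: a hidden true bandwidth state, an observed bandwidth probe, and the chance that video plays smoothly. The network structure, the variable names and every probability are illustrative assumptions. The sketch does not use HUGIN or represent the actual TeleMorph network.

```python
# All probabilities below are illustrative assumptions, not measured values.
P_BANDWIDTH = {"low": 0.5, "high": 0.5}  # prior over the hidden state
P_PROBE_GIVEN_BANDWIDTH = {              # P(probe result | bandwidth)
    "low": {"slow": 0.9, "fast": 0.1},
    "high": {"slow": 0.2, "fast": 0.8},
}
P_SMOOTH_VIDEO_GIVEN_BANDWIDTH = {"low": 0.1, "high": 0.9}


def posterior_bandwidth(probe):
    """Bayes' rule: P(bandwidth | probe), normalised over both bandwidth states."""
    joint = {b: P_BANDWIDTH[b] * P_PROBE_GIVEN_BANDWIDTH[b][probe]
             for b in P_BANDWIDTH}
    total = sum(joint.values())
    return {b: p / total for b, p in joint.items()}


def prob_smooth_video(probe):
    """P(video plays smoothly | probe), marginalising out the hidden bandwidth."""
    posterior = posterior_bandwidth(probe)
    return sum(posterior[b] * P_SMOOTH_VIDEO_GIVEN_BANDWIDTH[b]
               for b in posterior)


if __name__ == "__main__":
    for probe in ("slow", "fast"):
        print(probe, round(prob_smooth_video(probe), 3))
```

A presentation planner could compare such probabilities across candidate modalities and pick the one most likely to be delivered well. Further constraints (device, preferences, cognitive load) would enter as additional nodes.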
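Architecture of TeleMorph (figure not reproduced in this transcript)
Data flow of TeleMorph (figure not reproduced; surviving labels: “Media Analysis”, “High level”)
Comparison of Mobile Intelligent MultiMedia Systems (table not reproduced)
Comparison of Intelligent MultiMedia Systems (table not reproduced)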
Software Analysis
Client output:
- SMIL media player (InterObject)
- Java Speech API Markup Language (JSML)
- Autonomous agent (MSAgent)
Client input:
- Java Speech API Grammar Format (JSGF)
- J2ME graphics APIs
- J2ME networking
Client device status:
- SysInfo MIDlet (type/memory/screen/protocols/input abilities/CPU speed)
TeleMorph server tools:
- SMIL & MPEG-7
- HUGIN (CPNs)
- JATLite/OAA
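The client device status listed above (type, memory, screen, protocols, input abilities, CPU speed) could be passed to the server as a simple record. The Python sketch below models that record and derives which output modalities the device can plausibly handle. The field names, units and thresholds are assumptions for illustration; they are not the SysInfo MIDlet's actual output format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceStatus:
    """Hypothetical record mirroring the device properties listed above."""
    device_type: str
    memory_kb: int
    screen_width: int
    screen_height: int
    protocols: tuple
    input_abilities: tuple
    cpu_mhz: int


def feasible_outputs(status):
    """Return output modalities the device can handle (assumed thresholds)."""
    outputs = ["text"]
    if status.screen_width >= 128 and status.screen_height >= 128:
        outputs.append("graphics")
    # Video is only offered when graphics are possible and resources suffice.
    if ("graphics" in outputs and status.memory_kb >= 4096
            and status.cpu_mhz >= 200):
        outputs.append("video")
    return outputs


if __name__ == "__main__":
    pda = DeviceStatus("pda", 65536, 240, 320,
                       ("http", "socket"), ("stylus", "speech"), 400)
    print(feasible_outputs(pda))
```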
Project Schedule
Conclusion
A Mobile Intelligent MultiModal Presentation Architecture called TeleMorph will be developed
Dynamically morphing between output modalities depending on available network bandwidth in conjunction with other relevant constraints
CPNs for analysing union of constraints giving optimal multimodal output presentation
TeleTuras will be used as testbed for TeleMorph
Corpora of questions to test TeleTuras (prospective users/tourists)