Markku Turunen Tampere Unit for Human-Computer Interaction University of Tampere
TAUCHI – Tampere Unit for Computer-Human Interaction
Markku Turunen
Tampere Unit for Human-Computer Interaction, University of Tampere
MUMIN PhD course, Tampere, 18.-22.11.2002
Speech Application Architectures
Outline
Topics
• Background
• Architecture types
• Example architectures
• Topics for research
• Jaspis architecture
Software architectures 1
Definitions
• “software architecture defines the system in terms of components and interactions between them. Connectors are used to mediate interaction between the components” [Garlan & Shaw, 1994]
• several views can be used to describe different aspects of software architectures: design view, run-time view, module view, logical view, control view, class view, …
• human-computer interaction viewpoint: support for interaction methods and techniques
Software architectures 2
Software development tools
• support and tools for the construction of practical applications
• core architecture: basic infrastructure (hub/facilitator, communication libraries, blackboard)
• complete architecture: technology components (ASR, TTS), dialogue manager, database, …
• toolkit: dialogue editor, ASR grammar builder, corpus collection tool, annotation editor, …
Speech system components
[Figure: components of a speech system]
• user
• telephone interface
• speech recognition
• natural language processing
• dialogue management
• database
• natural language generation
• speech synthesis
Architecture types 1
Pipelines and dialogue management architectures
• pipeline (batch-sequence) architectures
– data flow
– one-way interfaces
– fixed processing order
• dialogue manager architectures
– function calls
– dialogue manager as controller
– relaxed processing order
[Figure: pipeline architecture with ASR, NLU, DM, NLG and TTS components]
[Figure: dialogue manager architecture with the DM connected to ASR, NLU, NLG, TTS, UM and DB]
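The difference between the two styles can be sketched in a few lines. This is a minimal illustration, not code from any of the systems discussed: the component functions are hypothetical stand-ins that only tag their input, and the re-prompt rule in the controller is an assumed example of a relaxed processing order.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins for real components; each one only tags its input.
def asr(audio: str) -> str:
    return f"words({audio})"

def nlu(words: str) -> str:
    return f"meaning({words})"

def dm(meaning: str) -> str:
    return f"act({meaning})"

def nlg(act: str) -> str:
    return f"text({act})"

def tts(text: str) -> str:
    return f"audio({text})"

def run_pipeline(stages: List[Callable[[str], str]], data: str) -> str:
    """Pipeline: data flows one way through a fixed sequence of stages."""
    for stage in stages:
        data = stage(data)
    return data

class DialogueManager:
    """Controller: calls the other components as functions and may change
    the processing order (assumed rule: re-prompt on empty recognition)."""

    def __init__(self, components: Dict[str, Callable[[str], str]]) -> None:
        self.c = components

    def handle(self, audio: str) -> str:
        words = self.c["asr"](audio)
        if words == "words()":  # nothing recognized: skip NLU and re-prompt
            return self.c["tts"](self.c["nlg"]("act(ask_repeat)"))
        meaning = self.c["nlu"](words)
        return self.c["tts"](self.c["nlg"](dm(meaning)))

pipeline_out = run_pipeline([asr, nlu, dm, nlg, tts], "hello")
controller = DialogueManager({"asr": asr, "nlu": nlu, "nlg": nlg, "tts": tts})
controlled_out = controller.handle("")
```

With normal input both variants produce the same result; only the controller can leave the default order when something goes wrong.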
Architecture types 2
Client-server and blackboard architectures
• client-server architectures
– two-way messages
– hub as coordinator (star topology)
– free processing order
• blackboard (DB) architectures
– data events / db operations
– shared information
– free processing order
[Figure: client-server architecture with ASR, NLU, DM, UM, NLG, TTS and DB connected through a HUB]
[Figure: blackboard architecture with ASR, NLU, DM, UM, NLG and TTS around a shared store (IS)]
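A rough sketch of the two coordination styles, under stated assumptions: the class names Hub and Blackboard, the dictionary-based messages and the uppercase "NLU" handlers are all hypothetical and are not taken from any of the architectures named in these slides.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List

class Hub:
    """Star topology: every component talks only to the hub."""

    def __init__(self) -> None:
        self.servers: Dict[str, Callable[[dict], dict]] = {}

    def register(self, name: str, handler: Callable[[dict], dict]) -> None:
        self.servers[name] = handler

    def send(self, target: str, message: dict) -> dict:
        # Two-way message: a request goes out and a reply comes back.
        return self.servers[target](message)

class Blackboard:
    """Shared store: components react to data events, not to each other."""

    def __init__(self) -> None:
        self.data: Dict[str, Any] = {}
        self.listeners: Dict[str, List[Callable[[Any], None]]] = defaultdict(list)

    def subscribe(self, key: str, callback: Callable[[Any], None]) -> None:
        self.listeners[key].append(callback)

    def write(self, key: str, value: Any) -> None:
        self.data[key] = value
        for callback in self.listeners[key]:
            callback(value)

hub = Hub()
hub.register("nlu", lambda msg: {"concept": msg["words"].upper()})
reply = hub.send("nlu", {"words": "hospital"})

board = Blackboard()
# A hypothetical NLU component writes its result back to the board,
# which in turn notifies whoever subscribed to "concept".
board.subscribe("words", lambda w: board.write("concept", w.upper()))
seen: List[str] = []
board.subscribe("concept", seen.append)
board.write("words", "hospital")
```

In the hub variant the sender names the receiver; in the blackboard variant the writer does not know who, if anyone, will react to the new data.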
Architecture types 3
Agent architectures
• independent agents
– independent agents
– facilitator
– collaborative processing
• compact agents
– compact agents
– shared knowledge
– distributed processing
[Figure: independent agents (ASR, NLU, DM, UM, NLG, TTS, DB) connected through a Facilitator]
[Figure: compact agents (labelled PE, PA, DA, DE, IA, UA) together with ASR, NLU, NLG, TTS, a Facilitator and a shared store (IS)]
Example architectures 1
GALAXY-II
• MIT / MITRE
• DARPA Communicator reference architecture
• freely available
• HUB and servers
• frames (messages)
• hub scripts route messages
[Seneff et al., 1998]
Example architectures 2
Open Agent Architecture
• general agent architecture
• Facilitator as coordinator
• requesters (tasks)
• services (solutions)
• Interagent Communication Language (ICL)
• freely available
• used in speech applications
[Martin et al., 1999]
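The facilitator idea (requesters state a task, services offer solutions) can be sketched as follows. This is only an illustration of capability-based matching: the Facilitator class, its register/solve methods and the capability names are hypothetical, and nothing here reproduces the OAA API or ICL syntax.

```python
from typing import Callable, Dict, List

class Facilitator:
    """Matches requests to whichever agents declared the capability."""

    def __init__(self) -> None:
        self.providers: Dict[str, List[Callable[[str], str]]] = {}

    def register(self, capability: str, solver: Callable[[str], str]) -> None:
        self.providers.setdefault(capability, []).append(solver)

    def solve(self, capability: str, argument: str) -> List[str]:
        # The requester names the task, never the agent that performs it.
        return [solver(argument) for solver in self.providers.get(capability, [])]

facilitator = Facilitator()
facilitator.register("recognize", lambda audio: f"hypothesis-A:{audio}")
facilitator.register("recognize", lambda audio: f"hypothesis-B:{audio}")
facilitator.register("speak", lambda text: f"spoken:{text}")

answers = facilitator.solve("recognize", "utt1")
nothing = facilitator.solve("translate", "utt1")
```

Because several agents may register the same capability, one request can return several alternative solutions, and a request for an unknown capability simply returns none.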
Example architectures 3
WITAS
• dialogue manager agent reacts to events sent by other agents
• dialogue manager acts as blackboard
• multimodal inputs are coordinated by DM
• based on OAA
[Lemon et al., 2001]
Example architectures 4
MITRE architecture
• dialogue manager as controller
• default processing order
• dialogue manager monitors other components
• dialogue manager is a kind of blackboard
• based on OAA
[Luperfoy et al., 1998]
Example architectures 5
TRIPS
• agents, managers and shared databases
• loosely coupled components
• no dialogue manager
• KQML messages
• facilitator does not contain control logic
[Allen et al., 2001]
Current vs. new application areas
Traditional speech applications | Future speech applications
single user | multiple users
desktop and telephony | office and home environments, mobile settings
single deterministic dialogue | open-ended, dynamically constructed concurrent dialogues
active user | (pro)active computer
centralized dialogue management | distributed interaction management
Current vs. new application areas
Traditional speech applications | Future speech applications
mostly unimodal | mostly multimodal
alternative / exclusive (sequential) modalities | concurrent / synergistic (parallel) modalities
speech, text, graphics | speech, sensors, haptics, gestures…
monolingual | multilingual
”natural” interaction methods based on human-human interaction | ”innovative” interaction methods based on human-computer interaction
Adaptive systems
Need for adaptive applications
• different users: speech-based communication can differ greatly between individual users and situations
– speech is language and culture dependent
– differences in preferences and needs between user groups can be large
• different approaches: people from different backgrounds have different solutions for the same problems
• we need interaction methods and architectures that adapt to different users and situations and support multiple approaches
Future example
U: <calls the system>
S: Welcome to the bus timetable system. How may I help you?
U: I want to go to the hospital.
S: Which hospital do you mean? There are three hospitals.
U: The northern one.
S: What is your departure location?
U: Railway station.
S: Bus number 17 leaves at 10:12.
U: Bye. <hangs up>
time passes, no bus is coming
U: <calls the system>
S: Welcome to the taxi service. How may I help you? …

S: <contacts the user> You have an appointment with your doctor. You need to hurry to catch the bus. It leaves the central station at 10:12.
U: Thanks.
time passes, bus strike hits the city
S: <contacts the user> I’m really sorry, there is no bus coming. The next train leaves seven minutes from now.
U: No, I’ll take a taxi.
S: Please wait… It is coming, please wait in front of the opera building.
U: Thanks.
Topics for research
Topics for speech systems
• adaptivity: how to support adaptive methods? how to make systems adaptive?
• reusability: components, interaction methods, …
• distributed systems: communication protocols, resource sharing, ubiquitous applications
• distributed interaction management: centralized dialogue manager is not suitable for many tasks
• shared knowledge: dialogue, user etc.
• development and evaluation tools: WOZ, corpora, …
Jaspis architecture
speech application development framework
• implementation of core architecture with extensions
• designed especially for multilingual and distributed applications
• overall focus on system level adaptivity
• current focus on ubiquitous and multimodal applications
• Java and XML, freely available
• used in several projects and applications
Jaspis architecture overview
[Figure: Jaspis architecture overview. The Interaction Manager sits in the centre, connected to four modules and to NLG, NLU, DB and UM components:
– Information Management: Information Storage, Information Manager implementations (Java, Perl, etc.), socket interface (server)
– Dialogue Management: Dialogue Manager, dialogue agents, evaluators (capability evaluator, consistency evaluator, ...)
– Presentation Management: Presentation Manager, presentation agents, evaluators (capability evaluator, language evaluator, ...)
– Communication Management: Communication Manager, input agents, input evaluators, engines, servers, clients, devices]
Agents, evaluators and managers
• agents handle various interaction situations, such as speech input interpretations, dialogue decisions and speech output presentations
• evaluators measure how well agents can handle current interaction situation
• managers are used to coordinate agents and evaluators, especially to try to choose the best possible agents to handle each interaction situation
Jaspis components
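The agent / evaluator / manager division can be sketched as below. This is a minimal illustration and not the Jaspis API (Jaspis itself is written in Java): all class and function names are hypothetical, and combining evaluator scores by multiplication is an assumption made for the sketch.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Situation = Dict[str, str]

@dataclass
class Agent:
    name: str
    handles: str   # the kind of situation this agent is written for
    language: str

    def act(self, situation: Situation) -> str:
        return f"{self.name} handles {situation['type']}"

# An evaluator scores one agent against the current situation (0.0 to 1.0).
Evaluator = Callable[[Agent, Situation], float]

def capability_evaluator(agent: Agent, situation: Situation) -> float:
    return 1.0 if agent.handles == situation["type"] else 0.0

def language_evaluator(agent: Agent, situation: Situation) -> float:
    return 1.0 if agent.language == situation["language"] else 0.5

class Manager:
    """Asks every evaluator about every agent and picks the best agent."""

    def __init__(self, agents: List[Agent], evaluators: List[Evaluator]) -> None:
        self.agents = agents
        self.evaluators = evaluators

    def score(self, agent: Agent, situation: Situation) -> float:
        # Assumption: scores are multiplied, so a single 0.0 vetoes the agent.
        total = 1.0
        for evaluate in self.evaluators:
            total *= evaluate(agent, situation)
        return total

    def select(self, situation: Situation) -> Agent:
        return max(self.agents, key=lambda a: self.score(a, situation))

manager = Manager(
    agents=[
        Agent("greeting-en", "greeting", "en"),
        Agent("confirm-en", "confirmation", "en"),
        Agent("confirm-fi", "confirmation", "fi"),
    ],
    evaluators=[capability_evaluator, language_evaluator],
)
chosen = manager.select({"type": "confirmation", "language": "fi"})
```

New behaviour is added by registering another agent or evaluator; the manager itself does not need to change.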
Jaspis interaction management
[Figure: Jaspis interaction management. The Interaction Manager coordinates the Dialogue Manager, the Input Manager and the Presentation Manager.
– Dialogue Manager: Dialogue Model, Dialogue Agents, Dialogue Evaluators (relations: select, use, evaluate)
– Input Manager: Input Model, Input Agents, Input Evaluators (relations: use)
– Presentation Manager: Presentation Model, Presentation Agents, Presentation Evaluators (relations: select, use, evaluate)]
Information management in Jaspis
• information storing method is not fixed (XML, DB)
• information access protocol is defined (DTD)
• Information Managers are used to access the Information Storage – these can be implemented in any language and they can use TCP/IP, XML-RPC or method calls
[Figure: Information Management. Information Manager implementations (Java, Perl, etc.) access the Information Storage, optionally through a socket interface (server).]
Example request:
<?xml version="1.0"?>
<add>
  <route> <tag>preferences</tag> <tag>recognizer</tag> </route>
  <content> <port>8200</port> </content>
</add>
Example reply:
<?xml version="1.0"?>
<result> success </result>
Information Storage structure:
<?xml version="1.0"?>
<state>
  <internal> </internal>
  <user> </user>
  <history> </history>
  <technical> </technical>
  <external> </external>
</state>
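How an Information Manager might process the add request shown on this slide can be sketched with the Python standard library. This is an assumption-laden illustration, not the Jaspis implementation: the function apply_add is hypothetical, the route is assumed to be a path of tags created from the root of the state document, and the "failure" reply is an invented counterpart to the "success" reply.

```python
import xml.etree.ElementTree as ET

STATE_XML = "<state><internal/><user/><history/><technical/><external/></state>"

ADD_XML = (
    "<add><route><tag>preferences</tag><tag>recognizer</tag></route>"
    "<content><port>8200</port></content></add>"
)

def apply_add(state: ET.Element, request_xml: str) -> str:
    """Apply an <add> request to the state tree and return a <result> reply.

    Assumption: each <tag> in <route> names one step of a path that starts
    at the root of the storage; missing steps are created on the way.
    """
    request = ET.fromstring(request_xml)
    if request.tag != "add":
        return "<result> failure </result>"  # hypothetical error reply
    node = state
    for tag in request.findall("./route/tag"):
        name = (tag.text or "").strip()
        if not name:
            return "<result> failure </result>"
        child = node.find(name)
        node = child if child is not None else ET.SubElement(node, name)
    content = request.find("content")
    if content is not None:
        node.extend(list(content))  # copy the payload elements under the route
    return "<result> success </result>"

state = ET.fromstring(STATE_XML)
reply = apply_add(state, ADD_XML)
port = state.findtext("./preferences/recognizer/port")
rejected = apply_add(state, "<remove/>")
```

Because the request and reply are plain XML, the same exchange could be carried over a socket, XML-RPC or a direct method call without changing the message format.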
Presentation management in Jaspis
• presentation agents convert conceptual messages to speech outputs
• for every output the most suitable agent is selected by presentation evaluators
• multiple presentation management modules for different phases
[Figure: Presentation Management. The Presentation Manager with its presentation agents and evaluators (capability evaluator, language evaluator, ...).]
Dialogue management in Jaspis
• different dialogue agents for different dialogue tasks
• alternative dialogue agents for the same dialogue tasks
• dialogue evaluators select dialogue agents
• no single controller (the dialogue manager)
• multiple dialogue management modules
[Figure: Dialogue Management. The Dialogue Manager with its dialogue agents and evaluators (capability evaluator, consistency evaluator, ...).]
Communication (I/O) management in Jaspis
• I/O agents and evaluators handle, combine and coordinate different input streams
• devices – clients – servers – engines
• run-time interpretation and multimodal fusion
• separate module for selection of input modalities
[Figure: Communication Management. The Communication Manager links devices, clients and servers (links labelled 1:1 and n:m) to engines, with input agents and input evaluators.]
Jaspis extensions
Beyond core infrastructure
• XML-based linguistic information (Annotation Graphs) and log formats (corpus collection, usability tests)
• visualization components (blackboard, interaction)
• speech technology interfaces for common telephony cards, synthesizers and recognizers
• reusable components: error handling, general tasks
• SMS interface, graphical components
• Wizard of Oz tools
Jaspis
Future improvements
• concurrent dialogues and multiple users
• event-based interaction management
http://www.cs.uta.fi/hci/spi/
Tampere Unit for Computer-Human Interaction
Department of Computer and Information Sciences