An Intelligent Assistant for High-Level Task Understanding
![Page 1: An Intelligent Assistant for High-Level Task Understanding](https://reader034.fdocuments.in/reader034/viewer/2022050614/587234e61a28ab102f8b490d/html5/thumbnails/1.jpg)
Ming Sun
Yun-Nung Chen
Alexander I. Rudnicky
School of Computer Science
Carnegie Mellon University
![Page 2: An Intelligent Assistant for High-Level Task Understanding](https://reader034.fdocuments.in/reader034/viewer/2022050614/587234e61a28ab102f8b490d/html5/thumbnails/2.jpg)
Many devices, many apps…
Rich functionality is now available
How well can it support user activities?
![Page 3: An Intelligent Assistant for High-Level Task Understanding](https://reader034.fdocuments.in/reader034/viewer/2022050614/587234e61a28ab102f8b490d/html5/thumbnails/3.jpg)
Book a restaurant!
I can do Yelp!
![Page 4: An Intelligent Assistant for High-Level Task Understanding](https://reader034.fdocuments.in/reader034/viewer/2022050614/587234e61a28ab102f8b490d/html5/thumbnails/4.jpg)
Show me bus to China Palace.
I know Maps!
![Page 5: An Intelligent Assistant for High-Level Task Understanding](https://reader034.fdocuments.in/reader034/viewer/2022050614/587234e61a28ab102f8b490d/html5/thumbnails/5.jpg)
Tell kids we will have Chinese tonight.
I will Text!
![Page 6: An Intelligent Assistant for High-Level Task Understanding](https://reader034.fdocuments.in/reader034/viewer/2022050614/587234e61a28ab102f8b490d/html5/thumbnails/6.jpg)
Multi-domain Dialog – Current System
System: What can I do for you?
User: Could you arrange a dinner for me and my friends?
System: Sorry, I don’t understand that. What can I do for you?
User: Can I book a table for three in Täkō downtown for this Friday?
…
System: OK. What can I do for you next?
User: Show me the bus from here.
System: Where is your destination please?
User: Täkō downtown Pittsburgh.
…
System: What should I do next?
User: Send the bus route to Carrie and Peter.
System: OK. Sending the following message to Carrie and Peter: “the bus route”.
Cannot handle complex intention
Passively support cross-domain dialog
No shared context
![Page 7: An Intelligent Assistant for High-Level Task Understanding](https://reader034.fdocuments.in/reader034/viewer/2022050614/587234e61a28ab102f8b490d/html5/thumbnails/7.jpg)
Multi-domain Dialog – Human Assistant
Assistant: What can I do for you?
User: Could you arrange a dinner for me and my friends?
Assistant: What kind of food do you prefer?
User: Mexican?
Assistant: How about Täkō? I can book a table for you.
User: Sounds good! Can I take a bus there?
Assistant: 61 A/B/C/D can take you there. Do you want to send this to your friends?
User: Great! Send it to Carrie and Peter.
Assistant: OK. The bus route 61 has been sent.
Understand complex intentions
Actively support cross-domain dialog
Maintain context
![Page 8: An Intelligent Assistant for High-Level Task Understanding](https://reader034.fdocuments.in/reader034/viewer/2022050614/587234e61a28ab102f8b490d/html5/thumbnails/8.jpg)
Intention Understanding
Yes Dad!
Special Team 1!
Plan a dinner!
![Page 9: An Intelligent Assistant for High-Level Task Understanding](https://reader034.fdocuments.in/reader034/viewer/2022050614/587234e61a28ab102f8b490d/html5/thumbnails/9.jpg)
Intention Understanding
Find a restaurant!
Yelp!
![Page 10: An Intelligent Assistant for High-Level Task Understanding](https://reader034.fdocuments.in/reader034/viewer/2022050614/587234e61a28ab102f8b490d/html5/thumbnails/10.jpg)
Intention Understanding
Find a bus route!
Maps!
![Page 11: An Intelligent Assistant for High-Level Task Understanding](https://reader034.fdocuments.in/reader034/viewer/2022050614/587234e61a28ab102f8b490d/html5/thumbnails/11.jpg)
Intention Understanding
Message Agnes & Edith!
Messenger!
![Page 12: An Intelligent Assistant for High-Level Task Understanding](https://reader034.fdocuments.in/reader034/viewer/2022050614/587234e61a28ab102f8b490d/html5/thumbnails/12.jpg)
Intention Understanding
Yes Dad!
Special Team 2!
Set up meeting!
![Page 13: An Intelligent Assistant for High-Level Task Understanding](https://reader034.fdocuments.in/reader034/viewer/2022050614/587234e61a28ab102f8b490d/html5/thumbnails/13.jpg)
Approach
Step 1: Observe human users performing multi-domain tasks
Step 2: Learn to assist at task level
Map an activity description to a set of domain apps
Interact at the task level
![Page 14: An Intelligent Assistant for High-Level Task Understanding](https://reader034.fdocuments.in/reader034/viewer/2022050614/587234e61a28ab102f8b490d/html5/thumbnails/14.jpg)
Data Collection 1 – Smart Phone
Wednesday
17:08 – 17:14
CMU
Messenger Gmail Browser Calendar
Schedule a visit to CMU Lab
Log app invocations + time/date/location
Split the log into episodes whenever there are 3 minutes of inactivity
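The episode rule above can be sketched as follows. This is a minimal illustration, not the authors' logger: it assumes the log is a list of `(timestamp, app)` pairs and that inactivity is measured between consecutive app invocations. The function name, the log layout, and the calendar date in the example are all hypothetical.

```python
from datetime import datetime, timedelta

# Threshold taken from the slide: 3 minutes of inactivity closes an episode.
INACTIVITY_GAP = timedelta(minutes=3)

def split_into_episodes(events, gap=INACTIVITY_GAP):
    """Group (timestamp, app) events into episodes.

    A new episode starts when the time since the previous app
    invocation is at least `gap` (assumed reading of the slide).
    """
    episodes = []
    for ts, app in sorted(events):
        if episodes and ts - episodes[-1][-1][0] < gap:
            episodes[-1].append((ts, app))   # still within the current episode
        else:
            episodes.append([(ts, app)])     # inactivity gap reached: new episode
    return episodes

# Hypothetical log modeled on the slide (Wednesday, 17:08 to 17:14); the date is made up.
log = [
    (datetime(2015, 1, 7, 17, 8), "Messenger"),
    (datetime(2015, 1, 7, 17, 9), "Gmail"),
    (datetime(2015, 1, 7, 17, 11), "Browser"),
    (datetime(2015, 1, 7, 17, 13), "Calendar"),
    (datetime(2015, 1, 7, 17, 30), "Music"),
]
episodes = split_into_episodes(log)
first_episode_apps = [app for _, app in episodes[0]]
```

In this example the first four invocations form one episode and the later Music invocation starts a second one.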
![Page 15: An Intelligent Assistant for High-Level Task Understanding](https://reader034.fdocuments.in/reader034/viewer/2022050614/587234e61a28ab102f8b490d/html5/thumbnails/15.jpg)
Data Collection 2 – Wizard-of-Oz
Find me an Indian place near CMU.
Yuva India is nearby.
Monday
10:08 – 10:15
Home
Yelp Maps Messenger
Schedule a lunch with David.
Music
![Page 16: An Intelligent Assistant for High-Level Task Understanding](https://reader034.fdocuments.in/reader034/viewer/2022050614/587234e61a28ab102f8b490d/html5/thumbnails/16.jpg)
Data Collection 2 – Wizard-of-Oz
When is the next bus to school?
In 10 min, 61C.
Monday
10:08 – 10:15
Home
Yelp Maps Messenger
Schedule a lunch with David.
Music
![Page 17: An Intelligent Assistant for High-Level Task Understanding](https://reader034.fdocuments.in/reader034/viewer/2022050614/587234e61a28ab102f8b490d/html5/thumbnails/17.jpg)
Data Collection 2 – Wizard-of-Oz
Tell David to meet me there in 15 min.
Message sent.
Monday
10:08 – 10:15
Home
Yelp Maps Messenger
Schedule a lunch with David.
Music
![Page 18: An Intelligent Assistant for High-Level Task Understanding](https://reader034.fdocuments.in/reader034/viewer/2022050614/587234e61a28ab102f8b490d/html5/thumbnails/18.jpg)
Corpus
533 real-life multi-domain interactions from 14 real users
12 native English speakers, 2 non-native
4 males & 10 females
Mean age: 31
Total # unique apps: 130 (Mean = 19/user)
| Resources | Examples | Usage |
| --- | --- | --- |
| App sequences | Yelp -> Maps -> Messenger | structure/arrangement |
| Task descriptions | “Schedule a lunch with David” | nature of the intention, language reference |
| User utterances | “Find me an Indian place near CMU.” | language reference |
| Metadata | Monday, 10:08 – 10:15, Home | contexts of the tasks |
![Page 19: An Intelligent Assistant for High-Level Task Understanding](https://reader034.fdocuments.in/reader034/viewer/2022050614/587234e61a28ab102f8b490d/html5/thumbnails/19.jpg)
Intention Realization Model
• [Yelp, Maps, Uber]
• November, weekday, afternoon, office
• “Try to arrange evening out”
• [United Airlines, AirBnB, Calendar]
• September, weekend, morning, home
• “Planning a trip to California”
• [TripAdvisor, United Airlines]
• July, weekend, morning, home
• “I was planning a trip to Oregon”
“Plan a weekend in Virginia”
• […, …, …]
• …
• “Shared picture to Alexis”
Infer:
1) Supportive Domains: United Airlines, TripAdvisor, AirBnB
2) Summarization: “plan trip”
![Page 20: An Intelligent Assistant for High-Level Task Understanding](https://reader034.fdocuments.in/reader034/viewer/2022050614/587234e61a28ab102f8b490d/html5/thumbnails/20.jpg)
Find similar past experience
Cluster-based:
K-means clustering on user generated language
Neighbor-based:
KNN
[Figure: panels labeled Cluster-based and Neighbor-based]
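A neighbor-based lookup can be sketched as below. The deck does not say which features or distance the KNN step uses, so this sketch assumes bag-of-words vectors over the task descriptions and cosine similarity; `bow`, `cosine`, and `nearest_experiences` are hypothetical names. The past experiences and the query are the ones shown on the Intention Realization Model slide.

```python
import math
from collections import Counter

def bow(text):
    """Lowercased bag-of-words counts (assumed representation)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two Counter vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def nearest_experiences(query, experiences, k=2):
    """Return the k past experiences whose descriptions are most similar to the query."""
    q = bow(query)
    ranked = sorted(experiences, key=lambda e: cosine(q, bow(e["description"])), reverse=True)
    return ranked[:k]

past = [
    {"description": "Try to arrange evening out", "apps": ["Yelp", "Maps", "Uber"]},
    {"description": "Planning a trip to California", "apps": ["United Airlines", "AirBnB", "Calendar"]},
    {"description": "I was planning a trip to Oregon", "apps": ["TripAdvisor", "United Airlines"]},
]
top = nearest_experiences("Plan a weekend in Virginia", past, k=2)
top_descriptions = [e["description"] for e in top]
```

With plain word overlap the two trip-planning experiences rank first, but only through very thin lexical overlap ("plan" and "planning" do not match), which is the kind of language mismatch the deck addresses later.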
![Page 21: An Intelligent Assistant for High-Level Task Understanding](https://reader034.fdocuments.in/reader034/viewer/2022050614/587234e61a28ab102f8b490d/html5/thumbnails/21.jpg)
Realize domains from past experience
Representative Sequence (ROVER)
Multi-label Classification
[Figure: app sequences of similar experience, built from Yelp, OpenTable, Maps, and Messenger]
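One simple way to turn the app sequences of similar experiences into a set of suggested domains is a frequency vote. The sketch below is a simplified stand-in, not ROVER and not the multi-label classifier named on the slide. The function name, the 50% threshold, and the grouping of apps into sequences are all assumptions, since the slide's exact sequences are not recoverable from the transcript.

```python
from collections import Counter

def suggest_apps(neighbor_sequences, min_share=0.5):
    """Return apps that occur in at least `min_share` of the neighbor sequences."""
    # Count each app once per sequence so repeated use within a sequence does not dominate.
    counts = Counter(app for seq in neighbor_sequences for app in set(seq))
    threshold = min_share * len(neighbor_sequences)
    return sorted(app for app, c in counts.items() if c >= threshold)

# Hypothetical sequences using the apps named on the slide.
neighbors = [
    ["Yelp", "Maps", "Messenger"],
    ["Yelp", "Maps"],
    ["OpenTable", "Maps"],
    ["Yelp", "Maps"],
]
suggested = suggest_apps(neighbors)
```

Here Maps appears in all four sequences and Yelp in three, so both pass the threshold, while OpenTable and Messenger do not.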
![Page 22: An Intelligent Assistant for High-Level Task Understanding](https://reader034.fdocuments.in/reader034/viewer/2022050614/587234e61a28ab102f8b490d/html5/thumbnails/22.jpg)
Some Obstacles to Remove
Language-mismatch
Solution: Query Enrichment (QryEnr)
[“shoot”, “photo”] -> [“shoot”, “take”, “photo”, “picture”]
word2vec, GoogleNews model
App-mismatch
Solution: App Similarity (AppSim)
Functionality space (derived from app descriptions) to identify apps
Data-driven: doc2vec on app store texts
Rule-based: app package name
Knowledge-driven: Google Play similar app suggestions
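Query enrichment can be sketched as adding near neighbors of each query word in an embedding space. The slide names word2vec with the GoogleNews model; to stay runnable without that model, the sketch below uses tiny made-up 2-d vectors, so `TOY_VECTORS`, the `enrich` function, and the 0.95 threshold are all illustrative assumptions.

```python
import math

# Made-up 2-d vectors standing in for real word2vec embeddings (assumption).
TOY_VECTORS = {
    "shoot": (0.9, 0.1),
    "take": (0.85, 0.2),
    "photo": (0.1, 0.9),
    "picture": (0.15, 0.95),
    "bus": (-0.7, -0.6),
}

def cosine(u, v):
    """Cosine similarity between two 2-d vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def enrich(query, vectors=TOY_VECTORS, threshold=0.95):
    """Append to each query word the vocabulary words whose similarity is >= threshold."""
    enriched = []
    for word in query:
        enriched.append(word)
        if word not in vectors:
            continue  # out-of-vocabulary words are kept but not expanded
        for cand, vec in vectors.items():
            if cand == word or cand in query or cand in enriched:
                continue
            if cosine(vectors[word], vec) >= threshold:
                enriched.append(cand)
    return enriched

enriched_query = enrich(["shoot", "photo"])
```

With these toy vectors the result reproduces the slide's example: "take" is added next to "shoot" and "picture" next to "photo".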
![Page 23: An Intelligent Assistant for High-Level Task Understanding](https://reader034.fdocuments.in/reader034/viewer/2022050614/587234e61a28ab102f8b490d/html5/thumbnails/23.jpg)
Gap between Generic and Personalized Models
QryEnr, AppSim, and QryEnr+AppSim reduce the F1 gap
[Chart: F1, axis from 0 to 35]
![Page 24: An Intelligent Assistant for High-Level Task Understanding](https://reader034.fdocuments.in/reader034/viewer/2022050614/587234e61a28ab102f8b490d/html5/thumbnails/24.jpg)
Compare different AppSim
[Chart: Precision, Recall, and F1 (axis from 0 to 70) for Baseline, Data, Knowledge, Rule, and Combine]
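For reference, precision, recall, and F1 over app sets can be computed as in the sketch below. The deck does not state how its scores were computed, so this assumes a set-based comparison between the apps a model predicts for a task and the apps the user actually used. The function name and the example app sets are hypothetical.

```python
def precision_recall_f1(predicted, actual):
    """Set-based precision, recall, and F1 between predicted and actual apps."""
    predicted, actual = set(predicted), set(actual)
    hits = len(predicted & actual)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(actual) if actual else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1

# Hypothetical prediction vs. the apps actually used in a task.
p, r, f1 = precision_recall_f1(["Yelp", "Maps", "Uber"], ["Yelp", "Maps", "Messenger"])
```

In this example two of the three predicted apps are correct and two of the three actual apps are recovered, so all three scores equal 2/3.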
![Page 25: An Intelligent Assistant for High-Level Task Understanding](https://reader034.fdocuments.in/reader034/viewer/2022050614/587234e61a28ab102f8b490d/html5/thumbnails/25.jpg)
Compare different AppSim
Combining the three approaches performs best
Knowledge-driven and data-driven have low coverage among (manufacturer) apps
Rule-based is better than the other two individual approaches
![Page 26: An Intelligent Assistant for High-Level Task Understanding](https://reader034.fdocuments.in/reader034/viewer/2022050614/587234e61a28ab102f8b490d/html5/thumbnails/26.jpg)
Learning to talk at the task level
Techniques:
(Extractive/abstractive) summarization
Key phrase extraction [RAKE]
User study:
Key phrase extraction + user generated language
Ranked list of key phrases + user’s binary judgment
1. solutions online
2. project file
3. Google drive
4. math problems
5. physics homework
6. answers online
…
[descriptions]
Looking up math problems.
Now open a browser.
Go to slader.com.
Doing physics homework.
…
[utterances]
Go to my Google drive.
Look up kinematic equations.
Now open my calculator so I can plug in numbers.
…
![Page 27: An Intelligent Assistant for High-Level Task Understanding](https://reader034.fdocuments.in/reader034/viewer/2022050614/587234e61a28ab102f8b490d/html5/thumbnails/27.jpg)
Learning to talk at the task level
Metrics:
Mean Reciprocal Rank (MRR)
Result:
MRR ~0.6
An understandable verbal reference shows up in the top 2 of the ranked list
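Mean Reciprocal Rank over the user study's binary judgments can be sketched as follows. It assumes each ranked list is represented as a list of booleans (True where the user accepted the key phrase) and that lists with no accepted phrase contribute zero. The function name and the judgments in the example are made up.

```python
def mean_reciprocal_rank(judgments):
    """Average of 1/rank of the first accepted phrase in each ranked list."""
    if not judgments:
        return 0.0
    total = 0.0
    for ranked in judgments:
        for rank, accepted in enumerate(ranked, start=1):
            if accepted:
                total += 1.0 / rank
                break  # only the first accepted phrase counts
    return total / len(judgments)

# Made-up judgments for three ranked lists: first hit at rank 1, 2, and 3.
mrr = mean_reciprocal_rank([
    [True, False],
    [False, True, False],
    [False, False, True],
])
```

The example averages 1, 1/2, and 1/3, giving an MRR of about 0.61. A value near 0.6 is consistent with the first accepted phrase typically appearing at rank 1 or 2.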
![Page 28: An Intelligent Assistant for High-Level Task Understanding](https://reader034.fdocuments.in/reader034/viewer/2022050614/587234e61a28ab102f8b490d/html5/thumbnails/28.jpg)
Summary
Collected real-life cross-domain interactions from real users
HELPR: a framework to learn assistance at the task level
Suggest a set of supportive domains to accomplish the task
Personalized model > Generic model
The gap can be reduced by QryEnr + AppSim
Generate language reference to communicate verbally at task level
![Page 29: An Intelligent Assistant for High-Level Task Understanding](https://reader034.fdocuments.in/reader034/viewer/2022050614/587234e61a28ab102f8b490d/html5/thumbnails/29.jpg)
HELPR demo
Interface
HELPR display
Google ASR
Android TTS
HELPR server
User models
![Page 30: An Intelligent Assistant for High-Level Task Understanding](https://reader034.fdocuments.in/reader034/viewer/2022050614/587234e61a28ab102f8b490d/html5/thumbnails/30.jpg)
Thank you
Questions?