Speech & NLP (Fall 2014): Kuipers' Spatial Semantic Hierarchy (SSH)
Speech & NLP
Kuipers’ Spatial Semantic Hierarchy
www.vkedco.blogspot.com
Vladimir Kulyukin
Department of Computer Science
Utah State University
Outline
Spatial Semantic Hierarchy (SSH)
SSH Levels
Deduction, Induction, Abduction
Spatial Semantic Hierarchy
Background
The Spatial Semantic Hierarchy (SSH) was
developed by Benjamin Kuipers and his
students
SSH is a model of knowledge of
environments that consists of multiple
interacting representations
Some of these representations are
quantitative and others are qualitative
Types of Spatial Knowledge
Large-scale space – space that exceeds the
agent’s sensory horizon
Visual space – immediate environment that the
agent can explore by gaze
Graphical space – spatial layouts and relations
among symbols expressed graphically (e.g., on
paper or tablet)
The term cognitive map refers to human
knowledge of large-scale space
Why Study Spatial Knowledge?
Spatial knowledge is a fundamental type of
commonsense knowledge
We use spatial knowledge daily to navigate
We use spatial knowledge to represent concepts
graphically
We use spatial knowledge to organize virtual
worlds (e.g., the concept of computer desktop is
a virtual metaphor of a real desktop)
SSH Levels
SSH consists of five levels: 1) sensory; 2)
control; 3) causal; 4) topological; and 5)
metrical
Kuipers organizes these levels in a lattice
of nodes where each node corresponds to a
representation with its own ontology (aka
conceptualization in the terminology of this
course), axioms, and inference rules
Relations among SSH Levels
Level X is dependent on level Y when X
“presupposes, or is defined in terms of, or
is inferred from, knowledge in the
representation at” Y
Level X receives information from level Y
when knowledge stored and/or computed at
Y is accessible to X
Organization of SSH Lattice Nodes
SSH lattice nodes are organized along two
dimensions: qualitative vs. quantitative
(horizontal) and ontological (vertical)
The horizontal dimension indicates that spatial
knowledge can be either qualitative or
quantitative
The vertical dimension organizes nodes in terms
of ontological dependencies
Sensory Level
Sensory level is the interface to the agent’s
sensorimotor system
Sensors can be continuous (camera, laser,
sonar, etc.) or discrete (digital compass,
odometer, RFID reader, Wi-Fi receiver, etc.)
The continuous vs. discrete distinction is
somewhat arbitrary because a continuous
sensor’s output can be made discrete
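A minimal sketch of how a continuous reading can be made discrete by binning. The function name discretize, the 0.5 m bin width, and the sonar readings are hypothetical and are not taken from the SSH article.

```python
def discretize(reading: float, bin_width: float) -> int:
    """Map a continuous sensor reading to the index of a fixed-width bin."""
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    return int(reading // bin_width)

# Hypothetical sonar ranges in meters, binned at 0.5 m resolution.
sonar_ranges = [0.12, 0.49, 0.51, 1.75]
bins = [discretize(r, 0.5) for r in sonar_ranges]
```

Readings that fall into the same bin become indistinguishable, which is the price paid for a discrete representation.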
Control Level
Control level is a set of control laws that
bind the agent and its environment in a
dynamic system within some uniform
segment of that environment
Each control law has conditions for its
appropriateness and termination
There are two broad classes of control
laws: hill-climbing & trajectory following
Causal Level
Causal level discretizes the continuous
world and the agent’s actions in terms of
sensory views, actions, and relations among
views and actions
This is similar to the STRIPS action
semantics of the PDDL operators that we
have investigated before
Plans constructed at the causal level are
executed on the control level
Topological Level
Topological level describes the world in terms of
places, paths, regions, and their connectivity, order,
and containment
SSH makes a claim that a topological network map is
more effective for planning than the flat causal model
SSH makes another claim that “the ability to plan and
act is not dependent on the availability of quantitative
spatial knowledge”
The latter claim is very interesting, because, if it is
true, planning can be done without direct contact with
the world
Metrical Level
Metrical level is a global geometric map of the
environment with a single frame of reference
Kuipers appears to suggest that quantitative
geometric information is also present at each SSH
level: local analog maps at control level; action
magnitudes at causal level; headings and
distances at topological level
Smaller local frames can be linked into a global
frame of reference
Closer Look at SSH Levels
Control Level
Uniform Segments
Environment is divided into uniform segments each
of which has its own control law (e.g., follow middle
line, follow right wall, follow left wall, etc.)
The agent is assumed to use only sensory input to
execute control laws
The agent receives a continuous time series of
sensory values and outputs a continuous time series of
motor signals
A control law is a relation between sensory inputs and
motor outputs
Control Laws as Differential Equations
The agent, the environment, and a given
control law are described as a dynamic system
This system can be modeled by a system of
differential equations
The system’s behavior is described by a
solution to that system
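One way to write such a system down, as a sketch only: let x be the state of the agent in its environment, s the sensory input, u the motor output, and H_i the control law currently in force. The symbols F (environment dynamics) and G (sensor model) are placeholder names chosen for this illustration.

```latex
\begin{align}
\dot{x} &= F(x, u) \\
s &= G(x) \\
u &= H_i(s)
\end{align}
```

Substituting the last two equations into the first gives the closed-loop system $\dot{x} = F(x, H_i(G(x)))$, whose solution describes the agent's behavior while H_i is active.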
Hill-Climbing vs. Trajectory Following
A hill-climbing control law brings the agent into
a locally distinctive state
A hill-climbing control law terminates when a
distinctiveness measure (e.g., distance) reaches
a local maximum
A trajectory-following control law brings the
agent from one distinctive state to the
neighborhood of the next
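The hill-climbing idea can be sketched in one dimension. This is an illustration under strong assumptions, not Kuipers' control law: the agent's state is a single number, the step size is fixed, and the distinctiveness measure (here a hypothetical function that peaks at position 2.0) can be evaluated at neighboring states.

```python
from typing import Callable

def hill_climb(x: float, measure: Callable[[float], float],
               step: float = 0.1, max_iters: int = 1000) -> float:
    """Step along a 1D corridor until the distinctiveness measure stops improving."""
    for _ in range(max_iters):
        # The current state is listed first so that ties favor staying put.
        best = max((x, x - step, x + step), key=measure)
        if best == x:  # neither neighbor improves the measure: local maximum
            return x
        x = best
    return x

def measure(x: float) -> float:
    """Hypothetical distinctiveness measure with a single peak at x = 2.0."""
    return -(x - 2.0) ** 2

state = hill_climb(0.0, measure)
```

A trajectory-following law would then take over from the returned state and carry the agent toward the neighborhood of the next distinctive state.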
Low-Level Details of Control Laws
In Sections 2.1, 2.2, 2.3, 2.4, and 2.5 of his
article “The Spatial Semantic Hierarchy,” Kuipers
discusses low-level details of control laws
These are fascinating and insightful but are
peripheral to the NLP problems related to the
extraction of spatial knowledge from NL texts
Guarantees of Control Level
Guarantees of control level are more
interesting to us, because they specify what we
can assume about the agent’s physical abilities
There are two broad guarantees:
1) After a hill-climbing law terminates at a
distinctive state, at least one trajectory
following law is applicable (no dead ends)
2) After a trajectory-following law terminates
at least one hill-climbing law is applicable
Causal Level
Action Abstraction Schema
A sequence of control laws can be abstracted into an
action that starts at a given sensory view V and ends
at another sensory view V’
This abstraction is called the schema <V, A, V’>
Causal level, unlike control level, consists of
discrete states
The agent performs a sequence of discrete actions
that result in state transitions
View
A view is a description of the sensory input
vector s(t) = [s1(t), …, sn(t)]
My guess is that this definition is deliberately
vague to give the knowledge engineer a lot of
elbow space to play with various representations
For example, a description can specify a Wi-Fi
cluster or the color histogram of an image taken
by the robot’s camera
Actions, Schemas, Routines
An action is a sequence of one or more control
laws
An action is initiated at a locally distinctive
state specified by one view description and
terminates at another locally distinctive state
specified by another view description
Actions are specified by schemas <V, A, V’>
A routine is a sequence of schemas indexed by
initial view
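The schema and routine definitions above can be sketched as a small data structure. The class name Schema, the helper names, and the view labels are hypothetical; views are reduced to plain strings and actions to labels.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Schema:
    view: str      # V: view description where the action starts
    action: str    # A: label for a sequence of control laws
    result: str    # V': view description where the action ends

def make_routine(schemas):
    """Index schemas by their initial view."""
    return {s.view: s for s in schemas}

def follow(routine, start_view):
    """Chain schemas from start_view; return the actions taken and the final view."""
    view, actions, visited = start_view, [], set()
    while view in routine and view not in visited:
        visited.add(view)
        schema = routine[view]
        actions.append(schema.action)
        view = schema.result
    return actions, view

routine = make_routine([
    Schema("hall-door", "travel", "hall-end"),
    Schema("hall-end", "turn", "elevator-door"),
])
actions, final_view = follow(routine, "hall-door")
```

Indexing by initial view means the agent only needs to recognize where it is in order to find the next action.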
Declarative & Procedural Schema Interpretations
The first interpretation (declarative) means that a view V is
observed in situation s0 and the view V’ holds in the result of
executing action A in situation s0:
$holds(V, s_0) \wedge holds(V', result(A, s_0))$
The second interpretation (procedural) means that if a
view V is observed, then execute action A right away (now):
$holds(V, now) \Rightarrow do(A, now)$
Turns & Travels
At the causal level, all actions are classified into two categories: turn and
travel
This categorization may be too restrictive: SSH article states that one can
construct environments for which these categories break down
A claim is made, however, that these actions are sufficient for office spaces and
street networks
A turn is an action that leaves the agent in the same place; a travel is an action
that takes the agent from one place to another
$Travel(\delta, \Delta\Theta)$, where $\delta$ and $\Delta\Theta$ are the distance and orientation change, respectively
$Turn(\alpha)$, where $\alpha$ is a rotation angle
Routines
A routine is a sequence of schemas
A routine is indexed by its initial view, i.e., the view of
the 1st schema
A routine represents a behavior that moves the agent from
one distinctive state to another distinctive state
Figure 8 in the article “The Spatial Semantic Hierarchy”
seems to imply that distinctive states are described by
views
A routine can be viewed as a description of a behavior or
as a procedure for executing that behavior in the world
Complete & Partial Schemas in Routines
A schema of the form <V, A, V’> is complete
A schema of the form <V, A, nil> is partial
If a routine is defined in terms of complete
schemas, the agent can both execute and
describe it
If a routine is defined in terms of partial
schemas, the agent can only execute it in the
world but not describe it
Complete & Adequate Routines
Suppose $V_0, A_0, V_1, A_1, V_2, A_2, \ldots, V_{n-1}, A_{n-1}, V_n$ is an
alternating sequence of views and actions
A routine R is complete from $V_0$ to $V_n$ if R has
a complete schema $\langle V_i, A_i, V_{i+1} \rangle$ for each
$0 \le i < n$
A routine R is adequate from $V_0$ to $V_n$ if it
contains either a complete schema
$\langle V_i, A_i, V_{i+1} \rangle$ or a partial schema
$\langle V_i, A_i, nil \rangle$ for each $0 \le i < n$
Adequate routines support situated action, i.e., behavior
that can be executed only when the agent is placed
(“situated”) in the world
Complete routines support both situated action and
cognitive manipulation, i.e., the agent can both execute
and describe the behavior
Example: if the agent can only navigate a route, the
navigation routine is adequate; if the agent can both
navigate and describe it, the navigation routine is complete
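The two definitions can be checked mechanically. The sketch below assumes a routine is a list of (V, A, V’) tuples, with None standing in for nil in a partial schema; the function names and the sample views are hypothetical.

```python
def covers(routine, views, actions, require_result):
    """Check that every step (views[i], actions[i]) has a matching schema.

    With require_result=True only complete schemas count; otherwise a
    partial schema (result None) is accepted as well.
    """
    for i, action in enumerate(actions):
        matched = any(
            v == views[i] and a == action and
            (r == views[i + 1] or (r is None and not require_result))
            for (v, a, r) in routine
        )
        if not matched:
            return False
    return True

def is_complete(routine, views, actions):
    return covers(routine, views, actions, require_result=True)

def is_adequate(routine, views, actions):
    return covers(routine, views, actions, require_result=False)

views = ["V0", "V1", "V2"]
actions = ["A0", "A1"]
described = [("V0", "A0", "V1"), ("V1", "A1", "V2")]
executable_only = [("V0", "A0", "V1"), ("V1", "A1", None)]
```

Here described is both complete and adequate, while executable_only is adequate but not complete: the agent can carry out the second step but cannot say which view it leads to.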
Deduction, Induction, & Abduction
Deduction
Given the implication A → B and the truth of A, one
can infer B
Example:
- Implication: if the agent’s Wi-Fi classifier
classifies the input signal at cluster C, the agent
is at location L
- Truth of antecedent: the agent’s Wi-Fi classifier
classifies the input signal at cluster C
- Inference: the agent is at location L
Induction
Infer the consequent B from the antecedent A if B
has so far always followed A
Example: you observe a swan and note that its
color is white; you observe another swan and note
that its color is white; you observe another n swans
and they are all white; you conclude that if a bird is
a swan (A), then its color is white (B)
Inductive inferences are always congruent with
the agent’s experience but are not always true: there
are, in fact, black swans
Abduction
Infer A as a possible antecedent of the observed
consequent B
In the literature on abduction, A is sometimes
referred to as a possible explanation of B
Example: you observe that your neighbor’s lot is
wet on a dry day (B) and infer that your neighbor has
watered the lawn (A); chances are your inference is
true but there is a chance that it is wrong: there
may, for example, be a leaking sprinkler
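The three inference modes can be contrasted on a toy rule base. This is a deliberately simplified sketch: rules are single antecedent-consequent pairs, and all names (rules, deduce, induce, abduce, and the fact labels) are hypothetical.

```python
# Each rule is (antecedent A, consequent B), read as A -> B.
rules = [
    ("cluster_C", "location_L"),
    ("watered_lawn", "wet_lawn"),
    ("leaking_sprinkler", "wet_lawn"),
]

def deduce(fact):
    """Deduction: from A and A -> B, infer every such B."""
    return sorted(b for a, b in rules if a == fact)

def abduce(observation):
    """Abduction: from B, propose every A with A -> B as a possible explanation."""
    return sorted(a for a, b in rules if b == observation)

def induce(observations):
    """Induction: propose A -> B when B has so far always followed A."""
    outcomes = {}
    for a, b in observations:
        outcomes.setdefault(a, set()).add(b)
    return {a: next(iter(bs)) for a, bs in outcomes.items() if len(bs) == 1}

white_only = [("swan", "white")] * 3
with_black = white_only + [("swan", "black")]
```

Deduction returns conclusions that must hold, abduction returns candidates that still need to be checked, and the induced rule disappears as soon as a counterexample is observed.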
Topological Level
Elements of Topological Level
A place is a zero-dimensional part of the environment
A path is a one-dimensional subspace (e.g., a street in a
city)
There are two directions along a path: dir = +1 and dir = -1
These directions may be loosely interpreted as forward
and backward
A travel action moves the agent from one place on a path
to another place on a path
A turn action keeps the agent in the same place
A region is a 2D subset of the environment
A region may be abstracted into a place
Regions, Boundaries, Abstractions
Regions are sets of places
Places are grouped into regions because
1) they are located on one side of a specific
boundary;
2) they share a 2D metrical frame;
3) they are abstracted to the same place in a
higher-level topological map
Topological Abduction
Given a sequence of views and actions, the
agent infers places, paths, and regions by
abduction
The agent postulates a minimal set of places,
paths, and regions consistent with the views and
actions
Abduced elements may or may not be sufficient
to explain the sequence of observed views and
actions
Topological Relations
$at(view, place)$: $view$ is seen at $place$
$along(view, path, dir)$: $view$ is seen along $path$ in direction $dir$
$on(place, path)$: $place$ is on $path$
$order(path, place_1, place_2, dir)$: the order on $path$ from $place_1$ to $place_2$ is $dir$
$rightOf(path, dir, region)$: $path$, facing direction $dir$, has $region$ on its right
$leftOf(path, dir, region)$: $path$, facing direction $dir$, has $region$ on its left
$in(place, region)$: $place$ is in $region$
Topological Axioms
$order(path, A, B, dir) \rightarrow on(A, path) \wedge on(B, path)$
$\neg order(path, A, A, dir)$
$order(path, A, B, +1) \leftrightarrow order(path, B, A, -1)$
$order(path, A, B, dir) \wedge order(path, B, C, dir) \rightarrow order(path, A, C, dir)$
$\exists \alpha\ [along(V, path, dir) \wedge \langle V, turn(\alpha), V' \rangle \wedge along(V', path, -dir)]$
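The order axioms can be exercised on a small set of facts. The sketch below closes a set of order facts under the reversal and transitivity axioms and then checks the irreflexivity axiom; the tuple encoding (path, place1, place2, dir) and the function names are assumptions made for this illustration.

```python
def close_order(facts):
    """Close a set of (path, a, b, dir) facts under reversal and transitivity."""
    closed = set(facts)
    while True:
        new = set()
        for (p, a, b, d) in closed:
            new.add((p, b, a, -d))                      # reversal axiom
            for (p2, b2, c, d2) in closed:
                if p2 == p and b2 == b and d2 == d:
                    new.add((p, a, c, d))               # transitivity axiom
        if new <= closed:                               # nothing new: fixed point
            return closed
        closed |= new

def is_consistent(closed):
    """Irreflexivity axiom: no place may precede itself on a path."""
    return all(a != b for (_, a, b, _) in closed)

street = close_order({("main", "A", "B", 1), ("main", "B", "C", 1)})
loop = close_order({("ring", "A", "B", 1), ("ring", "B", "A", 1)})
```

The second fact set describes a path that comes back to its starting place in the same direction, which these axioms reject.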
Abducing Places & Paths from Views & Actions
Every view is observed at some place:
$\forall view\ \exists place\ at(view, place)$
If the agent turns, it does not change its place:
$\langle V, turn(\alpha), V' \rangle \rightarrow \exists place\ [at(V, place) \wedge at(V', place)]$
If the agent travels a non-zero distance, then the first and second views are observed at two distinct places:
$\langle V, travel(\delta), V' \rangle \wedge \delta \neq 0 \rightarrow \exists place_1, place_2\ [place_1 \neq place_2 \wedge at(V, place_1) \wedge at(V', place_2)]$
If the agent travels, then there are a path and a direction such that the 1st view V is seen along that path in that direction and the 2nd view V’ is seen along that path in the same direction:
$\langle V, travel(\delta), V' \rangle \rightarrow \exists path, dir\ [along(V, path, dir) \wedge along(V', path, dir)]$
If the agent travels a non-zero distance, then there are a path, a direction, and two places such that the first view is seen at the first place, the second view is seen at the second place, both views are seen along the path, and the two places are ordered along the path in that direction:
$\langle V, travel(\delta), V' \rangle \wedge \delta \neq 0 \rightarrow \exists place_1, place_2, path, dir\ [at(V, place_1) \wedge at(V', place_2) \wedge along(V, path, dir) \wedge along(V', path, dir) \wedge order(path, place_1, place_2, dir)]$
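A heavily simplified sketch of the place-abduction step: views linked by turns are merged into one place, and a travel over a non-zero distance must connect two different places. This is not Kuipers' algorithm. It ignores paths and regions and assumes each view label is unique to one place; the schema encoding and all names are hypothetical.

```python
def abduce_places(schemas):
    """schemas: list of (V, (kind, magnitude), V'). Returns a view -> place mapping."""
    parent = {}

    def find(v):
        # Union-find lookup with path halving.
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    # Turn axiom: the views before and after a turn are at the same place.
    for v, (kind, _), v2 in schemas:
        root, root2 = find(v), find(v2)
        if kind == "turn":
            parent[root2] = root

    # Travel axiom: a non-zero travel connects two distinct places.
    for v, (kind, magnitude), v2 in schemas:
        if kind == "travel" and magnitude != 0 and find(v) == find(v2):
            raise ValueError(f"travel from {v} to {v2} stays in one place")

    names = {}
    return {v: names.setdefault(find(v), f"place{len(names) + 1}") for v in parent}

observed = [
    ("v1", ("turn", 90), "v2"),
    ("v2", ("travel", 5.0), "v3"),
    ("v3", ("turn", -90), "v4"),
]
places = abduce_places(observed)
```

Four views are explained by postulating only two places, which is the minimal set consistent with these observations.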
Upward & Downward Mapping
SSH supports upward and downward mapping
Upward mapping: multiple places at a lower level map to a
single place/region at a higher level
Downward mapping: a single place at a higher level maps to
multiple places at a lower level
An abstraction region is the set of places in a more detailed
map abstracted to a particular place
Example: a corridor can be abstracted to a single place in a
higher-level map
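The corridor example can be sketched with a plain dictionary. The place names and the functions upward and downward are hypothetical; the point is only that the abstraction region is recovered by inverting the upward mapping.

```python
# Upward mapping: each detailed place is abstracted to one higher-level place.
abstraction = {
    "corridor-west": "corridor",
    "corridor-mid": "corridor",
    "corridor-east": "corridor",
    "lab-door": "lab",
}

def upward(place):
    """Return the higher-level place that a detailed place is abstracted to."""
    return abstraction[place]

def downward(abstract_place):
    """Return the abstraction region: all detailed places mapped to abstract_place."""
    return sorted(p for p, a in abstraction.items() if a == abstract_place)
```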
Topological Level Uses
Topological level of representation supports
various problem-solving methods
It can be searched as a graph (DFS, BFS)
Distance measures, when and if they are
available, support A* and Dijkstra
Topological level can support goals and sub-
goals
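A minimal sketch of graph search over a topological map, using breadth-first search. The adjacency-list encoding and the place names are hypothetical; edges stand for path segments between places and carry no distances.

```python
from collections import deque

def bfs_route(graph, start, goal):
    """Return a route with the fewest places from start to goal, or None."""
    frontier = deque([[start]])
    visited = {start}
    while frontier:
        route = frontier.popleft()
        if route[-1] == goal:
            return route
        for neighbor in graph.get(route[-1], []):
            if neighbor not in visited:
                visited.add(neighbor)
                frontier.append(route + [neighbor])
    return None

# Hypothetical office map: places as nodes, direct connections as edges.
office = {
    "lobby": ["hall"],
    "hall": ["lobby", "lab", "office"],
    "lab": ["hall"],
    "office": ["hall"],
}
route = bfs_route(office, "lobby", "lab")
```

When edge distances are available, the same graph can be searched with Dijkstra's algorithm or A* instead.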
Metrical Level
Global Metrical Mapping
An agent may have a global single frame of reference (2D
or 3D)
Many useful states of knowledge cannot be expressed
numerically in terms of real numbers (orientation error)
Storage of a large global frame of reference may present
problems
A global frame of reference can be split into a patchwork of
local frames of reference
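Linking a local frame to a global one amounts to a rigid 2D transform. The sketch below assumes each local frame is described by the pose (x, y, theta) of its origin in the global frame; the function name and the sample numbers are hypothetical.

```python
import math

def to_global(frame_pose, local_point):
    """Convert a point from a local frame to the global frame.

    frame_pose is (x, y, theta): the local origin's position and heading
    (in radians) expressed in the global frame.
    """
    fx, fy, theta = frame_pose
    lx, ly = local_point
    gx = fx + lx * math.cos(theta) - ly * math.sin(theta)
    gy = fy + lx * math.sin(theta) + ly * math.cos(theta)
    return gx, gy

# A local frame whose origin sits at (10, 5) and is rotated 90 degrees.
gx, gy = to_global((10.0, 5.0, math.pi / 2), (1.0, 0.0))
```

Errors in theta grow with distance from the local origin, which is one reason a patchwork of small frames can be easier to maintain than one large frame.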
References & Reading Suggestions
B. Kuipers. (2000). “The Spatial Semantic Hierarchy.”
Artificial Intelligence 119, pp. 191-233.
J. Nicholson, V. Kulyukin, D. Coster. (2009). “ShopTalk:
Independent Blind Shopping Through Verbal Route
Directions and Barcode Scans.” The Open Rehabilitation
Journal, ISSN: 1874-9437 Volume 2, 2009, DOI
10.2174/1874943700902010011.