Speech & NLP (Fall 2014): Kuipers' Spatial Semantic Hierarchy (SSH)

Speech & NLP

Kuipers’ Spatial Semantic Hierarchy

www.vkedco.blogspot.com

Vladimir Kulyukin

Department of Computer Science

Utah State University


Outline

Spatial Semantic Hierarchy (SSH)

SSH Levels

Deduction, Induction, Abduction

Spatial Semantic Hierarchy

Background

Spatial Semantic Hierarchy (SSH) was

discovered and developed by Benjamin

Kuipers and his students

SSH is a model of knowledge of

environments that consists of multiple

interacting representations

These representations are quantitative and

qualitative

Types of Spatial Knowledge

Large-scale space – space that exceeds the

agent’s sensory horizon

Visual space – immediate environment that the

agent can explore by gaze

Graphical space – spatial layouts and relations

among symbols expressed graphically (e.g., on

paper or tablet)

The term cognitive map refers to human

knowledge of large-scale space

Why Study Spatial Knowledge?

Spatial knowledge is a fundamental type of

commonsense knowledge

We use spatial knowledge daily to navigate

We use spatial knowledge to represent concepts

graphically

We use spatial knowledge to organize virtual

worlds (e.g., the concept of computer desktop is

a virtual metaphor of a real desktop)

SSH Levels

SSH Levels

SSH consists of five levels: 1) sensory; 2)

control; 3) causal; 4) topological; and 5)

metrical

Kuipers organizes these levels in a lattice

of nodes where each node corresponds to a

representation with its own ontology (aka

conceptualization in the terminology of this

course), axioms, and inference rules

Relations among SSH Levels

Level X is dependent on level Y when X

“presupposes, or is defined in terms of, or

is inferred from, knowledge in the

representation at” Y

Level X receives information from level Y

when knowledge stored and/or computed at

Y is accessible to X

Organization of SSH Lattice Nodes

SSH lattice nodes are organized along two

dimensions: qualitative vs. quantitative

(horizontal) and ontological (vertical)

The horizontal dimension indicates that spatial

knowledge can be either qualitative or

quantitative

The vertical dimension organizes nodes in terms of

ontological dependencies

Sensory Level

Sensory level is the interface to the agent’s

sensorimotor system

Sensors can be continuous (camera, laser,

sonar, etc.) or discrete (digital compass,

odometer, RFID reader, Wi-Fi receiver, etc.)

The continuous vs. discrete distinction is

arbitrary because a continuous sensor’s

output can be made discrete
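A minimal sketch of that last point, assuming a hypothetical range sensor that reports meters. The bin edges and labels below are invented for illustration and do not come from the SSH article.

```python
from bisect import bisect_right

# Hypothetical bin edges (meters) and labels; chosen only for illustration.
BIN_EDGES = [0.5, 1.0, 2.0]
LABELS = ["very_near", "near", "mid", "far"]


def discretize(reading_m: float) -> str:
    """Map a continuous range reading to one of a few discrete labels."""
    # bisect_right returns how many edges are <= the reading,
    # which is exactly the index of the bin the reading falls into.
    return LABELS[bisect_right(BIN_EDGES, reading_m)]
```

For example, `discretize(1.5)` falls between the edges 1.0 and 2.0 and is labeled `"mid"`.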

Control Level

Control level is a set of control laws that

bind the agent and its environment in a

dynamic system within some uniform

segment of that environment

Each control law has conditions for its

appropriateness and termination

There are two broad classes of control

laws: hill-climbing & trajectory following

Causal Level

Causal level discretizes the continuous

world and the agent’s actions in terms of

sensory views, actions, and relations among

views and actions

This is similar to the STRIPS action

semantics of the PDDL operators that we

have investigated before

Plans constructed at the causal level are

executed on the control level

Topological Level

Topological level describes the world in terms of

places, paths, regions, and their connectivity, order,

and containment

SSH makes a claim that a topological network map is

more effective for planning than the flat causal model

SSH makes another claim that “the ability to plan and

act is not dependent on the availability of quantitative

spatial knowledge”

The latter claim is very interesting, because, if it is

true, planning can be done without direct contact with

the world

Metrical Level

Metrical level is a global geometric map of the

environment with a single frame of reference

Kuipers appears to suggest that quantitative

geometric information is also present at each SSH

level: local analog maps at control level; action

magnitudes at causal level; headings and

distances at topological level

Smaller local frames can be linked into a global

frame of reference

Closer Look at SSH Levels

Control Level

Uniform Segments

Environment is divided into uniform segments each

of which has its own control law (e.g., follow middle

line, follow right wall, follow left wall, etc.)

The agent is assumed to use only sensory input to

execute control laws

The agent receives a continuous time series of

sensory values and outputs a continuous time series of

motor signals

A control law is a relation between sensory inputs and

motor outputs

Control Laws as Differential Equations

The agent, the environment, and a given

control law are described as a dynamic system

This system can be modeled by a system of

differential equations

The system’s behavior is described by a

solution to that system
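A toy illustration of this view, not taken from the article: assume the agent follows a corridor midline, let e be its lateral offset from the midline, and assume a proportional control law that yields the closed-loop equation de/dt = -k e. The function name, gain k, and step size are all hypothetical. The sketch integrates the equation with Euler steps, so the returned list approximates the solution e(t).

```python
from typing import List


def simulate_midline_following(e0: float, k: float = 1.0,
                               dt: float = 0.01, steps: int = 1000) -> List[float]:
    """Euler-integrate de/dt = -k * e, a toy closed-loop model of midline following."""
    e = e0
    trajectory = [e]
    for _ in range(steps):
        # One Euler step: the control law pushes the offset back toward zero.
        e += dt * (-k * e)
        trajectory.append(e)
    return trajectory
```

With the default parameters the offset shrinks at every step, which is the behavior one would expect from a stable control law inside a uniform segment.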

Hill-Climbing vs. Trajectory Following

A hill-climbing control law brings the agent into

a locally distinctive state

A hill-climbing control law terminates when a

distinctiveness measure (e.g., distance) reaches

a local maximum

A trajectory-following control law brings the

agent from one distinctive state to the

neighborhood of the next
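A minimal sketch of the hill-climbing termination condition, under strong simplifying assumptions: positions are integers along a one-dimensional corridor and the distinctiveness measure is supplied as a function. Both are hypothetical stand-ins for the continuous control laws in the article.

```python
from typing import Callable


def hill_climb(position: int, distinctiveness: Callable[[int], float],
               max_steps: int = 1000) -> int:
    """Step to the better neighbor until no neighbor improves the measure."""
    for _ in range(max_steps):
        best = max((position - 1, position + 1), key=distinctiveness)
        if distinctiveness(best) <= distinctiveness(position):
            return position  # local maximum reached: the law terminates here
        position = best
    return position
```

A trajectory-following law would then take over from the returned position and carry the agent toward the neighborhood of the next distinctive state.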

Low-Level Details of Control Laws

In Sections 2.1, 2.2, 2.3, 2.4, and 2.5 of his

article “The Spatial Semantic Hierarchy,” Kuipers

discusses low-level details of control laws

These are fascinating and insightful but are

peripheral to the NLP problems related to the

extraction of spatial knowledge from NL texts

Guarantees of Control Level

Guarantees of control level are more

interesting to us, because they specify what we

can assume about the agent’s physical abilities

There are two broad guarantees:

1) After a hill-climbing law terminates at a

distinctive state, at least one trajectory

following law is applicable (no dead ends)

2) After a trajectory-following law terminates

at least one hill-climbing law is applicable

Causal Level

Action Abstraction Schema

A sequence of control laws can be abstracted into an

action that starts at a given sensory view V and ends

at another sensory view V’

This abstraction is called the schema <V, A, V’>

Causal level, unlike control level, consists of

discrete states

The agent performs a sequence of discrete actions

that result in state transitions

View

A view is a description of the sensory input

vector s(t) = [s1(t), …, sn(t)]

My guess is that this definition is deliberately

vague to give the knowledge engineer a lot of

elbow space to play with various representations

For example, a description can specify a Wi-Fi

cluster or the color histogram of an image taken

by the robot’s camera

Actions, Schemas, Routines

An action is a sequence of one or more control

laws

An action is initiated at a locally distinctive

state specified by one view description and

terminates at another locally distinctive state

specified by another view description

Actions are specified by schemas <V, A, V’>

A routine is a sequence of schemas indexed by

initial view

Declarative & Procedural Schema Interpretations

The first interpretation (declarative) means that a view V is

observed in situation s0 and the view V’ holds in the result of

executing action A in situation s0

The second interpretation (procedural) means that if a

view V is observed, then execute action A right away (now)

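Declarative: $\mathit{holds}(V, s_0) \wedge \mathit{holds}(V', \mathit{result}(A, s_0))$

Procedural: $\mathit{holds}(V, \mathit{now}) \Rightarrow \mathit{do}(A, \mathit{now})$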

Turns & Travels

At the causal level, all actions are classified into two categories: turn and

travel

This categorization may be too restrictive: SSH article states that one can

construct environments for which these categories break down

A claim is made, however, that these actions are sufficient for office spaces and

street networks

A turn is an action that leaves the agent in the same place; a travel is an action

that takes the agent from one place to another

$\mathit{Travel}(\delta, \Delta\Theta)$, where $\delta$ and $\Delta\Theta$ are distance and orientation change, respectively

$\mathit{Turn}(\alpha)$, where $\alpha$ is a rotation angle

Routines

A routine is a sequence of schemas

A routine is indexed by its initial view, i.e., the view of

the 1st schema

A routine represents a behavior that moves the agent from

one distinctive state to another distinctive state

Figure 8 in the article “The Spatial Semantic Hierarchy”

seems to imply that distinctive states are described by

views

A routine can be viewed as a description of a behavior or

as a procedure for executing that behavior in the world
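One possible way to hold schemas and routines in code, offered as a sketch rather than as the representation used in the article. The names `Schema`, `index_routine`, and `next_action` are hypothetical, views and actions are plain strings, and keying each schema by its own initial view is an assumption of this sketch. `next_action` follows the procedural reading: if view V holds now, do action A now.

```python
from typing import Dict, List, NamedTuple, Optional


class Schema(NamedTuple):
    view: str               # V: view description where the action starts
    action: str             # A: e.g. "travel(5)" or "turn(90)"
    result: Optional[str]   # V': resulting view, or None for a partial schema


def index_routine(schemas: List[Schema]) -> Dict[str, Schema]:
    """Index the schemas of a routine by their initial view."""
    return {s.view: s for s in schemas}


def next_action(routine: Dict[str, Schema], current_view: str) -> Optional[str]:
    """Procedural reading: if the current view matches some V, return its action A."""
    schema = routine.get(current_view)
    return schema.action if schema is not None else None
```

Reading the same list of schemas front to back, without executing anything, corresponds to using the routine as a description of the behavior.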

Complete & Partial Schemas in Routines

A schema of the form <V, A, V’> is complete

A schema of the form <V, A, nil> is partial

If a routine is defined in terms of complete

schemas, the agent can both execute and

describe it

If a routine is defined in terms of partial

schemas, the agent can only execute it in the

world but not describe it

Complete & Adequate Routines

Suppose $V_0, A_0, V_1, A_1, V_2, A_2, \ldots, V_{n-1}, A_{n-1}, V_n$ is an

alternating sequence of views and actions

A routine R is complete from $V_0$ to $V_n$ if R has

a complete schema $\langle V_i, A_i, V_{i+1} \rangle$ for each

$0 \le i < n$

A routine R is adequate from $V_0$ to $V_n$ if it

contains either a complete schema

$\langle V_i, A_i, V_{i+1} \rangle$ or a partial schema

$\langle V_i, A_i, \mathit{nil} \rangle$ for each $0 \le i < n$

Complete & Adequate Routines

Adequate routines support situated action, i.e., behavior

that can be executed only when the agent is placed

(“situated”) in the world

Complete routines support both situated action and

cognitive manipulation, i.e., the agent can both execute

and describe the behavior

Example: if the agent can only navigate a route, the

navigation routine is adequate; if the agent can both

navigate and describe it, the navigation routine is complete
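The two definitions can be checked mechanically. The sketch below assumes schemas are stored as tuples (V, A, V') with `None` standing in for nil, and that the alternating sequence is given as a list of views and a list of actions, with one more view than actions. The function names are hypothetical.

```python
from typing import List, Optional, Tuple

Schema = Tuple[str, str, Optional[str]]  # (V, A, V'); V' is None in a partial schema


def is_complete(routine: List[Schema], views: List[str], actions: List[str]) -> bool:
    """True if every step of the sequence has a complete schema in the routine."""
    return all((views[i], actions[i], views[i + 1]) in routine
               for i in range(len(actions)))


def is_adequate(routine: List[Schema], views: List[str], actions: List[str]) -> bool:
    """True if every step has either a complete or a partial schema in the routine."""
    return all((views[i], actions[i], views[i + 1]) in routine
               or (views[i], actions[i], None) in routine
               for i in range(len(actions)))
```

Every complete routine is also adequate under these definitions, but not the other way around.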

Deduction, Induction, & Abduction

Deduction

Given the implication A → B and the truth of A, one

can infer B

Example:

- Implication: if the agent’s Wi-Fi classifier

classifies the input signal at cluster C, the agent

is at location L

- Truth of antecedent: the agent’s Wi-Fi classifier

classifies the input signal at cluster C

- Inference: the agent is at location L

Induction

Infer the consequent B from the antecedent A if B

has so far always followed A

Example: you observe a swan and note that its

color is white; you observe another swan and note

that its color is white; you observe another n swans

and they are all white; you conclude that if a bird is

a swan (A), then its color is white (B)

Inductive inferences are always congruent with

agents’ experience but are not always true: there

are, in fact, black swans

Abduction

Infer A as a possible antecedent of the observed

consequent B

In the literature on abduction, A is sometimes

referred to as a possible explanation of B

Example: you observe that your neighbor’s lot is

wet on a dry day (B) and infer that your neighbor has

watered the lawn (A); chances are your inference is

true but there is a chance that it is wrong: there

may, for example, be a leaking sprinkler
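The contrast between deduction and abduction can be sketched over a small rule base. The rules below are hypothetical encodings of the examples above, written as (antecedent, consequent) pairs. Deduction runs rules forward from known facts, while abduction runs them backward from an observation and returns candidate explanations, any of which may be wrong.

```python
from typing import List, Set, Tuple

# Hypothetical rules "A -> B" stored as (A, B) pairs.
RULES: List[Tuple[str, str]] = [
    ("watered_lawn", "lot_wet"),
    ("leaking_sprinkler", "lot_wet"),
    ("wifi_cluster_C", "at_location_L"),
]


def deduce(facts: Set[str]) -> Set[str]:
    """Modus ponens to a fixed point: from A -> B and A, add B."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for antecedent, consequent in RULES:
            if antecedent in known and consequent not in known:
                known.add(consequent)
                changed = True
    return known


def abduce(observation: str) -> List[str]:
    """Return every antecedent A that has a rule A -> observation."""
    return [a for a, b in RULES if b == observation]
```

Here `abduce("lot_wet")` returns both the watering and the leaking-sprinkler explanations, which is why an abductive conclusion has to be treated as a hypothesis rather than a fact.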

Topological Level

Elements of Topological Level

A place is a zero-dimensional part of the environment

A path is a one-dimensional subspace (e.g., a street in a

city)

There are two directions along a path: dir = +1 and dir = -1

These directions may be loosely interpreted as forward

and backward

A travel action moves the agent from one place on a path

to another place on a path

A turn action keeps the agent in the same place

A region is a 2D subset of the environment

A region may be abstracted into a place

Regions, Boundaries, Abstractions

Regions are sets of places

Places are grouped into regions because

1) they are located on one side of a specific

boundary;

2) they share a 2D metrical frame;

3) they are abstracted to the same place in a

higher-level topological map

Topological Abduction

Given a sequence of views and actions, the

agent infers places, paths, and regions by

abduction

The agent postulates a minimal set of places,

paths, and regions consistent with the views and

actions

Abduced elements may or may not be sufficient

to explain the sequence of observed views and

actions

Topological Relations

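$\mathit{at}(\mathit{view}, \mathit{place})$: view is seen at place

$\mathit{along}(\mathit{view}, \mathit{path}, \mathit{dir})$: view is seen along path in direction dir

$\mathit{on}(\mathit{place}, \mathit{path})$: place is on path

$\mathit{order}(\mathit{path}, \mathit{place}_1, \mathit{place}_2, \mathit{dir})$: the order on path from place1 to place2 is dir

$\mathit{rightOf}(\mathit{path}, \mathit{dir}, \mathit{region})$: path, facing direction dir, has region on its right

$\mathit{leftOf}(\mathit{path}, \mathit{dir}, \mathit{region})$: path, facing direction dir, has region on its left

$\mathit{in}(\mathit{place}, \mathit{region})$: place is in region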

Topological Axioms

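$\mathit{order}(\mathit{path}, A, B, \mathit{dir}) \rightarrow \mathit{on}(A, \mathit{path}) \wedge \mathit{on}(B, \mathit{path})$

$\neg\,\mathit{order}(\mathit{path}, A, A, \mathit{dir})$

$\mathit{order}(\mathit{path}, A, B, +1) \leftrightarrow \mathit{order}(\mathit{path}, B, A, -1)$

$\mathit{order}(\mathit{path}, A, B, \mathit{dir}) \wedge \mathit{order}(\mathit{path}, B, C, \mathit{dir}) \rightarrow \mathit{order}(\mathit{path}, A, C, \mathit{dir})$

$\exists \alpha\, [\mathit{along}(V, \mathit{path}, \mathit{dir}) \wedge \langle V, \mathit{turn}(\alpha), V' \rangle \wedge \mathit{along}(V', \mathit{path}, -\mathit{dir})]$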
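The order axioms can be exercised with a small closure computation. This is a sketch under the assumption that order facts are stored as tuples (path, A, B, dir) with dir equal to +1 or -1. The function names are hypothetical. The closure applies the reversal and transitivity axioms until nothing new appears, and a separate check flags violations of the irreflexivity axiom.

```python
from typing import Set, Tuple

Order = Tuple[str, str, str, int]  # (path, A, B, dir) with dir in {+1, -1}


def close_order(facts: Set[Order]) -> Set[Order]:
    """Close a set of order facts under reversal and transitivity."""
    closed = set(facts)
    while True:
        # Reversal: order(path, A, B, dir) gives order(path, B, A, -dir).
        new = {(p, b, a, -d) for (p, a, b, d) in closed}
        # Transitivity along the same path in the same direction.
        new |= {(p, a, c, d)
                for (p, a, b, d) in closed
                for (q, b2, c, d2) in closed
                if p == q and b == b2 and d == d2}
        if new <= closed:
            return closed
        closed |= new


def violates_irreflexivity(facts: Set[Order]) -> bool:
    """True if some fact orders a place before itself, which the axioms forbid."""
    return any(a == b for (_path, a, b, _dir) in facts)
```

A violation after closure signals that the original facts cannot all hold on one path, for example when A is placed both before and after B in the same direction.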

Abducing Places & Paths from Views & Actions

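Every view is observed at some place:

$\forall \mathit{view}\ \exists \mathit{place}\ \mathit{at}(\mathit{view}, \mathit{place})$

If the agent turns, it does not change its place:

$\langle V, \mathit{turn}(\alpha), V' \rangle \rightarrow \exists \mathit{place}\ [\mathit{at}(V, \mathit{place}) \wedge \mathit{at}(V', \mathit{place})]$

If the agent travels a non-zero distance, then the first and second views are observed at two distinct places:

$\langle V, \mathit{travel}(\delta), V' \rangle \wedge \delta \neq 0 \rightarrow \exists \mathit{place}_1, \mathit{place}_2\ [\mathit{place}_1 \neq \mathit{place}_2 \wedge \mathit{at}(V, \mathit{place}_1) \wedge \mathit{at}(V', \mathit{place}_2)]$

If the agent travels, then there are a path and a direction such that the first view V is seen along that path in that direction and the second view V’ is seen along the same path in the same direction:

$\langle V, \mathit{travel}(\delta), V' \rangle \rightarrow \exists \mathit{path}, \mathit{dir}\ [\mathit{along}(V, \mathit{path}, \mathit{dir}) \wedge \mathit{along}(V', \mathit{path}, \mathit{dir})]$

If the agent travels a non-zero distance, then there are a path, a direction, and two places such that the first view is seen at the first place, the second view is seen at the second place, the first view is seen along the path in that direction, and the two places are ordered along the path in that direction:

$\langle V, \mathit{travel}(\delta), V' \rangle \wedge \delta \neq 0 \rightarrow \exists \mathit{place}_1, \mathit{place}_2, \mathit{path}, \mathit{dir}\ [\mathit{at}(V, \mathit{place}_1) \wedge \mathit{at}(V', \mathit{place}_2) \wedge \mathit{along}(V, \mathit{path}, \mathit{dir}) \wedge \mathit{order}(\mathit{path}, \mathit{place}_1, \mathit{place}_2, \mathit{dir})]$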
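A conservative sketch of how the turn and travel axioms might be applied to a trace of schemas. It assumes each view name identifies one distinctive state, groups views that are linked by turns into the same place with a union-find structure, and records which places a non-zero travel forces to be distinct. It does not search for a truly minimal model and ignores paths. The trace format and function name are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Step = Tuple[str, str, float, str]  # (V, "turn" or "travel", magnitude, V')


def abduce_places(trace: List[Step]) -> Tuple[Dict[str, int], Set[Tuple[int, int]]]:
    """Assign a place id to every view and list place pairs that must differ."""
    parent: Dict[str, str] = {}

    def find(v: str) -> str:
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving keeps trees shallow
            v = parent[v]
        return v

    for v, kind, _magnitude, v2 in trace:
        find(v)
        find(v2)
        if kind == "turn":
            parent[find(v)] = find(v2)  # a turn keeps the agent in the same place

    views = list(parent)
    roots = sorted({find(v) for v in views})
    place_id = {root: i for i, root in enumerate(roots)}
    place_of = {v: place_id[find(v)] for v in views}

    distinct: Set[Tuple[int, int]] = set()
    for v, kind, magnitude, v2 in trace:
        if kind == "travel" and magnitude != 0:
            a, b = place_of[v], place_of[v2]
            if a == b:
                raise ValueError("trace contradicts the travel axiom")
            distinct.add((min(a, b), max(a, b)))
    return place_of, distinct
```

On a trace such as turn, travel, turn, the sketch postulates two places: one for the views before the travel and one for the views after it.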

Upward & Downward Mapping

SSH supports upward and downward mapping

Upward mapping: multiple places at a lower level map to a

single place/region at a higher level

Downward mapping: a single place at a higher level maps to

multiple places at a lower level

An abstraction region is the set of places in a more detailed

map abstracted to a particular place

Example: a corridor can be abstracted to a single place in a

higher-level map
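A small sketch of the two mappings, assuming the upward mapping is stored as a dictionary from detailed places to higher-level places. The place names are invented for illustration. Inverting the dictionary gives the downward mapping, and each value of the result is an abstraction region.

```python
from collections import defaultdict
from typing import Dict, Set

# Hypothetical upward mapping: detailed place -> place in the higher-level map.
UPWARD: Dict[str, str] = {
    "corridor_west": "corridor",
    "corridor_mid": "corridor",
    "corridor_east": "corridor",
    "room_101": "room_101",
}


def abstraction_regions(upward: Dict[str, str]) -> Dict[str, Set[str]]:
    """Invert the upward mapping: higher-level place -> set of detailed places."""
    regions: Dict[str, Set[str]] = defaultdict(set)
    for low, high in upward.items():
        regions[high].add(low)
    return dict(regions)
```

In this example the three corridor places form the abstraction region of the single higher-level place `"corridor"`.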

Topological Level Uses

Topological level of representation supports

various problem-solving methods

It can be searched as a graph (DFS, BFS)

Distance measures, when and if they are

available, support A* and Dijkstra

Topological level can support goals and sub-goals
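A sketch of both kinds of search over a hypothetical place graph. Places are nodes, connections are edges, and the edge weights stand for distances that may or may not be available. Breadth-first search needs only connectivity and returns a route with the fewest hops, while Dijkstra's algorithm uses the distances to find the shortest total length.

```python
import heapq
from collections import deque
from typing import Dict, List, Optional

Graph = Dict[str, Dict[str, float]]

# Hypothetical place graph; weights are distances between connected places.
GRAPH: Graph = {
    "A": {"B": 4.0, "C": 1.0, "D": 10.0},
    "B": {"A": 4.0, "D": 1.0},
    "C": {"A": 1.0, "D": 5.0},
    "D": {"A": 10.0, "B": 1.0, "C": 5.0},
}


def bfs_route(graph: Graph, start: str, goal: str) -> Optional[List[str]]:
    """Fewest-hops route using connectivity only."""
    prev: Dict[str, Optional[str]] = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nxt in graph[node]:
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    return None


def dijkstra_distance(graph: Graph, start: str, goal: str) -> Optional[float]:
    """Shortest total distance when edge distances are available."""
    dist = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == goal:
            return d
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, w in graph[node].items():
            nd = d + w
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                heapq.heappush(heap, (nd, nxt))
    return None
```

On this graph the fewest-hops route from A to D is the direct edge, while the shortest route by distance goes through B.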

Metrical Level

Global Metrical Mapping

An agent may have a global single frame of reference (2D

or 3D)

Many useful states of knowledge cannot be expressed

numerically in terms of real numbers (e.g., orientation error)

Storage of large global frames of reference may present

problems

A global frame of reference can be split into a patchwork of

local frames of reference
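A sketch of how a point in one local frame could be expressed in the global frame, assuming 2D frames and assuming each local frame is described by the position of its origin and its rotation in the global frame. The names `Pose` and `local_to_global` are hypothetical. The computation is a standard rigid transform: rotate, then translate.

```python
import math
from typing import Tuple

Pose = Tuple[float, float, float]  # (x, y, theta) of a local frame in the global frame


def local_to_global(frame: Pose, point: Tuple[float, float]) -> Tuple[float, float]:
    """Rotate a local point by theta, then translate by the frame origin."""
    fx, fy, theta = frame
    px, py = point
    gx = fx + px * math.cos(theta) - py * math.sin(theta)
    gy = fy + px * math.sin(theta) + py * math.cos(theta)
    return gx, gy
```

Storing one such pose per local frame is one way to keep a patchwork of small maps while still being able to relate them to each other.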

References & Reading Suggestions

B. Kuipers. (2000). “The Spatial Semantic Hierarchy.”

Artificial Intelligence 119, pp. 191-233.

J. Nicholson, V. Kulyukin, D. Coster. (2009). “ShopTalk:

Independent Blind Shopping Through Verbal Route

Directions and Barcode Scans.” The Open Rehabilitation

Journal, ISSN: 1874-9437 Volume 2, 2009, DOI

10.2174/1874943700902010011.
