Vision-Based Interactive Systems Martin Jagersand c610.

Applications for vision in User Interfaces

Interaction with machines and robots
– Service robotics
– Surgical robots
– Emergency response

Interaction with software
– A store or museum information kiosk

Service robots

Mobile manipulators, semi-autonomous

DIST, TU Berlin, KAIST

TORSO with 2 WAMs

Service tasks

This is completely hardwired! Found no real task on WWW

But

Maybe first applications in tasks humans can’t do?

Why is humanlike robotics so hard to achieve?

See human task:
– Tracking motion, seeing gestures

Understand:
– Motion understanding: Translate to correct reference frame
– High level task understanding?

Do:
– Vision based control
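As a minimal sketch of the "translate to correct reference frame" step, the following maps a point observed in a camera frame into a robot base frame with a homogeneous transform. It assumes the camera pose relative to the robot is known, which is a calibration assumption not made in the slides; the pose values and the names `make_transform`, `apply_transform` and `T_robot_from_camera` are hypothetical.

```python
import math


def make_transform(yaw_rad, tx, ty, tz):
    """Build a 4x4 homogeneous transform: rotation about z by yaw, plus a translation."""
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    return [
        [c, -s, 0.0, tx],
        [s, c, 0.0, ty],
        [0.0, 0.0, 1.0, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


def apply_transform(T, point):
    """Map a 3D point through T using homogeneous coordinates."""
    p = (point[0], point[1], point[2], 1.0)
    return tuple(sum(T[r][c] * p[c] for c in range(4)) for r in range(3))


# Hypothetical camera pose in the robot base frame (illustrative values only):
# rotated 90 degrees about z, offset 1.0 m in x and 0.5 m in z.
T_robot_from_camera = make_transform(math.pi / 2, 1.0, 0.0, 0.5)

# A tracked hand position expressed in the camera frame (metres).
hand_in_camera = (0.2, 0.0, 0.3)
hand_in_robot = apply_transform(T_robot_from_camera, hand_in_camera)
```

The same observed motion gives different coordinates in each frame, so a demonstrated movement has to be re-expressed in the robot's frame before it can be executed.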

Types of robotic systems

Autonomy

Generality

Supervisory control

Tele-assistance

Programming by demonstration

Preprogrammed systems

Interaction styles

If A then

end

Conventional:

• Low bandwidth interaction

• Partial or indirect system state displayed

• User works from internal mental model

Interaction styles

Direct Manipulation:

• High bandwidth interaction

• Interact directly and intuitively with objects (affordance)

• See system state (visibility)

• (Reversible actions)

Examples of Direct Manipulation

• Drawing programs, e.g. MacPaint
• Video games, flight simulator
• Robot/machine teaching by showing
• Tele-assistance
• Spreadsheet programs
• Some window system desktops

But can you always see effects (visibility)?

xfig drawing program

• Icons afford use
• Results visible
• Direct spatial action-result mapping

MATLAB drawing:

line([10, 20],[30, 85]);
patch([35, 22],[15, 35], C); % C complex structure
text(70,30,'Kalle'); % Potentially add font, size, etc

Why direct manipulation?

• Recognition quicker than recall.
• Human uses “the world” as memory/model
• Human skilled at interacting spatially

How quick is direct? Subsecond! Experiments show human performance decreased at 0.4 s delay.

Vision and Touch based UI

• Typical UI today: Symbolic, 1D (slider), 2D
• But human skilled at 3D, 6D, n-D spatial interaction with the world

Supports Direct Manipulation!

Seeing a task

Tracking movement
– See directions, movements in tasks

Recognizing gestures
– Static hand and body postures

Combination: Spatio-temporal gestures

Tracking movement

Tracking the human is hard:
– Appearance varies
– Large search space, 60 parameters
– Unobservable: Joint angles have to be inferred from limb positions, clothing etc.
– Motion is non-linear.
– Difficult to track 3D from 2D image plane info
– Self occlusion of limbs

Trick 1: Physical model

Reduce number of DOFs by coupled model of articulated motion (Hedvig, Mike)
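One way to read this trick, sketched below for a planar two-segment limb: tracked independently, each rigid segment needs its own pose (x, y, angle), six parameters in total, while an articulated model with a fixed base and a shared joint needs only two joint angles. The segment lengths and the function name `chain_poses` are hypothetical, and this is not the specific model used in the work the slide refers to.

```python
import math


def chain_poses(shoulder, elbow, upper_len=0.30, fore_len=0.25):
    """Recover both segment poses and the endpoint from two joint angles.

    Angles are in radians; the elbow angle is measured relative to the
    upper segment. Lengths are illustrative values in metres.
    """
    # Upper segment starts at the fixed base (origin).
    upper = (0.0, 0.0, shoulder)

    # The forearm must start where the upper segment ends: this coupling
    # is what removes the extra degrees of freedom.
    elbow_x = upper_len * math.cos(shoulder)
    elbow_y = upper_len * math.sin(shoulder)
    fore_angle = shoulder + elbow
    fore = (elbow_x, elbow_y, fore_angle)

    hand = (
        elbow_x + fore_len * math.cos(fore_angle),
        elbow_y + fore_len * math.sin(fore_angle),
    )
    return upper, fore, hand


# Two parameters in, full limb configuration out.
upper, fore, hand = chain_poses(math.pi / 2, -math.pi / 2)
```

A tracker built on such a model searches over joint angles only, so every hypothesis it tests is a physically connected limb.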

Trick 2: Use uniqueness of skin color

Can be tracked in real time
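A minimal sketch of skin-color tracking, assuming classification in normalized r-g chromaticity (which discounts overall brightness) followed by a centroid over the pixels classified as skin. The threshold ranges are illustrative assumptions, not values from the slides, and the names `is_skin` and `skin_centroid` are hypothetical.

```python
def is_skin(r, g, b, r_range=(0.36, 0.56), g_range=(0.26, 0.36)):
    """Classify one RGB pixel using normalized chromaticity.

    The ranges are illustrative only; a real system would fit them to
    sample skin pixels under the actual lighting.
    """
    total = r + g + b
    if total == 0:
        return False
    rn, gn = r / total, g / total
    return r_range[0] <= rn <= r_range[1] and g_range[0] <= gn <= g_range[1]


def skin_centroid(image):
    """Return the (x, y) centroid of skin-classified pixels, or None if there are none."""
    sum_x = sum_y = count = 0
    for y, row in enumerate(image):
        for x, (r, g, b) in enumerate(row):
            if is_skin(r, g, b):
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None
    return (sum_x / count, sum_y / count)


# Tiny synthetic frame: blue background with a 2x2 skin-toned patch.
BG = (20, 40, 200)
SKIN = (200, 140, 110)
frame = [
    [BG, BG, BG, BG],
    [BG, BG, SKIN, SKIN],
    [BG, BG, SKIN, SKIN],
    [BG, BG, BG, BG],
]
centroid = skin_centroid(frame)
```

Each pixel needs only a few arithmetic operations and comparisons, which is why this kind of cue is cheap enough to run on every frame.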

Gestures:

Identifying gestures is hard
– Hard to segment hand parts
– Self occlusion
– Variability in viewpoints

Trick 3: Scale space

Define hand gesture in coarse-to-fine terms
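A minimal sketch of the coarse-to-fine idea, assuming a simple image pyramid built by 2x2 averaging: the strongest response is found at the coarsest level first, then refined level by level inside the matching 2x2 block. This illustrates coarse-to-fine search in general rather than the specific hand model on the slide; the function names are hypothetical, and the image sides are assumed divisible by 2^(levels-1).

```python
def downsample(img):
    """Halve resolution by averaging each 2x2 block."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [
        [
            (img[2 * y][2 * x] + img[2 * y][2 * x + 1]
             + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4.0
            for x in range(w)
        ]
        for y in range(h)
    ]


def build_pyramid(img, levels):
    """Return [finest, ..., coarsest]."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid


def best_of(img, candidates):
    """Pick the (x, y) candidate with the largest response."""
    return max(candidates, key=lambda p: img[p[1]][p[0]])


def coarse_to_fine_peak(img, levels=3):
    """Locate a response peak by searching coarse first, then refining."""
    pyramid = build_pyramid(img, levels)
    coarse = pyramid[-1]
    cells = [(x, y) for y in range(len(coarse)) for x in range(len(coarse[0]))]
    x, y = best_of(coarse, cells)
    # Walk back towards the finest level, checking only the 4 children
    # of the current best cell at each step.
    for level in reversed(pyramid[:-1]):
        children = [(2 * x + dx, 2 * y + dy) for dy in (0, 1) for dx in (0, 1)]
        x, y = best_of(level, children)
    return x, y


# 8x8 response map with a single strong response at x=5, y=2.
response = [[0.0] * 8 for _ in range(8)]
response[2][5] = 1.0
peak = coarse_to_fine_peak(response, levels=3)
```

The full search happens only on the small coarsest image; each finer level inspects four pixels, so the cost grows with the number of levels rather than with the image area.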

Trick 4: Variability filters

Programming by Demonstration

• From assembly relations
• From temporal assembly sequence
– Segmenting manipulation sequence into parts (subtasks) is hard

Using a gesture language

Tele-assistance:

Gestures + context

Robust manipulations

Conclusions

• Most aspects of Robot see – robot do are hard
• Conventional methods are
– Incapable of seeing task
– Incapable of understanding what’s going on
– Incapable of performing human manipulation tasks

Uncalibrated methods are more promising
