2016 AR Summer School - Lecture 5


Transcript of 2016 AR Summer School - Lecture 5

LECTURE 5: DIRECTIONS FOR FUTURE RESEARCH

Mark Billinghurst

AR Summer School February 15th – 19th 2016 University of South Australia

Looking to the Future

The Future is with us

It takes at least 20 years for new technologies to go from the lab to the lounge.

“The technologies that will significantly affect our lives over the next 10 years have been around for a decade. The future is with us. The trick is learning how to spot it. The commercialization of research, in other words, is far more about prospecting than alchemy.”

Bill Buxton

Oct 11th 2004

Research Directions

• Tracking
  • Markerless tracking, hybrid tracking
• Displays
  • Occlusion, retinal, light field
• Interactions
  • Input devices, gesture, social
• Applications
  • Collaboration
• Scaling Up
  • User evaluation, novel AR/MR experiences

TRACKING

Wide Area Tracking

• Process
  • Combine panoramas into a point cloud model (offline)
  • Initialize camera tracking from the point cloud
  • Update pose by aligning the camera image to the point cloud (see the sketch below)
• Accurate to 25 cm, 0.5 degrees over a wide area

Ventura, J., & Hollerer, T. (2012). Wide-area scene mapping for mobile visual tracking. In Mixed and Augmented Reality (ISMAR), 2012 IEEE International Symposium on (pp. 3-12). IEEE.
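As a rough illustration of the pose-update step, the sketch below recovers a camera pose from 2D-3D correspondences with PnP + RANSAC. The feature-matching step is omitted, and the names and thresholds are placeholders, not Ventura and Hollerer's implementation.

```python
# Minimal sketch: given 2D-3D correspondences between features in the live
# camera image and points in the offline point cloud model, recover the
# camera pose with PnP + RANSAC. Illustrative only.
import numpy as np
import cv2

def update_pose(points_3d, points_2d, camera_matrix):
    """Estimate camera pose from matched model points (Nx3) and
    image observations (Nx2)."""
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        points_3d.astype(np.float32),
        points_2d.astype(np.float32),
        camera_matrix,
        distCoeffs=None,
        reprojectionError=3.0)   # pixel tolerance; an assumption
    if not ok:
        return None
    R, _ = cv2.Rodrigues(rvec)   # rotation vector -> 3x3 matrix
    return R, tvec, inliers
```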

Large Scale Depth Fusion and Tracking

• InfiniTAM
  • http://www.robots.ox.ac.uk/~victor/infinitam/
  • Swaps memory between CPU and GPU in real time, supporting virtually infinite environments
  • Over 1000 fps on a single NVIDIA Titan X graphics card, and real time on iOS/Android

Kahler, O., Prisacariu, V. A., Ren, C. Y., Sun, X., Torr, P., & Murray, D. (2015). Very high frame rate volumetric integration of depth images on mobile devices. IEEE Transactions on Visualization and Computer Graphics, 21(11), 1241-1250.
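The core of this kind of system is truncated signed distance function (TSDF) fusion. Below is a toy NumPy sketch of integrating one depth map into a voxel volume; it shows the idea only and has none of InfiniTAM's memory swapping or GPU code.

```python
# Toy sketch of volumetric depth fusion (TSDF integration), in the spirit of
# KinectFusion/InfiniTAM. All names and parameters are illustrative.
import numpy as np

def integrate(tsdf, weight, vox_xyz, depth, K, trunc=0.05):
    """Fuse one depth map (metres) into a TSDF volume.
    tsdf, weight: flat arrays, one entry per voxel
    vox_xyz: Nx3 voxel centres in the camera frame
    K: 3x3 camera intrinsics"""
    z = vox_xyz[:, 2]
    u = (K[0, 0] * vox_xyz[:, 0] / z + K[0, 2]).astype(int)
    v = (K[1, 1] * vox_xyz[:, 1] / z + K[1, 2]).astype(int)
    h, w = depth.shape
    valid = (z > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    d = depth[v[valid], u[valid]]
    sdf = d - z[valid]                  # signed distance along the viewing ray
    keep = sdf > -trunc                 # ignore voxels far behind the surface
    idx = np.flatnonzero(valid)[keep]
    s = np.clip(sdf[keep] / trunc, -1.0, 1.0)
    # running weighted average, as in standard TSDF fusion
    tsdf[idx] = (tsdf[idx] * weight[idx] + s) / (weight[idx] + 1)
    weight[idx] += 1
```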

Project Tango

• Smartphone + depth sensing
• Sensors
  • Gyroscope, accelerometer, compass
  • 180° field of view fisheye camera
  • Infrared projector
  • 4 MP RGB/IR camera

How it Works

• Sensors
  • 4 MP RGB/IR camera: captures full colour images and detects IR reflections
  • IR depth sensor: measures depth using IR pulses
  • Tracking camera: tracks objects

• Three basic operations
  • Map the depth of the environment in real time
  • Measure depth accurately using IR pulses
  • Create a 3D model of the environment in real time (see the back-projection sketch below)
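Per pixel, "create a 3D model in real time" reduces to standard pinhole back-projection of each depth sample. A minimal sketch; the intrinsics in the example are made up, not Tango's calibration:

```python
# Standard pinhole back-projection: a depth measurement becomes a 3D point.
def depth_to_point(u, v, z, fx, fy, cx, cy):
    """Pixel (u, v) with depth z (metres) -> 3D point in the camera frame."""
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return (x, y, z)

# e.g. with placeholder intrinsics fx = fy = 520, cx = 320, cy = 240:
# depth_to_point(400, 300, 2.0, 520, 520, 320, 240) ≈ (0.31, 0.23, 2.0)
```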

Applications

• Indoor tracking, games, assistance for people with disabilities, etc.

DISPLAYS

Occlusion with See-through HMD

• The Problem
  • Occluding real objects with virtual ones
  • Occluding virtual objects with real ones

[Figure: a real scene vs. the same scene through a current see-through HMD, where virtual content cannot be occluded correctly]

ELMO (Kiyokawa 2001)

• Occlusive see-through HMD
• Masking LCD
• Real-time range finding

ELMO Demo

ELMO Design

• Use an LCD mask to block the real world
• Use depth sensing to let real objects occlude virtual images (per-pixel logic sketched below)

[Diagram: virtual images from an LCD are combined with the real world through an optical combiner; an LCD mask blocks real-world light per pixel, driven by depth sensing]
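The per-pixel decision is simple: where the sensed real depth is closer than the virtual surface, the real world should win, so the mask stays transparent and the virtual pixel is dropped; otherwise the mask blocks the real world and the virtual pixel is shown. A minimal array-based sketch of that logic, not ELMO's actual pipeline:

```python
# Per-pixel occlusion logic in the style of ELMO, as NumPy arrays.
import numpy as np

def occlusion_mask(real_depth, virtual_depth):
    """True where the LCD mask should block the real world,
    i.e. where a virtual surface exists and is in front.
    Pixels with no virtual content are marked inf/nan in virtual_depth
    (a convention assumed here for illustration)."""
    virtual_present = np.isfinite(virtual_depth)
    return virtual_present & (virtual_depth < real_depth)
```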

ELMO Results

Contact Lens Display

• Babak Parviz, University of Washington
• MEMS components
  • Transparent elements
  • Micro-sensors
• Challenges
  • Miniaturization
  • Assembly
  • Eye safety
  • Providing power and data

Contact Lens Prototype

Wide FOV Displays

• Wide FOV see-through display for AR
  • LCD panel + edge-lit point light sources
  • 110 degree FOV

Maimone, A., Lanman, D., Rathinavel, K., Keller, K., Luebke, D., & Fuchs, H. (2014). Pinlight displays: wide field of view augmented reality eyeglasses using defocused point light sources. In ACM SIGGRAPH 2014 Emerging Technologies (p. 20). ACM.

Light Field Displays

• NVIDIA prototype
• Thinner, sharper, depicting accurate accommodation, convergence, and binocular-disparity depth cues

INTERACTION

The Vision of AR

To Make the Vision Real..

• Hardware/software requirements
  • Contact lens displays
  • Free-space hand/body tracking
  • Environment recognition
  • Speech/gesture recognition
  • Etc.

Natural Interaction

• Automatically detecting the real environment
  • Environmental awareness
  • Physically-based interaction

• Gesture input
  • Free-hand interaction

• Multimodal input
  • Speech and gesture interaction
  • Implicit rather than explicit interaction

AR MicroMachines

• AR experience with environment awareness and physically-based interaction
• Based on the Microsoft Kinect RGB-D sensor

• The augmented environment supports
  • Occlusion and shadows
  • Physically-based interaction between real and virtual objects

Operating Environment

Architecture

• Our framework uses five libraries:
  • OpenNI
  • OpenCV
  • OPIRA
  • Bullet Physics
  • OpenSceneGraph

System Flow

• The system flow consists of three stages (skeleton sketched below):
  • Image processing and marker tracking
  • Physics simulation
  • Rendering
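A skeleton of that per-frame flow might look like the sketch below; the class and method names are illustrative stubs, not the AR MicroMachines source.

```python
# Illustrative three-stage frame loop: track, simulate, render.
class Tracker:
    def update(self, rgb):            # image processing + marker tracking
        return "camera_pose"

class Physics:
    bodies = []
    def step(self, depth, dt):        # simulate against the real-world mesh
        pass

class Renderer:
    def draw(self, pose, bodies):     # render with occlusion and shadows
        pass

def process_frame(rgb, depth, tracker, physics, renderer):
    pose = tracker.update(rgb)            # 1. image processing / tracking
    physics.step(depth, dt=1.0 / 30.0)    # 2. physics simulation
    renderer.draw(pose, physics.bodies)   # 3. rendering
```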

Physics Simulation

• Create a virtual mesh over the real world (see the depth-to-mesh sketch below)

• Update at 10 fps, so real objects can be moved

• Used by the physics engine for collision detection (virtual/real)

• Used by OpenSceneGraph for occlusion and shadows
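A sketch of the mesh-creation step: turning a downsampled depth image into grid vertices and triangle indices that a physics engine (Bullet) and a scene graph (OpenSceneGraph) could both consume. The intrinsics and downsampling factor are assumptions.

```python
# Depth image -> triangle mesh over the real world. Illustrative only.
import numpy as np

def depth_to_mesh(depth, fx, fy, cx, cy, step=4):
    """Depth image (metres) -> (vertices Nx3, triangle indices Mx3)."""
    d = depth[::step, ::step]                     # downsample for speed
    h, w = d.shape
    v, u = np.mgrid[0:h, 0:w] * step              # original pixel coordinates
    verts = np.dstack(((u - cx) * d / fx,         # back-project each sample
                       (v - cy) * d / fy,
                       d)).reshape(-1, 3)
    tris = []
    for r in range(h - 1):
        for c in range(w - 1):
            i = r * w + c
            tris.append((i, i + 1, i + w))        # two triangles per grid cell
            tris.append((i + 1, i + w + 1, i + w))
    return verts, np.array(tris)
```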

Rendering

[Figures: rendering results showing occlusion and shadows]

Gesture Based Interaction

• Use free-hand gestures to interact
• Depth camera for scene capture

• Multimodal input
  • Combining speech and gesture

HIT Lab NZ

Microsoft HoloLens

Meta SpaceGlasses

Natural Gesture Interaction on Mobile

• Use the mobile camera for hand tracking
  • Fingertip detection (sketched below)
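One common recipe for fingertip detection, sketched below with OpenCV: take the largest contour in a binary hand mask, compute its convex hull, and treat points around deep convexity defects as fingertip candidates. This is the general technique, not necessarily the mobile implementation shown here.

```python
# Fingertip candidates from a binary hand mask via convexity defects.
import cv2
import numpy as np

def fingertips(mask, min_defect_depth=20.0):
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return []
    hand = max(contours, key=cv2.contourArea)      # assume hand = biggest blob
    hull = cv2.convexHull(hand, returnPoints=False)
    defects = cv2.convexityDefects(hand, hull)
    tips = []
    if defects is not None:
        for s, e, f, depth in defects[:, 0]:
            if depth / 256.0 > min_defect_depth:   # depth is 8.8 fixed-point
                tips.append(tuple(hand[s][0]))     # defect start ~ fingertip
    return tips
```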

Capturing Behaviours

▪ 3Gear Systems
▪ Kinect/PrimeSense sensor
▪ Two-hand tracking
▪ http://www.threegear.com

Performance

▪ Full 3D hand model input
▪ 10–15 fps tracking, 1 cm fingertip resolution

Multimodal Interaction

• Combined speech input
• Gesture and speech are complementary

• Speech
  • Modal commands, quantities

• Gesture
  • Selection, motion, qualities

• Previous work has found multimodal interfaces intuitive for 2D/3D graphics interaction (a fusion sketch follows below)
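A minimal sketch of how such fusion can work: a spoken command supplies the operation while the nearest-in-time pointing gesture supplies the target. The data structures and the two-second window are assumptions for illustration.

```python
# Time-window fusion of speech commands and pointing gestures.
from dataclasses import dataclass

@dataclass
class Gesture:
    kind: str          # "point", "move", "pick", "drop"
    target: str        # id of the object under the hand
    time: float        # seconds

@dataclass
class Speech:
    command: str       # e.g. "colour red", "make a sphere"
    time: float

def fuse(speech, gestures, window=2.0):
    """Pair a speech command with the closest-in-time pointing gesture."""
    candidates = [g for g in gestures
                  if g.kind == "point" and abs(g.time - speech.time) < window]
    if not candidates:
        return None    # speech alone; no deictic target available
    target = min(candidates, key=lambda g: abs(g.time - speech.time))
    return (speech.command, target.target)
```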

Free Hand Multimodal Input

• Use free hands to interact with AR content
• Recognize simple gestures
• No marker tracking

[Gesture set: Point, Move, Pick/Drop]

Multimodal Architecture

Multimodal Fusion

Hand Occlusion

User Evaluation

• Change object shape, colour, and position
• Conditions
  • Speech only, gesture only, multimodal

• Measures
  • Performance time, errors, subjective survey

Experimental Setup

Change object shape and colour

Results

• Average performance time (multimodal and speech fastest)
  • Gesture: 15.44 s
  • Speech: 12.38 s
  • Multimodal: 11.78 s

• No difference in user errors
• User subjective survey
  • Q1: How natural was it to manipulate the object?
    • MMI and speech rated significantly better
  • 70% preferred MMI, 25% speech only, 5% gesture only

COLLABORATION

Resolution Tube

• http://www.resolutiontube.com/
• Shared video calls with annotations

Vipaar Lime - https://www.vipaar.com/

• Remote collaboration on handheld devices
• The remote user's hands appear in the live camera view

SOCIAL IMPLICATIONS

Consider the Whole User

How is the User Perceived?

TAT Augmented ID

Social Acceptance

• People don’t want to look silly
  • Only 12% of 4,600 adults surveyed would be willing to wear AR glasses
  • 20% of mobile AR browser users experience social issues

• Acceptance is driven more by social than by technical issues
  • Needs further study (ethnographic, field tests, longitudinal)

CROSSING BOUNDARIES

Crossing Boundaries

Jun Rekimoto, Sony CSL

Invisible Interfaces

Jun Rekimoto, Sony CSL

Milgram’s Reality-Virtuality Continuum

[Figure: the Reality-Virtuality (RV) Continuum, running from Real Environment through Augmented Reality (AR) and Augmented Virtuality (AV) to Virtual Environment; the span between the two extremes is Mixed Reality]

The MagicBook

[Figure: the MagicBook experience moves along the continuum from Reality through Augmented Reality (AR) and Augmented Virtuality (AV) to Virtuality]

Invisible Interfaces

Jun Rekimoto, Sony CSL

Example: Visualizing Sensor Networks

• Rauhala et al. 2007 (Linköping)
• Network of humidity sensors
  • ZigBee wireless communication

• Use mobile AR to visualize humidity

Invisible Interfaces

Jun Rekimoto, Sony CSL

Ubiquitous AR (GIST, Korea)

• How does your AR device work with other devices?
• How is content delivered?

CAMAR - GIST

(CAMAR: Context-Aware Mobile Augmented Reality)

Requirements for Ubiquitous AR

• Hardware is available (mobile phones)
• Software standards are required:
  • APIs for a common framework, independent of hardware
  • ARML as a descriptor language for the AR environment, scenarios, etc.
• Further requirements:
  • Authoring tools for creating AR applications
  • AR-enabled infrastructure (buildings, etc.)

[Figure: a 2D taxonomy crossing Milgram’s axis (Reality ↔ Virtual Reality) with Weiser’s axis (Terminal ↔ Ubiquitous); regions include Desktop, AR, VR, UbiComp, Mobile AR, Ubi AR, and Ubi VR]

SCALING UP

[Figure: the same Milgram (Reality ↔ VR) vs. Weiser (Terminal ↔ Ubiquitous) space, extended along a third axis from Single User to Massive Multi User]

Massive Multiuser

• Handheld AR allows, for the first time, extremely high numbers of AR users

• Requires
  • New types of applications/games
  • New infrastructure (server/client/peer-to-peer)
  • Content distribution…

Social Network Systems

• 2D applications
  • MSN: 29 million
  • Skype: 10 million
  • Facebook: up to 70 million

• Desktop VR
  • Second Life: > 50K
  • Stereo projection: < 500

• Immersive VR
  • HMD/CAVE based: < 100

• Augmented Reality
  • Shared Space (1999): 4
  • Invisible Train (2004): 8

PERSONAL VIEW

Augmented Reality 2.0 Infrastructure

Leveraging Web 2.0

• Content retrieval using HTTP
• XML-encoded meta information
  • KML placemarks + extensions (a KML sketch follows below)
• Queries
  • Based on location (from GPS, image recognition)
  • Based on situation (barcode markers)
• Syndication
  • Community servers for end-user content
  • Tagging
• AR client subscribes to data feeds
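For concreteness, the sketch below builds the kind of KML placemark an AR 2.0 client could fetch over HTTP and render at a GPS location. The coordinates and names are made up.

```python
# Build a minimal KML placemark as XML-encoded meta information.
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"

def make_placemark(name, lon, lat, alt=0.0):
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    pm = ET.SubElement(kml, f"{{{KML_NS}}}Placemark")
    ET.SubElement(pm, f"{{{KML_NS}}}name").text = name
    point = ET.SubElement(pm, f"{{{KML_NS}}}Point")
    coords = ET.SubElement(point, f"{{{KML_NS}}}coordinates")
    coords.text = f"{lon},{lat},{alt}"     # KML order: lon, lat, altitude
    return ET.tostring(kml, encoding="unicode")

# e.g. a made-up placemark near Adelaide:
# make_placemark("Campus AR note", 138.6007, -34.9285)
```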

Scaling Up

• AR on a city scale
• Using mobile phones as ubiquitous sensors
• MIT SENSEable City Lab
  • http://senseable.mit.edu/

WikiCity Rome (Senseable City Lab MIT)

www.empathiccomputing.org

@marknb00

[email protected]