Kinect2 hands on

Kinect 2 Hands On

Luigi Oliveto

Researcher, Developer, IT Consultant

Email: [email protected]

Twitter: @LuigiOliveto

LinkedIn: https://it.linkedin.com/in/luigioliveto

Agenda

• The Sensor

• System Requirements

• Architecture

• Data Sources

• Kinect Studio

• Gesture Recognition

The Sensor

Kinect 2 Sensor

Depth resolution: 512×424 pixels

RGB resolution: 1920×1080 pixels (16:9)

Frame rate: 30 FPS

Mic frequency: 48 kHz

Range: from 0.5 to 4.5 m

[Figure: the sensor (3D depth sensor, RGB camera, multi-array mic) with its USB hub and power supply]

Kinect 1 vs Kinect 2

Feature | Kinect for Windows 1 | Kinect for Windows 2
Color Camera | 640 x 480 @ 30 fps | 1920 x 1080 @ 30 fps
Depth Camera | 320 x 240 | 512 x 424
Max Depth Distance | ~4.0 m | ~4.5 m
Min Depth Distance | 80 cm (40 cm in near mode) | 50 cm
Horizontal Field of View | 57 degrees | 70 degrees
Vertical Field of View | 43 degrees | 60 degrees
Tilt Motor | yes | no
Skeleton Joints Defined | 20 joints | 25 joints
Full Skeletons Tracked | 2 | 6
USB Standard | 2.0 | 3.0
Supported OS | Win 7, Win 8 | Win 8, Win 8.1 (WSA)
Price (sensor + adapter) | ~ €160 | ~ €200
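
The field-of-view angles in the comparison translate into a visible area at a given distance. A minimal sketch of that arithmetic in plain Python (no Kinect SDK involved), assuming an ideal pinhole camera and treating the quoted angles as full opening angles; `coverage_m` is a helper name introduced here for illustration:

```python
import math

def coverage_m(fov_degrees: float, distance_m: float) -> float:
    """Width (or height) of the visible area at a given distance.

    Assumes an ideal pinhole camera: extent = 2 * d * tan(fov / 2).
    """
    return 2.0 * distance_m * math.tan(math.radians(fov_degrees) / 2.0)

# Field-of-view angles quoted in the Kinect 1 vs Kinect 2 comparison.
KINECT1_FOV = (57.0, 43.0)  # horizontal, vertical (degrees)
KINECT2_FOV = (70.0, 60.0)

for name, (h_fov, v_fov) in (("Kinect 1", KINECT1_FOV), ("Kinect 2", KINECT2_FOV)):
    width = coverage_m(h_fov, 2.0)
    height = coverage_m(v_fov, 2.0)
    print(f"{name} at 2 m: {width:.2f} m wide, {height:.2f} m tall")
```

Under these assumptions the Kinect 2 sees an area roughly 2.8 m wide at 2 m, noticeably wider than the Kinect 1 at the same distance.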

System Requirements

• Operating System
  • Windows 8/8.1 (x64)
  • Windows 8/8.1 Embedded Standard (x64)

• Hardware
  • 64-bit (x64) processor, i7 3.1 GHz (or higher)
  • 4 GB memory (or more)
  • Built-in USB 3.0 host controller
  • DirectX 11 capable graphics adapter: ATI Radeon (HD 5400 series, HD 6570, HD 7800), NVidia Quadro (600, K1000M), NVidia GeForce (GT 640, GTX 660), Intel HD 4400
  • Kinect v2 sensor (with power supply and USB hub)

• Software
  • .NET Framework 4.5
  • Visual Studio 2012 or higher
  • Microsoft Speech Platform Software Development Kit (Version 11)
  • Kinect for Windows SDK: http://www.microsoft.com/en-us/download/details.aspx?id=44561

• Applications
  • Windows Presentation Foundation (WPF)
  • Windows Store App

• Programming languages
  • C++, C#, VB.NET, …

https://dev.windows.com/en-us/kinect

Architecture

Architecture (1)

• Multiple Kinect-enabled applications can run simultaneously

Architecture (2)

• The sensor is a shared resource: many applications can access it simultaneously

• The sensor exposes a set of sources, one per functionality

• One or more readers can be opened on each source

• Each reader raises events that give access to references to the device’s frames

• Each frame provides the data of its source (e.g. color image, body data)

[Diagram: Sensor, Sources, Reader, Frame Reference, Frame]

Sensor

• Sensor usage
  • Get an instance of KinectSensor
  • Open the sensor
  • Use the sensor
  • Close the sensor

• If the device is unplugged
  • The KinectSensor instance remains valid
  • No more frames are delivered
  • The sensor's IsAvailable property becomes false

Sources

• The sensor exposes a source for every functionality
  • Color source
  • Depth source
  • Infrared source
  • Body Index source
  • Body source (skeleton, hand tracking, lean…)
  • Audio source

Readers

• Give access to frames through
  • Events
  • Polling

• Multiple readers can be created for each source

• A reader can be paused

Frame References

• Access the current frame through the AcquireFrame() method

• The frame contains metadata (e.g., for color: format, width, height)

• A frame MUST be processed quickly and then released (while a frame is held, new frames may not be delivered)
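
Why a frame has to be released promptly can be illustrated with a toy model. The sketch below is plain Python, not the Kinect SDK: `ToyReader` and `ToyFrame` are hypothetical classes, and the single-slot behaviour (no new frame while one is still held) is an assumption made to mirror the last bullet. The `with` statement plays the role of C#'s `using`.

```python
class ToyFrame:
    """Hypothetical stand-in for a frame; frees the reader's slot on exit."""

    def __init__(self, reader, number):
        self._reader = reader
        self.number = number

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self._reader.release()
        return False  # never swallow exceptions


class ToyReader:
    """Hypothetical reader with a single frame slot (an assumed model)."""

    def __init__(self):
        self._held = False
        self._next_number = 0
        self.dropped = 0

    def acquire_frame(self):
        """Return a frame, or None if the previous one was not released."""
        self._next_number += 1
        if self._held:
            self.dropped += 1
            return None
        self._held = True
        return ToyFrame(self, self._next_number)

    def release(self):
        self._held = False


reader = ToyReader()

# Good pattern: the frame is released as soon as the block ends.
for _ in range(3):
    with reader.acquire_frame() as frame:
        pass  # copy the data out here, then let the frame go
print("dropped with prompt release:", reader.dropped)

# Bad pattern: the frame is held, so later frames are never delivered.
stuck = reader.acquire_frame()
missed = [reader.acquire_frame() for _ in range(3)]
print("dropped while holding a frame:", reader.dropped)
```

In this model nothing is lost while frames are released promptly, whereas every frame that arrives while `stuck` is held is dropped.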

Frame

• Access frame data
  • Access the raw buffer directly
  • Take a local copy

MultiSourceFrameReader

• Delivers a matched set of frames from multiple sources in a single event

• Delivers frames at the lowest FPS of the selected sources

MultiSourceFrameReader MultiReader = Sensor.OpenMultiSourceFrameReader(
    FrameSourceTypes.Color | FrameSourceTypes.BodyIndex | FrameSourceTypes.Body);

// In the MultiSourceFrameArrived event handler:
var frame = args.FrameReference.AcquireFrame();
if (frame != null) {
    using (var colorFrame = frame.ColorFrameReference.AcquireFrame())
    using (var bodyFrame = frame.BodyFrameReference.AcquireFrame())
    using (var bodyIndexFrame = frame.BodyIndexFrameReference.AcquireFrame())
    {
        // Process the frames here; each one can still be null.
    }
}
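
The "lowest FPS" rule amounts to a one-line calculation. A minimal sketch in plain Python (not SDK code); the per-source rates are the ones quoted in this deck, and `matched_fps` is a name introduced here for illustration:

```python
def matched_fps(source_fps: dict) -> int:
    """Rate of a matched set of frames: the slowest selected source wins."""
    return min(source_fps.values())

# Color can drop to 15 fps in low light; the other sources stay at 30 fps.
bright_room = {"Color": 30, "BodyIndex": 30, "Body": 30}
dim_room = {"Color": 15, "BodyIndex": 30, "Body": 30}

print(matched_fps(bright_room))
print(matched_fps(dim_room))
```

So adding the color source to a multi-source reader can halve the rate of body data in a dark room, which is worth keeping in mind when choosing the sources to combine.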

Demo

Getting Started with Kinect 2 SDK

Data Sources

Kinect Data Sources – Color

• 1920 x 1080 array of color pixels

• 30 or 15 fps, based on lighting conditions

• Converted image formats: RGBA, BGRA, YUY2, …

• Raw format: YUY2

• Frame data can be
  • Used in raw format
  • Converted to other formats (with a computational cost)

• The buffer is a byte array

• The number of bytes per pixel depends on the format: 4 for RGBA and BGRA, 2 for raw YUY2
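
The size of the byte array follows directly from the resolution and the pixel format. A minimal sketch in plain Python (no SDK calls); the bytes-per-pixel values are properties of the pixel formats themselves rather than something read from the sensor, and `frame_bytes` is a helper name introduced here:

```python
WIDTH, HEIGHT = 1920, 1080

# Bytes per pixel for each format: RGBA/BGRA store four 8-bit channels,
# YUY2 packs two pixels into four bytes (2 bytes per pixel on average).
BYTES_PER_PIXEL = {"RGBA": 4, "BGRA": 4, "YUY2": 2}

def frame_bytes(fmt: str) -> int:
    """Size in bytes of one 1920 x 1080 color frame in the given format."""
    return WIDTH * HEIGHT * BYTES_PER_PIXEL[fmt]

for fmt in BYTES_PER_PIXEL:
    size = frame_bytes(fmt)
    print(f"{fmt}: {size} bytes per frame ({size / 2**20:.1f} MiB)")

# A destination buffer can be allocated once and reused for every frame.
bgra_buffer = bytearray(frame_bytes("BGRA"))
```

A converted BGRA frame is twice the size of the raw YUY2 frame, which is one reason conversion carries a cost.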

Kinect Data Sources – Infrared

• 512 x 424 pixels @ 30 fps

• Same physical sensor as the depth source

• Two sources
  • Infrared: single infrared frame
  • LongExposureInfrared: 3 frames combined (better signal-to-noise ratio, but blurrier images)

• Each pixel is 2 bytes (16 bits) and holds the IR intensity value

• Ambient light is removed: the SDK gets only the reflection of the infrared light projected by the device

Kinect Data Sources – Depth

• 512 x 424 pixels @ 30 fps

• Range: 0.5 – 4.5 meters (extended depth up to 8 m)

• Each pixel is 2 bytes (16 bits) and contains the distance in millimeters from the sensor’s focal plane

• The player index is not included in the depth pixels
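
Reading a depth buffer therefore means unpacking 16-bit values and scaling millimeters to meters. A minimal sketch in plain Python, independent of the SDK; the little-endian byte order, the sample values, and the helper names `decode_depth` and `to_meters` are assumptions made for illustration:

```python
import struct

MIN_MM, MAX_MM = 500, 4500  # range quoted above: 0.5 m to 4.5 m

def decode_depth(raw: bytes) -> list:
    """Unpack little-endian unsigned 16-bit depth values (millimeters)."""
    count = len(raw) // 2
    return list(struct.unpack(f"<{count}H", raw))

def to_meters(depth_mm: int):
    """Convert to meters, or None when outside the 0.5 m to 4.5 m range."""
    if MIN_MM <= depth_mm <= MAX_MM:
        return depth_mm / 1000.0
    return None

# Four made-up pixels: 0 (treated here as no reading), 1.2 m, 4.5 m, 8 m.
sample = struct.pack("<4H", 0, 1200, 4500, 8000)
print([to_meters(d) for d in decode_depth(sample)])
```

Values beyond 4.5 m are discarded here; an application that wants the extended range would simply raise `MAX_MM`.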

Demo

Color, Infrared and Depth sources

Kinect Data Sources – Body Index

• 512 x 424 pixels @ 30 fps

• Each pixel is 1 byte

• Pixel data
  • 0 to 5: index of the corresponding body, as tracked by the body source
  • > 5: no tracked body at that pixel
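
The pixel encoding can be turned into per-body masks with a few lines. A minimal sketch in plain Python (no SDK calls); the sample strip is made up, 255 is used only as an example of a value above 5, and `pixels_per_body` and `body_mask` are names introduced here:

```python
from collections import Counter

MAX_BODIES = 6  # valid body indices are 0 to 5

def pixels_per_body(body_index_frame: bytes) -> Counter:
    """Count how many pixels belong to each tracked body."""
    return Counter(v for v in body_index_frame if v < MAX_BODIES)

def body_mask(body_index_frame: bytes, body: int) -> list:
    """Boolean mask that is True where the pixel belongs to `body`."""
    return [v == body for v in body_index_frame]

# A made-up 8-pixel strip: 255 marks background, 0 and 3 are two bodies.
strip = bytes([255, 0, 0, 255, 3, 3, 3, 255])
counts = pixels_per_body(strip)
print(dict(counts))
print(body_mask(strip, 3))
```

A mask like this is the usual starting point for background removal, since it says which pixels belong to a given person.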

Kinect Data Sources – Body

• Range is 0.5-4.5 meters

• 30 fps

• Frame data is a collection of Body objects

• Each body has
  • 25 joints (each joint has a position in 3D space and an orientation)
  • Hand tracking (open, closed, “lasso”)
  • Face tracking and expressions
  • Bone orientations

• Up to 6 simultaneous bodies

• Hand state on up to 2 bodies

Body information

• The Body class contains useful properties
  • ClippedEdges: edges of the field of view that clip the body
  • HandState [Left/Right]: { Unknown, NotTracked, Closed, Open, Lasso }
  • HandConfidence [Left/Right]: { High, Low }
  • IsRestricted
  • IsTracked
  • TrackingId: 64-bit unique id
  • Joints: position in space of each joint
  • JointOrientations: orientation in space of each joint
  • Lean: inclination vector of the body
  • LeanTrackingState: { Inferred, NotTracked, Tracked }

• Up to 6 bodies simultaneously

• Hand state for up to 2 players simultaneously

Skeleton vs Body

[Figure: Kinect 1 and Kinect 2 side by side]

Demo

Body source

Kinect Data Sources – Audio

• Frame data is an audio beam

• Readers and events work as for the previous sources

• Acquire frames through the AcquireBeamFrames() method

Coordinate System

• ColorSpace (coordinate system of the color image)
  • Applies to: Color

• DepthSpace (coordinate system of the depth data)
  • Applies to: Depth, Infrared, BodyIndex

• CameraSpace (coordinate system with its origin at the sensor)
  • Applies to: Body (Joint)

Coordinate Mapper

• Three coordinate systems

• The coordinate mapper converts between the three systems

• It converts single points or multiple points at once

Name | Applies to | Dimensions | Units | Range | Origin
ColorSpacePoint | Color | 2 | pixels | 1920x1080 | Top left corner
DepthSpacePoint | Depth, Infrared, Body index | 2 | pixels | 512x424 | Top left corner
CameraSpacePoint | Body | 3 | meters | – | Infrared/depth camera
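
What a conversion from camera space to depth space involves can be sketched with an ideal pinhole projection. This is plain Python and only an approximation under stated assumptions: the focal lengths are derived from the 70 and 60 degree field of view quoted earlier, the axis directions (+X to the right of the image, +Y up) are chosen for illustration, and the real coordinate mapper relies on per-device calibration instead. `camera_to_depth_space` is a hypothetical helper, not an SDK call.

```python
import math

DEPTH_WIDTH, DEPTH_HEIGHT = 512, 424
H_FOV_DEG, V_FOV_DEG = 70.0, 60.0  # from the Kinect 1 vs Kinect 2 comparison

# Focal lengths in pixels for an ideal pinhole camera (an assumption).
FX = (DEPTH_WIDTH / 2) / math.tan(math.radians(H_FOV_DEG) / 2)
FY = (DEPTH_HEIGHT / 2) / math.tan(math.radians(V_FOV_DEG) / 2)
CX, CY = DEPTH_WIDTH / 2, DEPTH_HEIGHT / 2

def camera_to_depth_space(x_m: float, y_m: float, z_m: float):
    """Project a camera-space point (meters) to depth-space pixels.

    Pixel rows grow downwards from the top left corner while +Y is taken
    to point up, so the Y term is subtracted.
    """
    if z_m <= 0:
        raise ValueError("point must be in front of the camera")
    u = CX + FX * x_m / z_m
    v = CY - FY * y_m / z_m
    return u, v

# A point on the optical axis lands on the image center.
print(camera_to_depth_space(0.0, 0.0, 2.0))
# A point 0.5 m above the axis, 2 m away, lands above the center.
print(camera_to_depth_space(0.0, 0.5, 2.0))
```

The 3D point is in meters and the result is in pixels measured from the top left corner, matching the units and origins listed for CameraSpacePoint and DepthSpacePoint.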

Interaction

Hand Pointer Gestures

Engagement, Targeting, Press/Pan/Zoom

Hand Pointer States

Kinect Region & User Controls

• The KinectRegion user control defines a part of the user interface (XAML) where the user can interact through a hand pointer

• The region must be connected to the sensor instance

• Available gestures (“out-of-the-box”) usable in a KinectRegion
  • Click
  • Grab
  • Pan
  • Zoom

• KinectUserViewer gives visual feedback on the tracking state of the users

• Re-use default user controls

Demo

User Interaction

Kinect Studio

Recording, Playback, and Gesture Recognition

Recordable Data Sources

• Infrared: 13 MB/s

• Depth: 13 MB/s

• BodyFrame

• BodyIndex

• Color: 120 MB/s

• Audio: 32 KB/s

[Figure legend: each source is marked as either Record/Play or Record Only]
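
The recording rates for the image streams can be cross-checked from resolution, pixel size, and frame rate. A minimal sketch in plain Python, assuming uncompressed streams at 30 fps and raw YUY2 color at 2 bytes per pixel; the audio figure is not derived here, and `stream_rate` is a name introduced for illustration:

```python
FPS = 30

def stream_rate(width: int, height: int, bytes_per_pixel: int, fps: int = FPS) -> int:
    """Uncompressed data rate of a stream in bytes per second."""
    return width * height * bytes_per_pixel * fps

rates = {
    "Infrared": stream_rate(512, 424, 2),   # 16-bit intensity
    "Depth": stream_rate(512, 424, 2),      # 16-bit distance in mm
    "BodyIndex": stream_rate(512, 424, 1),  # 1 byte per pixel
    "Color": stream_rate(1920, 1080, 2),    # raw YUY2, 2 bytes per pixel
}

for name, rate in rates.items():
    print(f"{name}: {rate / 1e6:.1f} MB/s")
```

Depth and infrared come out at about 13 MB/s each and color at roughly 124 MB/s, in line with the rounded figures on the slide.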

Gesture Recognition

Heuristic

• Gesture is a coding problem

• Quick to do simple gestures/poses (hand over head)

• ML can also be useful to find good signals for the heuristic approach

Machine Learning (ML) with Gesture Builder (G.B.)

• Gesture is a data problem

• Signals which may not be easily understandable by humans (progress in a baseball swing)

• Large investment for production

• Danger of over-fitting, which makes the detector too specific and stops it from recognizing generic cases
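
The "hand over head" pose shows how little code a heuristic gesture needs. A minimal sketch in plain Python, assuming joint positions are available as (x, y, z) tuples in meters with +Y pointing up; the joint names follow the ones used for Kinect 2 bodies, while the 5 cm margin, the sample poses, and the function itself are assumptions made for illustration:

```python
from typing import Dict, Tuple

Point = Tuple[float, float, float]  # (x, y, z) in meters, +Y pointing up

def hand_over_head(joints: Dict[str, Point], margin_m: float = 0.05) -> bool:
    """True when either hand is more than `margin_m` above the head."""
    head_y = joints["Head"][1]
    return any(
        joints[hand][1] > head_y + margin_m
        for hand in ("HandLeft", "HandRight")
    )

# Made-up joint positions for two poses.
arms_down = {
    "Head": (0.0, 0.6, 2.0),
    "HandLeft": (-0.3, -0.2, 2.0),
    "HandRight": (0.3, -0.2, 2.0),
}
right_hand_up = {
    "Head": (0.0, 0.6, 2.0),
    "HandLeft": (-0.3, -0.2, 2.0),
    "HandRight": (0.2, 0.9, 2.0),
}

print(hand_over_head(arms_down))
print(hand_over_head(right_hand_up))
```

A rule like this is quick to write and easy to reason about, but it does not extend to gestures such as a baseball swing, where the relevant signal is hard to express by hand.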

Visual Gesture Builder (1)

• New tool integrated with v2 SDK

• Organize data using projects and solutions

• Give meaning to data by tagging gestures

• Build gestures using machine learning technology
  • Adaptive Boosting (AdaBoost) Trigger: determines if the player is performing the gesture
  • Random Forest Regression (RFR) Progress: determines the progress of the gesture performed by the player

• Analyze / test the results of gesture detection

• Live preview of results

Visual Gesture Builder (2)

[Diagram; the only surviving label is "Your Application"]

Resources

• General Info & Blog https://dev.windows.com/en-us/Kinect

• Purchase Sensor http://goo.gl/ZsMtBx

• Developer Forums https://goo.gl/bpptyq

• Twitter Account @KinectWindows

• A Facebook Group http://on.fb.me/1LSflbX

• A LinkedIn Group http://linkd.in/1J9gFcY

• A Twitter Account @KinectDevelop

• A Google Plus Page http://bit.ly/1SHtduT
