Research Article Kinect-Based Vision System of Mine Rescue...

Research ArticleKinect-Based Vision System of Mine Rescue Robot forLow Illuminous Environment

Wanfeng Shang Xiangang Cao Hongwei Ma Hailong Zang and Pengkang Wei

School of Mechanical Engineering Xirsquoan University of Science amp Technology Xirsquoan Shaanxi 710054 China

Correspondence should be addressed to Wanfeng Shang shangwanfenggmailcom

Received 25 December 2014 Accepted 4 February 2015

Academic Editor Yong Zhang

Copyright copy 2016 Wanfeng Shang et alThis is an open access article distributed under the Creative Commons Attribution Licensewhich permits unrestricted use distribution and reproduction in any medium provided the original work is properly cited

This paper presents Kinect-based vision system of mine rescue robot working in illuminous underground environment Thesomatosensory system of Kinect is used to realize the hand gesture recognition involving static hand gesture and action A K-curvature based convex detection method is proposed to fit the hand contour with polygon In addition the hand action iscompleted by using theNiTE librarywith the framework of hand gesture recognition In addition the proposedmethod is comparedwith BP neural network and templatematching Furthermore taking advantage of the information of the depthmap the interface ofhand gesture recognition is established for humanmachine interaction of rescue robot Experimental results verify the effectivenessof Kinect-based vision system as a feasible and alternative technology for HMI of mine rescue robot

1 Introduction

Rescue is very urgent after coalmine accident because after 48hours victim mortality drastically increases owing to expo-sure to bad air and lack of food water andmedical treatment[1 2]Themine tunnelmay further collapse injuring or killingthe trapped survivors or rescuers Noxious gas leaks andexplosions are also possible At such time robots have a greatpotential to assist in these underground operations ahead ofrescue teams and report conditions that may be hazardousto people When explosive conditions exist or heavy smokeenters the mine roadway robots can become an invaluabletool These mobile robots navigating deep into rubbles cansearch for survivors and transfer on-site video and atmo-spheric monitoring information for rescuers to confirm safestate or identify potentially hazardous conditions for minerescue team [3]

It is an urgent challenge for intelligent rescue robotsworking underground mine tunnel that is a long distancefrom ground After mine accidents occur the environmentof undergroundmine tunnel is unknown and low illuminousand even has no light (the power system has collapsed)With the development of mobile devices and sensors voiceand hand gestures have become a popular way to interactwith robots and personal computers [4] Voice is vulnerable

to environmental noise and is not suitable for continuousremote control It is difficult to separate different peoplersquosintonation Further survivors trapped under groundmay noteasily take a breath for exiting harmful gas So the voice datais unfit for human machine interaction (HMI) undergroundmine tunnel In comparisonwith voice gestures aremore nat-ural intuitive and robust Besides gestures can convey moreextensive amount of information than sound Thus manyresearchers pay attention to gesture recognition research

Currently most technologies of the hand gesture recogni-tion are based on ordinary cameras [5] However RGB colorimages from the ordinary camera have a disadvantage thatthey cannot provide enough information for tracking handsin 3D space because much of the spatial position informationhas to be inferred from 2D-3Dmappings During the imagingprocess a lot of information is lost In recent years withthe development of somatosensory interactive devices theresearch methods based on the depth camera are rising Thecamera is capable of obtaining the 3D depth informationsuch as TOF camera [6] stereophotogrammetric system [7]Both of them are very accurate but their use is limited by highcost time space and expertise requirements Kinect providesa possibility for improving the tradeoff between performanceand price in the design of gesture interfaces

Hindawi Publishing CorporationJournal of SensorsVolume 2016 Article ID 8252015 9 pageshttpdxdoiorg10115520168252015

2 Journal of Sensors

USB

Kinect camera

CANopen

CANopen

Actuator

Tracked mobile robot

Figure 1 Architecture of vision system of rescue robot

Kinect is a new integrated 3D sensor capturing depthimage andRGB image It has beenwidely used inmany differ-ent fields such as entertainment [8] industry automation [9]medical applications [10] and remote control of robot [11]because of its inexpensive depth sensingThe complementarynature of the depth and visual RGB information in Kinectsensor opens up new opportunities to solve fundamentalproblems in human behavior recognition for HMI On thecondition that the recognition environment is very dark thetarget recognition will be conventionally out of functionWhile a depth image captured by Kinect is not affected bylights the target recognition cannot be affected by complexbackground or lights in the field of object recognition Hencethe target or hand gesture can be recognized in the dark orin places of complex images with Kinect Furthermore theordinary camera is unfit for working underground whereits image may lose detailed information It is because theenvironment of underground mine tunnel is low illuminouseven dark in part of tunnel Therefore the Kinect is a viablesensor to use for HMI of rescue robot underground miningtunnel

However few researches are about the application ofKinect to a deployed rescue robot though theKinect has beenused extensively in robot for navigation [12] environmentmapping [13] and object manipulation [14] One undergrad-uate team from the University of Warwick installed Kinecton robot to perform 3D SLAM in simulated disaster settingsnamely the RoboCup Rescue competition [15] But theenvironment is indoors and moreover the highly structuredRoboCup course (made largely of diffuse plywood) lendsitself easily to the task Suarez and Murphy [1] reviewedKinectrsquos use in rescue robotics and similar applications andhighlighted the associated challenges but did not detail theKinect operation technology on rescue robot for hand ges-ture navigation remote control and so forth Developmentof Kinect makes some classical recognition methods andalso introduces some new research technologies to improvehuman behavior recognition So far Kinect-based full-body3D motion estimation has been exploited to track bodyjoints [8ndash10] while accurately recognizing small gestures inparticular hand motions is still hard work Many researches

mainly include hand gesture and body parts recognition andposture tracking Raheja et al [16] used Kinect to trackfingertip and palm via its SDK apps Thanh et al [17] teamcaptured 3D histogram data of finger joint information fromKinect images and recognized hand gesture with TF-IDFalgorithm Doliotis et al [18] proposed a clutter-toleranthand segmentation algorithm where 3D pose estimation wasformulated as a retrieval problem but the local regressions forsmoothing the sequence of hand segmentationwidths did notallow the algorithm to achieve real-time performance Withthe aim to extend the set of recognizable gestures Wan et al[19] proposed a method to recognize five gestures usingKinect but one limitation of this work is the impossibilityto detect vertical gestures Pedersoli et al [20] displayeda unified open-source framework for real-time recognitionof both static hand-poses and dynamic hand gestures butrelying only on depth information Based on the above itcan be attained that hand gesture recognition depends moreon depth map while body posture is liable to focus on bodyskeleton joint data captured from Kinect

Therefore the paper proposes a Kinect-based visionsystem for rescue robot underground mining tunnel andachieves hand gesture recognition combining depth map andskeleton joint information

This paper is organized as follows Section 2 introducesthe architecture of Kinect-based vision system of rescuerobot Section 3 presents the hand gesture recognition in lowilluminous environment using OpenNI and NiTE library forWindowsThe experimental results are analyzed in Section 4Finally the conclusion and some future work are given inSection 5

2 Architecture of Kinect-Based Vision System

The Kinect-based vision system architecture of rescue robotfor gesture recognition is shown in Figure 1 The system iscomposed of motor actuators Kinect PC and a platform oftrackedmobile robot used undergroundmine tunnel Kinectdirectly connected to host computer through USB port isactually a 3D somatosensory camera with a low price Onthe Kinect there is an Infrared (IR) Projector a Color (RGB)

Journal of Sensors 3

Start

Open Kinect initialOpenNI and NiTE interface

functions

Capture RGB image and depth map

Process image stream

Static hand gesture

recognition

Dynamic hand gesture

recognition

Send control commandto drive robot

Stop recognizing

Close Kinect

End

Yes

No

Yes

No

Yes

Figure 2 Hand gesture recognition workflow

Camera and an Infrared (IR) Sensor For purposes of 3Dsensing the IR Projector emits a grid of IR light in frontof it This light then reflects off objects in its path and isreflected back to the IR SensorThe pattern received by the IRSensor is then decoded in the Kinect to determine the depthinformation and then is sent to another device via USB forfurther processing It offers 640 lowast 480 pixelsrsquo (about 30 fps)RGB image and depth image Each RGB image is 24 bits Eachpixel in the depth image is 16 bitsmm whose effective depthranges from 0mm to 4096mm Kinect software is capableof automatically calibrating the sensor based on the userrsquosphysical environment accommodating for the presence ofobstacles

The system in Figure 1 is a slave control model withKinect-based vision system The computer acquires depthinformation of hand gesture from survivors gives recognitionresults and then transmits commands to actuators to controlrobots for serving survivors or other operators It mainlyserves survivors tapped in collapsed mining tunnel Whenthe gas in underground tunnel is heavy smoke and too badfor these survivors to breath they cannot make any sound

At this time the only way to control rescue robot is handgesture recognition of HMI for survivors

3 Hand Gesture Recognition

Generally hand gestures fall into two categories such asstatic gestures and dynamic gestures Both of them requireproper recognition means by which they can be properlydefined to the machine Since OpenNI and NiTE libraryown the framework of Kinect this paper used these libraryfunctions to realize the simple hand action while being morefocused on recognizing details about static hand gesture Theworkflow is showed by Figure 2

31 Capture Depth Image For the hand gesture recognitionthe first step is to obtain the image data captured fromKinectWith the OpenNI and NiTE we can obtain the images in640 lowast 480 pixelsrsquo resolution at 30 fps The OpenNI standardAPI enables natural-interaction developers to track real-life(3D) scenes by utilizing data calculated from the input ofa sensor for example representation of a hand locationOpenNI is an open-source API that is publicly available


Image capture Hand segment

Fitting hand contour with polygon Contour extraction

Hand gesture recognition

Figure 3 Workflow of static hand gesture recognition

In this project the Kinect is driven by the program ofOpenNI and the usage of NiTE library is based on the driverThe depth map needs to be captured by NiTE class Hencethere must be three steps involving initialization OpenNIOpen device and initialization NiTE to open Kinect Beforeusing OpenNI classes and functions the necessary step isto initialize OpenNI and then we can acquire the deviceinformation by defining the object of a device class and usingan open function When using NiTE classes and functionsit is similar to OpenNI namely it needs to be initialized indefined function Through the above steps Kinect can beused normally

NiTE includes hand tracking framework and gesturerecognition framework which is completed on the basis ofthe depth data The method of obtaining the depth datais encapsulated in the HandTracker class and the classprovidesmethods to get hand position transform coordinateand detect hand gesture The HandTrackerRef class storesthe hand ID and the recognition results Finally the depthinformation captured needs to be converted into a gray scaleimage

32 Static Hand Gesture Recognition In above section wehave obtained the depth map and RGB data about handgestures In the next we need to recognize the fingertip andcount the number based on the hand tracking data TheKinect with the released NiTE framework can provide 20major joints of human skeleton but only one joint is aboutpalm information to locate a hand The palm joint doesnot include the contour of the palm fingers and fingertipsinformation and then not identify the static gestures orfingers Hence the paper proposes fingertip recognitionbased on extraction contour of straight and bend fingerThe workflow of static hand gesture recognition is shown inFigure 3 The process is to first locate and segment a handfrom the depth streams and then to extract contour anddetection fingertip using119870-curvature algorithm

321 Hand Segment To get the contour of hand we need tosegment hand from the depth streams In this project we usethe segmentation method based on the depth thresholdingThe segmentation range is from the depth value of a handcenter point to the value adding or subtracting the depththreshold

Let us define the hand skeleton joint as 119862119894 119894 = left and

right respectively representing the palm center point of lefthand and right hand We can easily get 119862

119894from depth image

of hand gesture using open-source function of Kinect SDKsoftware The distance between the hand center point 119862

119894and

Kinect 119889119894can be expressed by

119889119894= frame sdot width times 119862

119894sdot 119884 + 119862

119894sdot 119883 (1)

where frame represents a framework of depthmap and widthdenotes a width value of depth image Each pixel in depthimage is 16 bits but only 13 bits is effective for depth valueSo the true depth of each pixel depth[119895] can be achieved by

119889119895= depth [119895] ≫ 3 (2)

The judging rule for hand segment is defined as

119904119889 =

1 119889119895le 119889119894+ 119904

minus1 119889119895gt 119889119894+ 119904

(3)

where 119904 is a little constant denoting a depth threshold and 119904119889

is a logic result to judge whether each pixel point of depthimage is within the palm If each pixel value in a depth imageis less than the sum of the depth of skeleton joint and a depththreshold value the pixel point is within the palm otherwiseit is not within the palm In addition if 119904 lt 0 the thresholdvalue 119904 has a change range which can be estimated accordingto length of bend finger According to (1)ndash(3) we get thesegment result of one of hand gestures as shown by Figure 4

In fact there is no absolute static state and hand gestureis generally dynamic and we need the recognition systemto respond fast to further remote control or navigation orcomplex HMIThus the recognition program of hand gesturemust run fast The method provides a feasibility of a simpleand fast operation to ensure the accuracy of the dynamic handsegmentation

322 Hand Contour Extraction Before extracting the handcontour we have to improve the image further In Figures 5and 6 due to the RGB camera and the depth camera notlocating at the same position their pixel coordinates of acorresponding object in RGB image and depth image cannotbe matched There exist some shadows around the handwhich can severely affect the performance of the recognitionBesides there is inevitable noise in the images Thus we mustpreprocess the images to eliminate them This paper appliesthe median filter method [20] to eliminate the noise and usesthe image binarization method to extract the hand shapeas indicated in Figure 5 Due to the difference of gray scalebetween black and white the hand contour is very distinctWe can use the open-source function of OpenNI about edgeprocess to effectively extract the hand contour as displayed inFigure 6

323 Fitting Hand Contour with Polygons For a detailedrecognition of hand gesture the basic idea is to fit the handcontour with polygons and then count the hand finger byidentifying the convex points and concave points betweenfingers This paper uses 119870-curvature algorithm to recognizefingertips that is convex points and concave points of palmAs shown in Figure 7 the point 119862 is the palm center point


5

Palm center point

(a) Depth image (b) Hand segmentation

Figure 4 Hand segmentation result

Figure 5 Image binarization

119875(119894) is any point of the hand contour 119875(119894+119896) is the119870th pointafter 119875(119894) and 119875(119894 minus 119896) is the 119870th point before 119875(119894)

1198811is a

vector that points from 119875(119894 + 119896) to 119875(119894) 1198812is a vector that

points from 119875(119894 minus 119896) to 119875(119894) We take a cosine function of theangle 120572 between

1198811and

1198811as 119870-curvature of 119875(119894) point The

angle 120572 can be expressed by

cos120572 =

1198811sdot

1198812

10038161003816100381610038161003816

1198811sdot

1198812

10038161003816100381610038161003816

(4)

By judging whether the angle 120572 is within a certainangle we can further decide whether the 119875(119894) point in thehand contour is convex or concave points for a finger It isimportant to choose a proper angle threshold to judge the120572 Ifthe angle threshold is too large the wrist neighborhood maybe misunderstood for fingertip while a too little thresholdwill cause recognization failure This paper used the anglethreshold less than 55 degrees that was verified many timesIf the 120572 is less than the defined angle threshold the 119875(119894) in thehand contour is a convex or concave point of a palm Assume

Figure 6 Hand contour extracted

1198891is a Euclidean distance between the middle point on the

line 119875(119894 + 119896) 119875(119894 minus 119896) and the point 119862 and 1198892is a Euclidean

distance between the119875(119894) and the point119862We define the rulesas follows

If 1198891gt 1198892 119875(119894) is a concave point between fingers

If 1198891

lt 1198892 119875(119894) is a convex point of finger that is

fingertip

Using the above defined law the recognition result ofhandpolygon contour is showedby Figure 8 In the image theblue line presents the hand contour the red circles indicatethe fingertips (namely convex points) and the blue pointsshow the concave points The bigger red circle is the centralpoint of the contour and it is calculated by the followingequations

119898119894119895= sum

119909119910

(arry (119909 119910) sdot 119909119894 sdot 119910119895)

119909 =

11989810

11989800

119910 =

11989801

11989800

(5)


P(i + k)

P(u)

120572d1

d2

P(i)

C

rarr

V1 rarr

V2

P(i minus k)

(a) Hand concave point

P(i + k)

120572

d1d2

P(i)

C

rarr

V1

rarr

V2

P(i minus k)

P(u)

(b) Hand convex point

Figure 7 Hand polygon contour vertex recognition

Figure 8 Polygon fitting of hand gesture

However for a practical application the method cannotquite express all numbers of hand gestures only to count theabove point especially in the situation that the depth mapcaptured by Kinect is dithering To solve the problem thepaper adds an auxiliary point which locates at the point awayfrom the contour central point 50 pixels and then counts thenumber of the convex points above the auxiliary point Therecognition results are shown in Tables 1 and 2 From thetwo tables we can determine the criterion that suits to eachnumber and then recognize the number via the criterion

33 Dynamic Hand Gesture Recognition For dynamic handgesture recognition this paper used dynamic time warp-ing (DTW) [21] to detect hands in complex and clutteredbackground DTW is dynamic programming and needs a lotof sample templates to train According to this frameworkthe gesturing hand is detected using a motion detectionmethod based on frame difference and depth segmentationTrajectory recognition instead is performed using the nearestneighbor classification framework which uses the similaritymeasure returned by DTW We set four samples of handgesture that is wave click going left and going rightHowever it is unnecessary to obtain and analyze the humanbody data firstly and then determine the hand position

Table 1 Different number recognition results

Number Extracted handcontour convex

Recognized handpolygon convex

0 3 01 2 or 3 12 3 23 3 or 4 34 4 45 5 or 6 4

Table 2 Different number recognition based on auxiliary point


Convex points based onauxiliary point

0 3 01 2 or 3 12 3 23 3 or 4 3 or 44 4 45 5 or 6 5

In NiTE library a startGestureDetection function in classHandTracker can detect simple hand gestures for examplewave whose parameter is the hand skeleton joint While thehandmoving from right to left or from left to right trajectoryrecognition is required At such time we use DTWalgorithmto recognize the hand gesture Most of the existing methodsfocus either on static signs or on dynamic gestures As a resultbeing able to classify both types of hand expressivity allowsfor understanding the most of hand gesture

4 Experimental Results

In this section many tests are made to validate the proposedstatic hand gesture recognitionmethod and the Kinect-basedvision system for rescue system in Figure 9The vision systemof rescue system is to combine the proposed static handgesture and dynamic hand gesture recognition method Theusersrsquo gestures captured from the Kinect sensor are compared


Figure 9 Kinect-based vision system

with these gestures defined before If the captured gestureis in the threshold range of the defined gesture the gesturecan be recognizedThen the corresponding command is sentto control the movement direction of rescue robots throughWiFi network

Since each number presented by hand gesture has thedifferent convex points and concave points this paper makesuse of the feature to recognize the number The resultsare shown in Figure 10 Besides the proposed method ofhand polygon contourwith119870-curvature computing is furthercompared with two different recognition methods that isan artificial neural network and template matching The testadopted a BP network with 11 input nets 8 hidden nets and5 output nets and a template matching based on correlationcoefficient In the test 90 depth images of hand gesture arecaptured at different hand angle and distance from Kinectamong which 30 images are sample data and 60 images areused to train BP network As the second column shown inTable 3 the recognition rate of polygon contour with 119870-curvature computing is higher than that of the templatematching method Although both of them use less runtimethe templatematchingmethodovermuch relies on the gesturetemplate and so has lower recognition rate The BP networkwith 60 pieces of train data has higher recognition rate thanthe polygon contour method as shown in the third columnbracket of Table 3 while the BP network with 30 pieces oftrain data outside the bracket has lower value The higherrecognition rate of BP network is at the cost of a large numberof samples and too much runtime which is unfit for real-time recognition and further control Thus the proposedpolygon contour with 119870-curvature computing has higherrecognition rate than the BP network and template matchingwhile being without much sacrifice of runtime It validatesthat the proposed method for hand gesture recognition isfeasible to implement the real-time HMI control

Figure 11 is an interface for HMI by which users cansend a recognition result to the control system of rescuerobot The recognition of hand gesture is to mainly relyon depth information and the RGB image provides somecomplementary information for hand contour extractionThus as shown in Figure 11 it is achieved that even though

Table 3 Recognition rate of different recognition methods

Number Polygon contourrecognition BP network Template

matching0 100 100 (100) 971 93 90 (100) 832 87 77 (90) 673 77 67 (83) 474 70 64 (80) 435 76 63 (77) 40

the environment is low illuminous or darker the recognitionresults of hand gesture are still better

5 Conclusions

This paper applies Kinect to a control platformof rescue robotand presents Kinect-based vision system for undergroundmining tunnel which is low illuminous For worse air envi-ronment the voice recognition does not work for survivorsin underground mine tunnel because these survivors cannottake a breath Also for low illuminous environment the RGBimage information cannot be extracted easily Kinect hasthe ability to capture the target information from its depthsensor thereby it can work even in darkness environmentThis paper proposes a static hand gesture recognitionmethodinvolving hand contour extraction and 119870-curvature basedhand polygon convex detection Furthermore the interface ofimage process combines the proposed static hand recognitionand the DTW algorithm for dynamic hand recognitionFurthermore a comparison test is made among119870-curvaturepolygon contour method BP neural network and templatematching which illustrates that the proposed method hashigher recognition rate while being without much sacrificeof runtime Finally the proposed static hand recognition anddynamic hand recognition are carried out to recognize thehand gestures about five numbers and four dynamic gesturesExperimental results validate the Kinect-based vision system


0

(a)

1

(b)

2

(c)

3

(d)

4

(e)

5

(f)

Figure 10 Static hand gesture recognition result

Wave 3

(a) Vision system with light

Wave 3

(b) Vision system with only PC screen light

Figure 11 Interface of Kinect-based vision system of rescue robot


In conclusion it can provide a feasible vision system of rescuerobot in the low illuminous underground environment

However there are some constraints when applyingthe proposed algorithm such as the hand position rangebeing limited inevitable holes and noises around objectrsquosboundaries in the obtained images Hence a vision systemwith better performance in underground rescue robot needsto be explored more extensively Further improved Kinect-based image process schemes ormore efficient algorithms areexpected to appear in future

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Acknowledgments

This work was supported in part by the National NaturalScience Foundation of China (Grant 51305338) the Sci-entific Research Plan Project of Shaanxi Province (Grant2013JK1004) and Technology Innovation Project of ShaanxiProvince (Grant 2013KTCL01-02)

References

[1] J Suarez and R R Murphy ldquoUsing the Kinect for searchand rescue roboticsrdquo in Proceedings of the IEEE InternationalSymposium on Safety Security and Rescue Robotics (SSRR rsquo12)College Station Tex USA November 2012

[2] D Wei and W Ge ldquoResearch on one bio-inspired jumpinglocomotion robot for search and rescuerdquo International Journalof Advanced Robotic Systems vol 11 no 10 pp 1ndash10 2014

[3] Z Yao and Z Ren ldquoPath planning for coalmine rescue robotbased on hybrid adaptive artificial fish swarm algorithmrdquoInternational Journal of Control and Automation vol 7 no 8pp 1ndash12 2014

[4] A Poncela and L Gallardo-Estrella ldquoCommand-based voiceteleoperation of a mobile robot via a human-robot interfacerdquoRobotica vol 33 no 1 pp 1ndash18 2015

[5] W-M Liu and L-HWang ldquoThe soccer robot the auto-adaptedthreshold value method based on HSI and RGBrdquo in Proceedingsof the 4th International Conference on Intelligent ComputationTechnology and Automation (ICICTA rsquo11) vol 1 pp 283ndash286March 2011

[6] S I A Shah E N Johnson A Wu and Y Watanabe ldquoSinglesensor-based 3D feature point location for a small flying robotapplication using one camerardquo Proceedings of the Institution ofMechanical Engineers Part G Journal of Aerospace Engineeringvol 228 no 9 pp 1668ndash1689 2014

[7] R Lagisetty N K Philip R Padhi and M S Bhat ldquoObjectdetection and obstacle avoidance for mobile robot using stereocamerardquo in Proceedings of the IEEE International Conferenceon Control Applications pp 605ndash610 Hyderabad India August2013

[8] Y You T Tang and Y Wang ldquoWhen arduino meets Kinectan intelligent ambient home entertainment environmentrdquo inProceedings of the 6th International Conference on IntelligentHuman-Machine Systems and Cybernetics (IHMSC rsquo14) vol 2pp 150ndash153 Hangzhou China August 2014

[9] V A Prabhu A Tiwari W Hutabarat and C Turner ldquoMon-itoring and digitising human-workpiece interactions during amanual manufacturing assembly operation using Kinect TMrdquoKey Engineering Materials Advanced Design and Manufacturevol 572 no 1 pp 609ndash612 2014

[10] Y Yang F Pu Y Li S Li Y Fan and D Li ldquoReliability andvalidity of kinect RGB-D sensor for assessing standing balancerdquoIEEE Sensors Journal vol 14 no 5 pp 1633ndash1638 2014

[11] K Qian H Yang and J Niu ldquoDeveloping a gesture basedremote human-robot interaction system using kinectrdquo Interna-tional Journal of Smart Home vol 7 no 4 pp 203ndash208 2013

[12] N Abdelkrim K Issam K Lyes et al ldquoFuzzy logic controllersfor mobile robot navigation in unknown environment usingKinect sensorrdquo in Proceedings of the International Conference onSystems Signals and Image Processing pp 75ndash78 2014

[13] B-F Wu C-L Jen W-F Li T-Y Tsou P-Y Tseng and K-THsiao ldquoRGB-D sensor based SLAM and human tracking withBayesian framework forwheelchair robotsrdquo inProceedings of theInternational Conference on Advanced Robotics and IntelligentSystems (ARIS rsquo13) pp 110ndash115 IEEE Tainan TaiwanMay-June2013

[14] G Du and P Zhang ldquoMarkerless human-robot interface fordual robot manipulators using Kinect sensorrdquo Robotics andComputer-Integrated Manufacturing vol 30 no 2 pp 150ndash1592014

[15] K Buckstone L Judd N Orlowski M Tayler-Grint RWilliams and E Zauls ldquoWarwickmobile robotics urban searchand rescue robotrdquo Tech Rep University ofWarwick CoventryUK 2012

[16] J L Raheja A Chaudhary andK Singal ldquoTracking of fingertipsand centers of palm using kinectrdquo in Proceedings of the 3rd Inter-national Conference on Computational Intelligence Modellingand Simulation (CIMSim rsquo11) pp 248ndash252 September 2011

[17] T TThanh F Chen K Kotani andH-B Le ldquoExtraction of dis-criminative patterns from skeleton sequences for human actionrecognitionrdquo in Proceedings of the IEEE RIVF InternationalConference on Computing and Communication TechnologiesResearch Innovation and Vision for the Future (RIVF 12) pp1ndash6 Ho Chi Minh City Vietnam February-March 2012

[18] P Doliotis V Athitsos D I Kosmopoulos and S PerantonisldquoHand shape and 3D pose estimation using depth data froma single cluttered framerdquo in Proceedings of the InternationalSymposium on Visual Computing (ISVC rsquo12) Crete Greece July2012 vol 7431 of Lecture Notes in Computer Science pp 148ndash158Springer Berlin Germany 2012

[19] T Wan Y Wang and J Li ldquoHand gesture recognition systemusing depth datardquo in Proceedings of the 2nd International Con-ference onConsumer Electronics Communications andNetworks(CECNet rsquo12) pp 1063ndash1066 Yichang China April 2012

[20] F Pedersoli S Benini N Adami and R Leonardi ldquoXKin anopen source framework for hand pose and gesture recognitionusing kinectrdquoTheVisual Computer vol 30 no 10 pp 1107ndash11222014

[21] N Chahal and S Chaudhury ldquoHigh quality depth map estima-tion by kinect upsampling and hole filling using RGB featuresand mutual informationrdquo in Proceedings of the 4th NationalConference on Computer Vision Pattern Recognition ImageProcessing and Graphics (NCVPRIPG rsquo13) pp 1ndash4 December2013

International Journal of

AerospaceEngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014


Active and Passive Electronic Components

Control Scienceand Engineering

Journal of



RotatingMachinery


Hindawi Publishing Corporation httpwwwhindawicom

Journal ofEngineeringVolume 2014

Submit your manuscripts athttpwwwhindawicom

VLSI Design



Shock and Vibration


Civil EngineeringAdvances in

Acoustics and VibrationAdvances in



Electrical and Computer Engineering

Journal of

Advances inOptoElectronics


Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

SensorsJournal of


Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014


Chemical EngineeringInternational Journal of Antennas and

Propagation




Navigation and Observation



DistributedSensor Networks



USB

Kinect camera

CANopen

CANopen

Actuator

Tracked mobile robot

Figure 1 Architecture of vision system of rescue robot

Kinect is a new integrated 3D sensor capturing depthimage andRGB image It has beenwidely used inmany differ-ent fields such as entertainment [8] industry automation [9]medical applications [10] and remote control of robot [11]because of its inexpensive depth sensingThe complementarynature of the depth and visual RGB information in Kinectsensor opens up new opportunities to solve fundamentalproblems in human behavior recognition for HMI On thecondition that the recognition environment is very dark thetarget recognition will be conventionally out of functionWhile a depth image captured by Kinect is not affected bylights the target recognition cannot be affected by complexbackground or lights in the field of object recognition Hencethe target or hand gesture can be recognized in the dark orin places of complex images with Kinect Furthermore theordinary camera is unfit for working underground whereits image may lose detailed information It is because theenvironment of underground mine tunnel is low illuminouseven dark in part of tunnel Therefore the Kinect is a viablesensor to use for HMI of rescue robot underground miningtunnel

However few researches are about the application ofKinect to a deployed rescue robot though theKinect has beenused extensively in robot for navigation [12] environmentmapping [13] and object manipulation [14] One undergrad-uate team from the University of Warwick installed Kinecton robot to perform 3D SLAM in simulated disaster settingsnamely the RoboCup Rescue competition [15] But theenvironment is indoors and moreover the highly structuredRoboCup course (made largely of diffuse plywood) lendsitself easily to the task Suarez and Murphy [1] reviewedKinectrsquos use in rescue robotics and similar applications andhighlighted the associated challenges but did not detail theKinect operation technology on rescue robot for hand ges-ture navigation remote control and so forth Developmentof Kinect makes some classical recognition methods andalso introduces some new research technologies to improvehuman behavior recognition So far Kinect-based full-body3D motion estimation has been exploited to track bodyjoints [8ndash10] while accurately recognizing small gestures inparticular hand motions is still hard work Many researches

mainly include hand gesture and body parts recognition andposture tracking Raheja et al [16] used Kinect to trackfingertip and palm via its SDK apps Thanh et al [17] teamcaptured 3D histogram data of finger joint information fromKinect images and recognized hand gesture with TF-IDFalgorithm Doliotis et al [18] proposed a clutter-toleranthand segmentation algorithm where 3D pose estimation wasformulated as a retrieval problem but the local regressions forsmoothing the sequence of hand segmentationwidths did notallow the algorithm to achieve real-time performance Withthe aim to extend the set of recognizable gestures Wan et al[19] proposed a method to recognize five gestures usingKinect but one limitation of this work is the impossibilityto detect vertical gestures Pedersoli et al [20] displayeda unified open-source framework for real-time recognitionof both static hand-poses and dynamic hand gestures butrelying only on depth information Based on the above itcan be attained that hand gesture recognition depends moreon depth map while body posture is liable to focus on bodyskeleton joint data captured from Kinect

Therefore the paper proposes a Kinect-based visionsystem for rescue robot underground mining tunnel andachieves hand gesture recognition combining depth map andskeleton joint information

This paper is organized as follows Section 2 introducesthe architecture of Kinect-based vision system of rescuerobot Section 3 presents the hand gesture recognition in lowilluminous environment using OpenNI and NiTE library forWindowsThe experimental results are analyzed in Section 4Finally the conclusion and some future work are given inSection 5

2 Architecture of Kinect-Based Vision System

The Kinect-based vision system architecture of rescue robotfor gesture recognition is shown in Figure 1 The system iscomposed of motor actuators Kinect PC and a platform oftrackedmobile robot used undergroundmine tunnel Kinectdirectly connected to host computer through USB port isactually a 3D somatosensory camera with a low price Onthe Kinect there is an Infrared (IR) Projector a Color (RGB)


Start


functions



Static hand gesture

recognition


recognition


Stop recognizing

Close Kinect

End

Yes

No

Yes

No

Yes





















119894and



119894sdot 119884 + 119862

119894sdot 119883 (1)


119889119895= depth [119895] ≫ 3 (2)


119904119889 =

1 119889119895le 119889119894+ 119904

minus1 119889119895gt 119889119894+ 119904

(3)







5

Palm center point





1198811is a



1198811and



cos120572 =

1198811sdot

1198812

10038161003816100381610038161003816

1198811sdot

1198812

10038161003816100381610038161003816

(4)







If 1198891


fingertip


119898119894119895= sum

119909119910

(arry (119909 119910) sdot 119909119894 sdot 119910119895)

119909 =

11989810

11989800

119910 =

11989801

11989800

(5)


P(i + k)

P(u)

120572d1

d2

P(i)

C

rarr

V1 rarr

V2

P(i minus k)


P(i + k)

120572

d1d2

P(i)

C

rarr

V1

rarr

V2

P(i minus k)

P(u)









0 3 01 2 or 3 12 3 23 3 or 4 34 4 45 5 or 6 4




0 3 01 2 or 3 12 3 23 3 or 4 3 or 44 4 45 5 or 6 5











matching0 100 100 (100) 971 93 90 (100) 832 87 77 (90) 673 77 67 (83) 474 70 64 (80) 435 76 63 (77) 40


5 Conclusions



0

(a)

1

(b)

2

(c)

3

(d)

4

(e)

5

(f)


Wave 3


Wave 3








Acknowledgments


References
























RoboticsJournal of





Journal of



RotatingMachinery





VLSI Design



Shock and Vibration







Journal of



Volume 2014


SensorsJournal of





Propagation










Start


functions



Static hand gesture

recognition


recognition


Stop recognizing

Close Kinect

End

Yes

No

Yes

No

Yes





















119894and



119894sdot 119884 + 119862

119894sdot 119883 (1)


119889119895= depth [119895] ≫ 3 (2)


119904119889 =

1 119889119895le 119889119894+ 119904

minus1 119889119895gt 119889119894+ 119904

(3)







5

Palm center point





1198811is a



1198811and



cos120572 =

1198811sdot

1198812

10038161003816100381610038161003816

1198811sdot

1198812

10038161003816100381610038161003816

(4)







If 1198891


fingertip


119898119894119895= sum

119909119910

(arry (119909 119910) sdot 119909119894 sdot 119910119895)

119909 =

11989810

11989800

119910 =

11989801

11989800

(5)


P(i + k)

P(u)

120572d1

d2

P(i)

C

rarr

V1 rarr

V2

P(i minus k)


P(i + k)

120572

d1d2

P(i)

C

rarr

V1

rarr

V2

P(i minus k)

P(u)









0 3 01 2 or 3 12 3 23 3 or 4 34 4 45 5 or 6 4




0 3 01 2 or 3 12 3 23 3 or 4 3 or 44 4 45 5 or 6 5











matching0 100 100 (100) 971 93 90 (100) 832 87 77 (90) 673 77 67 (83) 474 70 64 (80) 435 76 63 (77) 40


5 Conclusions



0

(a)

1

(b)

2

(c)

3

(d)

4

(e)

5

(f)


Wave 3


Wave 3








Acknowledgments


References
























RoboticsJournal of





Journal of



RotatingMachinery





VLSI Design



Shock and Vibration







Journal of



Volume 2014


SensorsJournal of





Propagation






















119894and



119894sdot 119884 + 119862

119894sdot 119883 (1)


119889119895= depth [119895] ≫ 3 (2)


119904119889 =

1 119889119895le 119889119894+ 119904

minus1 119889119895gt 119889119894+ 119904

(3)







5

Palm center point





1198811is a



1198811and



cos120572 =

1198811sdot

1198812

10038161003816100381610038161003816

1198811sdot

1198812

10038161003816100381610038161003816

(4)







If 1198891


fingertip


119898119894119895= sum

119909119910

(arry (119909 119910) sdot 119909119894 sdot 119910119895)

119909 =

11989810

11989800

119910 =

11989801

11989800

(5)


P(i + k)

P(u)

120572d1

d2

P(i)

C

rarr

V1 rarr

V2

P(i minus k)


P(i + k)

120572

d1d2

P(i)

C

rarr

V1

rarr

V2

P(i minus k)

P(u)









0 3 01 2 or 3 12 3 23 3 or 4 34 4 45 5 or 6 4




0 3 01 2 or 3 12 3 23 3 or 4 3 or 44 4 45 5 or 6 5











matching0 100 100 (100) 971 93 90 (100) 832 87 77 (90) 673 77 67 (83) 474 70 64 (80) 435 76 63 (77) 40


5 Conclusions



0

(a)

1

(b)

2

(c)

3

(d)

4

(e)

5

(f)


Wave 3


Wave 3








Acknowledgments


References
























RoboticsJournal of





Journal of



RotatingMachinery





VLSI Design



Shock and Vibration







Journal of



Volume 2014


SensorsJournal of





Propagation










5

Palm center point





1198811is a



1198811and



cos120572 =

1198811sdot

1198812

10038161003816100381610038161003816

1198811sdot

1198812

10038161003816100381610038161003816

(4)







If 1198891


fingertip


119898119894119895= sum

119909119910

(arry (119909 119910) sdot 119909119894 sdot 119910119895)

119909 =

11989810

11989800

119910 =

11989801

11989800

(5)


P(i + k)

P(u)

120572d1

d2

P(i)

C

rarr

V1 rarr

V2

P(i minus k)


P(i + k)

120572

d1d2

P(i)

C

rarr

V1

rarr

V2

P(i minus k)

P(u)









0 3 01 2 or 3 12 3 23 3 or 4 34 4 45 5 or 6 4




0 3 01 2 or 3 12 3 23 3 or 4 3 or 44 4 45 5 or 6 5











matching0 100 100 (100) 971 93 90 (100) 832 87 77 (90) 673 77 67 (83) 474 70 64 (80) 435 76 63 (77) 40


5 Conclusions



0

(a)

1

(b)

2

(c)

3

(d)

4

(e)

5

(f)


Wave 3


Wave 3








Acknowledgments


References
























RoboticsJournal of





Journal of



RotatingMachinery





VLSI Design



Shock and Vibration







Journal of



Volume 2014


SensorsJournal of





Propagation










P(i + k)

P(u)

120572d1

d2

P(i)

C

rarr

V1 rarr

V2

P(i minus k)


P(i + k)

120572

d1d2

P(i)

C

rarr

V1

rarr

V2

P(i minus k)

P(u)









0 3 01 2 or 3 12 3 23 3 or 4 34 4 45 5 or 6 4




0 3 01 2 or 3 12 3 23 3 or 4 3 or 44 4 45 5 or 6 5











matching0 100 100 (100) 971 93 90 (100) 832 87 77 (90) 673 77 67 (83) 474 70 64 (80) 435 76 63 (77) 40


5 Conclusions



0

(a)

1

(b)

2

(c)

3

(d)

4

(e)

5

(f)


Wave 3


Wave 3








Acknowledgments


References
























RoboticsJournal of





Journal of



RotatingMachinery





VLSI Design



Shock and Vibration







Journal of



Volume 2014


SensorsJournal of





Propagation
















matching0 100 100 (100) 971 93 90 (100) 832 87 77 (90) 673 77 67 (83) 474 70 64 (80) 435 76 63 (77) 40


5 Conclusions



0

(a)

1

(b)

2

(c)

3

(d)

4

(e)

5

(f)


Wave 3


Wave 3








Acknowledgments


References
























RoboticsJournal of





Journal of



RotatingMachinery





VLSI Design



Shock and Vibration







Journal of



Volume 2014


SensorsJournal of





Propagation










0

(a)

1

(b)

2

(c)

3

(d)

4

(e)

5

(f)


Wave 3


Wave 3








Acknowledgments


References
























RoboticsJournal of





Journal of



RotatingMachinery





VLSI Design



Shock and Vibration







Journal of



Volume 2014


SensorsJournal of





Propagation














Acknowledgments


References
























RoboticsJournal of





Journal of



RotatingMachinery





VLSI Design



Shock and Vibration







Journal of



Volume 2014


SensorsJournal of





Propagation











RoboticsJournal of





Journal of



RotatingMachinery





VLSI Design



Shock and Vibration







Journal of



Volume 2014


SensorsJournal of





Propagation









Research Article Kinect-Based Vision System of Mine Rescue...

Documents

Transcript of Research Article Kinect-Based Vision System of Mine Rescue...