ASL2TXT Converting sign language gestures from digital images to text George Corser.
ASL2TXT
Converting sign language gestures from digital images to text
George Corser
Presentation Overview
• Concept
• Foundation: Barkoky & Charkari (2011)
  – Segmentation
  – Thinning
• My Contribution: Corser (2012)
  – Segmentation (similar to Barkoky)
  – CED: Canny Edge Dilation (Minus Errors)
  – Assumption: User trains his own phone
Concept
• Deaf and hearing people talking on the phone, each using their natural language
• Sign-activated commands, analogous to voice-activated commands
Situation: Drive Thru Window
1. Deaf person signs order
2. Phone speaks order
3. Confirmation on screen
Think: Stephen Hawking
Process Flow
• Requires several conversion processes
• Many have been accomplished
• Remaining: ASL2TXT
Goal: Find an Algorithm
• Find an image processing algorithm that recognizes the ASL alphabet
[Figure: image of a hand sign = A. Source: Web site]
Barkoky: Segmentation & Thinning
Barkoky counts endpoints to determine sign (doesn’t work for ASL)
Barkoky Process
Segmentation
1. Capture RGB image
2. Rescale
3. Extract using colors
4. Reduce noise
5. Crop at wrist
6. Result: hand segment
Thinning
7. Input: hand segment
8. Apply thinning
9. Find endpoints, joints
10. Calculate lengths
11. Clean short lengths
12. Identify gesture by counting endpoints
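Steps 9 and 12 can be sketched in a few lines. This is a minimal illustration, not Barkoky's code: it assumes the thinned hand is already a one-pixel-wide logical image, treats a white pixel with exactly one white 8-neighbor as an endpoint, and treats three or more as a joint. The Y-shaped skeleton and the variable names are made up for the example.

```matlab
% Sketch only: count endpoints and joints on a made-up Y-shaped skeleton.
skel = logical([0 0 0 0 0 0 0;
                0 1 0 0 0 1 0;
                0 0 1 0 1 0 0;
                0 0 0 1 0 0 0;
                0 0 0 1 0 0 0;
                0 0 0 0 0 0 0]);

% Count the white 8-neighbors of every pixel (center weight is 0).
kernel    = [1 1 1; 1 0 1; 1 1 1];
neighbors = conv2(double(skel), kernel, 'same');

endpoints = skel & neighbors == 1;   % tips of the skeleton
joints    = skel & neighbors >= 3;   % where branches meet

numEndpoints = nnz(endpoints);
numJoints    = nnz(joints);
fprintf('%d endpoints, %d joints\n', numEndpoints, numJoints);
```

With the Image Processing Toolbox, bwmorph offers 'thin', 'endpoints' and 'branchpoints' operations that could replace the hand-built kernel.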
1. Capture RGB Image
2. Rescale
% ---------- 1. Capture RGB image
a = imread('DSC04926.JPG');
figure('Name','RGB image'), imshow(a);
% ---------- 2. Rescale image to 205x154
a10 = imresize(a, 0.1);
figure('Name','Rescaled image'), imshow(a10);
3. Extract Hand Using Colors
% ---------- 3. Extract hand using color
abw10 = zeros(205,154,1);
for i=1:205, for j=1:154,
  if a10(i,j,2)<140 && a10(i,j,3)<100,
    abw10(i,j,1)=255;
  end;
end; end;
figure('Name','Extracted'), imshow(abw10);
Note: Color threshold code differs from Barkoky
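The same green/blue test can be written without loops. A minimal sketch, using the thresholds from the step 3 code (140 and 100) on a made-up 2x2 image; the names img, greenMax, blueMax, mask and abw are illustrative only.

```matlab
% Sketch only: vectorized form of the green/blue threshold from step 3.
% The 2x2 test image is made up for illustration.
img = zeros(2, 2, 3, 'uint8');
img(1,1,:) = reshape([200 120  80], 1, 1, 3);   % passes: G<140 and B<100
img(1,2,:) = reshape([200 180  80], 1, 1, 3);   % fails: green too high
img(2,1,:) = reshape([200 120 150], 1, 1, 3);   % fails: blue too high
img(2,2,:) = reshape([ 90  60  40], 1, 1, 3);   % passes

greenMax = 140;   % thresholds copied from the slide code
blueMax  = 100;

% Logical mask: true where both channel tests pass.
mask = img(:,:,2) < greenMax & img(:,:,3) < blueMax;

% Same 0/255 representation the slide code uses.
abw = 255 * double(mask);
disp(abw);
```

Keeping the thresholds in named variables makes it easier to recalibrate them per user from the training-set histograms.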
Colors: Training Set Histograms
Colors: Training Set (2)
[Figure: Excel chart of the Red, Green, and Blue channels]
Colors: Test Set Histograms
4. Reduce Noise
% ---------- 4. Reduce noise
for i=2:204, for j=1:154,
  % black above and below: clear the pixel
  if abw10(i-1,j,1)==0
    if abw10(i+1,j,1)==0, abw10(i,j,1)=0; end;
  end;
  % white above and below: fill the pixel
  if abw10(i-1,j,1)==255
    if abw10(i+1,j,1)==255, abw10(i,j,1)=255; end;
  end;
end; end;
abw10 = imfill(abw10,'holes');
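A one-pass version of the vertical-neighbor rule, as a sketch. It assumes the intent is to clear a pixel whose upper and lower neighbors are both black and to fill one whose neighbors are both white. Unlike an in-place loop, every decision here reads the original pixels, so results can differ slightly where noisy rows are adjacent. The test image and names are made up.

```matlab
% Sketch only: vertical-neighbor noise rule applied in one pass.
bw = logical([0 0 0;
              0 1 0;    % isolated white pixel: black above and below
              0 0 0;
              0 0 0;
              1 1 1;
              1 1 1;
              1 0 1;    % black gap: white above and below
              1 1 1;
              1 1 1]);

above = bw(1:end-2, :);   % pixel at row i-1 for each interior row i
below = bw(3:end,   :);   % pixel at row i+1
mid   = bw(2:end-1, :);   % the interior rows themselves

mid(~above & ~below) = false;   % both neighbors black -> black
mid( above &  below) = true;    % both neighbors white -> white

cleaned = bw;
cleaned(2:end-1, :) = mid;      % first and last rows are left untouched
disp(cleaned);
```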
5. Identify Wrist Position
% ---------- 5. Identify wrist position
for i=204:-1:1,
  for j=1:154,
    if abw10(i,j,1)==255, break; end;
  end;
  if j ~= 154 && abw10(i+1,j,1)~=255,
    wristi=i+1; wristj=j+1; break;
  end;
end;
Wrist Detection
• Algorithm searches bottom-to-top of image
• Finds a leftmost white pixel above black pixel
• Sets wrist position SE of found white pixel
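The three bullets above, sketched on a tiny made-up mask. This is an illustration rather than the slide code: it uses find to locate the leftmost white pixel in each row, and the names bw, wristRow and wristCol are hypothetical.

```matlab
% Sketch only: bottom-to-top wrist search on a made-up 5x5 mask.
bw = logical([0 0 0 0 0;
              0 0 1 1 0;
              0 1 1 1 0;
              0 0 1 1 0;
              0 0 0 0 0]);

wristRow = [];
wristCol = [];
for r = size(bw,1)-1 : -1 : 1          % start one row above the bottom edge
    c = find(bw(r,:), 1, 'first');     % leftmost white pixel in this row
    if ~isempty(c) && ~bw(r+1, c)      % white pixel sitting above a black one
        wristRow = r + 1;              % one row down and one column right:
        wristCol = c + 1;              % "south-east" of the found pixel
        break;
    end
end
fprintf('wrist at row %d, column %d\n', wristRow, wristCol);
```

If the leftmost white pixel lies in the last column, wristCol falls outside the image, so a real implementation would need to clamp it.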
Corser: Segmentation & CED
• Segmentation (similar to Barkoky)
  – Color threshold technique slightly different
  – American Sign Language (ASL) alphabet, not Persian Sign Language (PSL) numbers
• Image Comparison: Tried Several Methods
  – Full Threshold (Minus Errors)
  – Diced Segments (Minus Errors)
  – Endpoint Count Difference
  – CED: Canny Edge Dilation
ASL Training Set
Hit-or-miss: 23% Barkoky: 8%
ASL Test Set
[Figure: MATLAB display of hand-sign images labeled A through Z]
Hybrid Algorithm Example
% ---------- MATLAB Code -------------------
matchtotal = 0;
if abs(x10range - x20range) < 20,
  matchtotal = matchtotal + 10;
end;
if abs(y10range - y20range) < 20,
  matchtotal = matchtotal + 11;
end;
matchtotal = matchtotal - abs(h10 - h20);
% ----- h10, h20 are vector magnitudes -----
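A worked example of the hybrid score. The slides do not show how the range values are computed; this sketch assumes they are the width and height of each hand mask's bounding box. The two masks and the h10/h20 values are made up purely to exercise the arithmetic.

```matlab
% Sketch only: hybrid score on two made-up hand masks.
maskA = false(200, 150);  maskA(50:149, 40:99)  = true;   % 100 tall, 60 wide
maskB = false(200, 150);  maskB(30:159, 30:104) = true;   % 130 tall, 75 wide

% ASSUMPTION: "range" = bounding-box extent of the white pixels.
[rowsA, colsA] = find(maskA);
[rowsB, colsB] = find(maskB);
x10range = max(colsA) - min(colsA) + 1;
y10range = max(rowsA) - min(rowsA) + 1;
x20range = max(colsB) - min(colsB) + 1;
y20range = max(rowsB) - min(rowsB) + 1;

h10 = 42;   % hypothetical vector magnitudes; their derivation
h20 = 40;   % is not given in the slides

matchtotal = 0;
if abs(x10range - x20range) < 20      % widths differ by 15: +10
    matchtotal = matchtotal + 10;
end
if abs(y10range - y20range) < 20      % heights differ by 30: no bonus
    matchtotal = matchtotal + 11;
end
matchtotal = matchtotal - abs(h10 - h20);   % penalty of 2
fprintf('matchtotal = %d\n', matchtotal);
```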
Erosion Subtraction
Canny Edge
Canny Edge Dilation Code
% ---------- MATLAB Code -------------------
se = strel('disk',5);
a10 = edge(a10,'canny');
a20 = edge(a20,'canny');
a10 = imdilate(a10,se);
a20 = imdilate(a20,se);
% ----- Then calculate matches minus errors
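One plausible reading of "matches minus errors", offered as a sketch since the slides do not define it. Here a match is a pixel that is white in both images and an error is a pixel that is white in exactly one. The small matrices stand in for dilated edge maps, and the letter labels are made up.

```matlab
% Sketch only: score a probe image against two made-up templates.
probe = logical([1 1 0 0;
                 1 1 0 0;
                 0 0 0 0]);
templates = { logical([1 1 1 0;
                       0 1 0 0;
                       0 0 0 0]), ...
              logical([1 1 0 0;
                       1 0 0 0;
                       0 0 0 0]) };
labels = 'AB';   % hypothetical letter for each template

scores = zeros(1, numel(templates));
for k = 1:numel(templates)
    t = templates{k};
    matches = nnz(probe & t);        % white in both images
    errors  = nnz(xor(probe, t));    % white in exactly one image
    scores(k) = matches - errors;
end

[~, best] = max(scores);             % highest score wins
bestLabel = labels(best);
fprintf('best match: %s (score %d)\n', bestLabel, scores(best));
```

Dilating the Canny edges first makes this score tolerant of small shifts between the two hands, because thin edges that miss each other by a pixel or two still overlap after dilation.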
Experimental Results
Technique                             Correct
Full Threshold (Minus Errors)         19% (27%)
Diced Segments (Minus Errors)         23% (27%)
Barkoky Endpoint Count Diff.          8%
Hybrid - Height/Width/Endpoints       19%
Erosion Subtraction                   15%
Canny Edge Dilation (Minus Errors)    12% (35%)
Disadvantages
• Dependent on lighting conditions
• Fails with flesh-tone backgrounds
• Requires calibration to a specific user
• Limited applications: text messaging, activation (“sign” similar to voice activation)
• Some ASL numbers share hand shapes with letters (A=10, D=1, O=0, V=2, W=6)
• Alphabet is a tiny portion of full translation: complete translation may be many years away
Future Work
• Barkoky claims flesh tones can be detected, but I have yet to replicate (even Barkoky changed his color detection scheme)
• Could write letter-by-letter algorithm
• Could use range camera to compute distance of finger instead of shape of hand
• Motion analysis or edge count
• Many possibilities… we’ve only just begun!
Cue: music http://www.youtube.com/watch?v=__VQX2Xn7tI
The End