Long Text Keystroke Biometrics Study
Gary Bartolacci, Mary Curtin, Marc Katzenberg, Ngozi Nwana
Sung-Hyuk Cha, Charles Tappert
(Software Engineering Project Team + DPS Student)
Keystroke Biometric
- Biometrics are important for security applications
- Advantage: inexpensive and easy to implement; the only hardware needed is a keyboard
- Disadvantage: a behavioral rather than physiological biometric, and therefore easier to disguise
- One of the least studied biometrics, and thus a good topic for dissertation studies
Focus of Study
- Previous studies were mostly concerned with short character-string input: password hardening, short name strings
- We focus on long text input: 200 or more characters per sample
Focus of Study (cont.)
Applications of interest:
- Identification: a 1-of-n classification problem, e.g., determining the sender of an inappropriate e-mail in a business environment with a limited number of employees
- Verification: a binary (yes/no) classification problem, e.g., confirming the identity of a student taking an online exam
Software Components
- Raw keystroke data capture over the Internet (Java applet)
- Feature extraction (SAS software)
- Classification (SAS software): training and testing
Keystroke Data Capture (Java Applet)
Raw data recorded for each entry:
- Key's character
- Key code's text equivalent
- Key's location on the keyboard (1 = standard, 2 = left, 3 = right)
- Time the key was pressed (msec)
- Time the key was released (msec)
- Number of left, right, and double mouse clicks
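One raw keystroke record might be modeled as follows. This is a minimal illustrative sketch in Python (the study's capture code is a Java applet, and these field names are assumptions, not the applet's own):

```python
from dataclasses import dataclass

@dataclass
class KeyEvent:
    """One captured keystroke event (hypothetical field names)."""
    char: str        # key's character, e.g. "a"
    code: str        # key code's text equivalent, e.g. "Shift"
    location: int    # 1 = standard, 2 = left, 3 = right
    press_ms: int    # time the key was pressed (msec)
    release_ms: int  # time the key was released (msec)

    @property
    def duration_ms(self) -> int:
        """Key-press duration: release time minus press time."""
        return self.release_ms - self.press_ms

e = KeyEvent("a", "A", 1, 1000, 1095)
print(e.duration_ms)  # 95
```

Key-press durations and inter-key transition times are then derived from the press and release timestamps of successive events.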
Feature Extraction
- 10 means and 10 standard deviations of key-press durations: the 8 most frequent alphabet letters (e, a, r, i, o, t, n, s), plus the space and shift keys
- 10 means and 10 standard deviations of key transitions: the 8 most common digrams (in, th, ti, on, an, he, al, er), plus space-to-any-letter and any-letter-to-space transitions
- 18 counts: total keypresses for space, backspace, delete, insert, home, end, enter, ctrl, the 4 arrow keys, left shift, and right shift; total entry time; and left, right, and double mouse clicks
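The duration and transition statistics can be sketched as follows. This is an illustrative example, not the study's SAS code; it assumes events arrive as (char, press_ms, release_ms) tuples and measures digram transitions press-to-press, which may differ from the study's exact definition:

```python
import statistics

def duration_stats(events, letter):
    """Mean and std of key-press duration (release - press) for one letter."""
    d = [r - p for c, p, r in events if c == letter]
    return statistics.mean(d), statistics.stdev(d)

def transition_stats(events, digram):
    """Mean and std of press-to-press latency for one digram, e.g. 'th'."""
    t = [events[i + 1][1] - events[i][1]
         for i in range(len(events) - 1)
         if events[i][0] + events[i + 1][0] == digram]
    return statistics.mean(t), statistics.stdev(t)

# Two typings of "the": (char, press_ms, release_ms)
events = [("t", 0, 90), ("h", 150, 240), ("e", 310, 395),
          ("t", 600, 700), ("h", 780, 860), ("e", 930, 1010)]
print(transition_stats(events, "th"))  # mean 165 ms over latencies [150, 180]
```

Running each statistic over every sample yields the 38-element feature vector described above.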
Feature Extraction Preprocessing
- Outlier removal: remove samples more than 2 standard deviations from the mean; this prevents skewing of the feature measurements caused by the typist pausing
- Standardization: x' = (x - x_min) / (x_max - x_min) scales each feature to the range 0-1, giving roughly equal weight to each feature
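The two preprocessing steps can be sketched directly from the definitions above. This is a minimal illustration (the study used SAS; the sample values here are made up):

```python
import statistics

def remove_outliers(values, n_std=2.0):
    """Drop measurements more than n_std standard deviations from the mean."""
    m, s = statistics.mean(values), statistics.stdev(values)
    return [v for v in values if abs(v - m) <= n_std * s]

def standardize(values):
    """Min-max scale to 0-1: x' = (x - x_min) / (x_max - x_min)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

durations = [95, 102, 98, 101, 650, 97]  # 650 ms: the typist paused
cleaned = remove_outliers(durations)
print(cleaned)               # [95, 102, 98, 101, 97] -- the pause is dropped
print(standardize(cleaned))  # remaining values scaled into [0, 1]
```

Without the outlier step, a single long pause (like the 650 ms value) would dominate both the mean and the min-max range of that feature.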
Sample Datasets

Prior to standardization:

name      samplenum  txtsource     e_std     e_avg     a_std     a_avg     in_std  in_avg  rightshift_presses  leftshift_presses
RSauther  1          fable1     23.71617  97.86667  31.32635  114.9286  11.31371   117                   11                  0
RSauther  2          fable1     25.57777  96.67797  31.09691  105.2931  32.52691   133                   10                  0
RSauther  3          fable1     22.71698  98.20339  30.43853  112.1754  0.707107   156.5                 12                  0
RSauther  4          fable1     20.09434  91.84483  32.09517  108.2182  10.6066    117.5                 11                  0
RSauther  5          fable1     24.14769  93.77778  24.50619  112.5556  706.3997   656.5                 10                  0
RSauther  6          fable1     25.2213   99.23333  27.47559  115.0357  188.0904   289                   10                  0
RSauther  7          fable1     28.10179  96.33898  33.42515  116.2105  77.78175   211                   10                  0

After standardization:

name      samplenum  txtsource     e_std     e_avg     a_std     a_avg     in_std  in_avg    rightshift_presses  leftshift_presses
RSauther  1          fable1     0.559895  0.269287  0.332531  0.311489  0.015224  0.286847            0.846154                  0
RSauther  2          fable1     0.636784  0.25677   0.329981  0.218218  0.043768  0.307997            0.769231                  0
RSauther  3          fable1     0.518627  0.272833  0.322661  0.284839  0.000951  0.339061            0.923077                  0
RSauther  4          fable1     0.410305  0.205875  0.341078  0.246533  0.014272  0.287508            0.846154                  0
RSauther  5          fable1     0.577718  0.22623   0.256712  0.288518  0.950523  1                   0.769231                  0
RSauther  6          fable1     0.622061  0.283678  0.289723  0.312526  0.253092  0.51421             0.769231                  0
RSauther  7          fable1     0.741032  0.2532    0.355863  0.323899  0.104662  0.411104            0.769231                  0
Classification: Identification
- Nearest-neighbor classifier using Euclidean distance
- The input sample is compared to every training sample
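The nearest-neighbor identification step amounts to the following. This is an illustrative sketch with made-up 3-element feature vectors and subject labels (the actual vectors have 38 standardized features):

```python
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def identify(sample, training):
    """1-of-n classification: label of the closest training sample."""
    return min(training, key=lambda t: euclidean(sample, t[1]))[0]

training = [("alice", [0.1, 0.9, 0.3]),   # (subject label, feature vector)
            ("bob",   [0.8, 0.2, 0.7]),
            ("alice", [0.2, 0.8, 0.4])]
print(identify([0.15, 0.85, 0.35], training))  # alice
```

Because every test sample is compared against every training sample, the cost per identification grows linearly with the size of the training set.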
Experimental Design: Identification Experiment
- 8 subjects who knew the purpose of the experiment
- Training: 10 repetitions of text a (approx. 600 characters)
- Testing: 10 reps of text a; 10 reps of text b (same length as text a); 10 reps of text c (half the length of text a)
Experimental Design: Instructions for Subjects
- Subjects were told to input the data using their normal keystroke dynamics
- Subjects were asked to leave at least a day between entering samples
Experimental Design: Text a (about 600 characters)
This is an Aesop fable about the bat and the weasels. A bat who fell upon the ground and was caught by a weasel pleaded to be spared his life. The weasel refused, saying that he was by nature the enemy of all birds. The bat assured him that he was not a bird, but a mouse, and thus was set free. Shortly afterwards the bat again fell to the ground and was caught by another weasel, whom he likewise entreated not to eat him. The weasel said that he had a special hostility to mice. The bat assured him that he was not a mouse, but a bat, and thus a second time escaped. The moral of the story: it is wise to turn circumstances to good account.
Expected Outcomes: Recognition Accuracy
- Accuracy on text a > accuracy on text b (text a is the training text)
- Accuracy on text b > accuracy on text c (text b is longer than text c)
- Accuracy on texts a, b, and c > accuracy on arbitrary text (texts a, b, and c are similar: all Aesop fables)
Preliminary Results – Reduced Experiment
Reduced identification experiment:
- Smaller text input: "The quick brown fox jumps over the lazy dog."
- Fewer subjects: three project team members
- Fewer feature measurements: mean and standard deviation of the "e" and "o" key-press durations
- Accuracy of 80%, which is promising
Results – Comparison to Same Text
- Prior to standardization: only 59% accuracy
- With standardization: 100% accuracy (76 out of 76)

Confusion matrix after standardization (rows = actual subject, columns = predicted subject; cells are counts, with the row percentage in parentheses):

           Subj 1  Subj 2  Subj 3  Subj 4  Subj 5  Subj 6  Subj 7  Subj 8
Subject 1      10       0       0       0       0       0       0       0   (100%)
Subject 2       0      10       0       0       0       0       0       0   (100%)
Subject 3       0       0      10       0       0       0       0       0   (100%)
Subject 4       0       0       0       6       0       0       0       0   (100%)
Subject 5       0       0       0       0      10       0       0       0   (100%)
Subject 6       0       0       0       0       0       9       0       0   (100%)
Subject 7       0       0       0       0       0       0      10       0   (100%)
Subject 8       0       0       0       0       0       0       0      11   (100%)
Results – Comparison to Different Text of ~Equal Length
- Prior to standardization: only 38% accuracy
- With standardization: 98.5% accuracy (65 out of 66)

Confusion matrix after standardization (rows = actual subject, columns = predicted subject; Subject 6 does not appear in this comparison):

           Subj 1  Subj 2  Subj 3  Subj 4  Subj 5  Subj 7  Subj 8
Subject 1      10       0       0       0       0       0       0   (100%)
Subject 2       0       9       0       0       0       1       0   (90% / 10% as Subject 7)
Subject 3       0       0      10       0       0       0       0   (100%)
Subject 4       0       0       0       6       0       0       0   (100%)
Subject 5       0       0       0       0      10       0       0   (100%)
Subject 7       0       0       0       0       0      10       0   (100%)
Subject 8       0       0       0       0       0       0      10   (100%)
Results – Comparison to Different Text of Shorter Length
- Prior to standardization: only 14% accuracy
- With standardization: 97% accuracy (74 out of 76)

Confusion matrix after standardization (rows = actual subject, columns = predicted subject):

           Subj 1  Subj 2  Subj 3  Subj 4  Subj 5  Subj 6  Subj 7  Subj 8
Subject 1      10       0       0       0       0       0       0       0   (100%)
Subject 2       0       9       0       0       0       0       1       0   (90% / 10% as Subject 7)
Subject 3       0       0      10       0       0       0       0       0   (100%)
Subject 4       0       0       0       5       0       0       1       0   (83.33% / 16.67% as Subject 7)
Subject 5       0       0       0       0      10       0       0       0   (100%)
Subject 6       0       0       0       0       0      10       0       0   (100%)
Subject 7       0       0       0       0       0       0      10       0   (100%)
Subject 8       0       0       0       0       0       0       0      10   (100%)
Conclusions
- The system is a viable means of differentiating between individuals based on their typing patterns
- Standardization is crucial to the accuracy of the system
- The shorter the text used for verification, the lower the accuracy is likely to be
- Decreasing the number of feature measurements used also decreases accuracy