Dataset Generation and Annotation for Unconstrained Facial Recognition
REBECCA ERIKSON
Pacific Northwest National Laboratory
Global Identity Summit 2015 – Tampa, Florida
PNNL-SA-112974
Unconstrained Facial Recognition
Determine algorithm performance against challenging unconstrained face
data in video and still imagery.
DHS sponsored
NIST performing T&E of algorithm performance
PNNL providing dataset generation and annotation
Diverse face presentations
Statistically significant quantity of faces
Operationally relevant confounders
FIVE – Face In Video Evaluation (2013 – 2015)
CHEXIA – Child Exploitation Image Analytics (2015 and continuing)
October 8, 2015 Global Identity Summit 2015 2 PNNL-SA-112974
Face In Video Evaluation (FIVE)
Operationally realistic corpus of video data to support the evaluation and enhancement of facial recognition systems technology
PNNL role players with public crowds in 5 indoor live events
1) one-way crowd flow
2) two-way crowd flow
3) linear and serpentine queues
147 hours of video data
11 cameras – consumer grade with SD memory
Pixels on target variations
Varied pitch and yaw
14,401 annotated video segments
24 fps at 1920 x 1080
H.264 .mp4
Collected 2,153 still photographs for the “Watchlist” of the 64 unique role players
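For a sense of scale, the figures above can be combined in a quick back-of-the-envelope calculation. This is only a sketch: it assumes, purely for illustration, that all 147 hours were recorded at the stated 24 fps and 1920 x 1080, and the variable names are not from the original presentation.

```python
# Rough scale of the FIVE corpus, using only figures quoted on the slide.
# Assumption: every one of the 147 hours was recorded at 24 fps, 1920 x 1080.
HOURS_OF_VIDEO = 147
FRAMES_PER_SECOND = 24
FRAME_WIDTH, FRAME_HEIGHT = 1920, 1080
WATCHLIST_STILLS = 2153
ROLE_PLAYERS = 64

total_seconds = HOURS_OF_VIDEO * 3600
total_frames = total_seconds * FRAMES_PER_SECOND
pixels_per_frame = FRAME_WIDTH * FRAME_HEIGHT
stills_per_role_player = WATCHLIST_STILLS / ROLE_PLAYERS

print(f"Total frames: {total_frames:,}")
print(f"Pixels per frame: {pixels_per_frame:,}")
print(f"Average stills per role player: {stills_per_role_player:.1f}")
```

Under that assumption the corpus holds roughly 12.7 million frames, with about 34 watchlist stills per role player on average.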
Traditional Mugshots
Watchlist Composition
16 MP mugshots at 14 angles, repeated with eyeglasses if available
976 images of 64 individuals spanning up to 20 years for some individuals
600 ppi scans of ID images
Low-resolution “port of entry” images at 2 angles
Watchlist vs. Performance
RHS – Image taken same day
LHS – 10 years younger, no facial hair
Child Exploitation Image Analytics
(CHEXIA)
Seized images and videos from criminal investigations of child exploitation
Annotation of Face, Text, Tattoos and Patterns
More than 110,000 files completed to date
Face
Newborn to mature adult, including aging
No watchlist, mugshot, or guarantee of cooperative image of an individual
Variations in face quality including:
Expression
Angle
Lighting
Occlusion
Resolution
Blur
Annotation includes:
Bounding Box for each face
Eye/Nose Location
Eyes: Open, Closed, Sunglasses, Eyeglasses
Gender, if known
Age range in 3-5 year increments
Quality of Face Presentation
Reason for reduced Quality
192 Possible categories for Face
3 quality ranks
7 categories
8 additional confounders
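The annotation fields listed above could be captured in a record along the following lines. This is a minimal sketch, not the schema PNNL used: the class and field names, the bounding-box layout, and the interocular-distance helper (a common proxy for pixels on target) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from math import hypot
from typing import List, Optional, Tuple


class EyeState(Enum):
    OPEN = "open"
    CLOSED = "closed"
    SUNGLASSES = "sunglasses"
    EYEGLASSES = "eyeglasses"


class QualityReason(Enum):
    EXPRESSION = "expression"
    ANGLE = "angle"
    LIGHTING = "lighting"
    OCCLUSION = "occlusion"
    RESOLUTION = "resolution"
    BLUR = "blur"


@dataclass
class FaceAnnotation:
    """Hypothetical record for one annotated face (all names are illustrative)."""
    bounding_box: Tuple[int, int, int, int]  # assumed layout: x, y, width, height
    left_eye: Tuple[float, float]            # pixel coordinates
    right_eye: Tuple[float, float]
    nose: Tuple[float, float]
    eye_state: EyeState
    gender: Optional[str]                    # None when not known
    age_range: Tuple[int, int]               # inclusive bounds in years
    quality: int                             # one of the 3 quality ranks
    reduced_quality_reasons: List[QualityReason] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.quality not in (1, 2, 3):
            raise ValueError("quality must be 1, 2, or 3")
        if self.age_range[0] > self.age_range[1]:
            raise ValueError("age_range must be (low, high)")

    def interocular_distance(self) -> float:
        """Distance in pixels between the two annotated eye locations."""
        dx = self.right_eye[0] - self.left_eye[0]
        dy = self.right_eye[1] - self.left_eye[1]
        return hypot(dx, dy)


example = FaceAnnotation(
    bounding_box=(100, 80, 60, 60),
    left_eye=(115.0, 100.0),
    right_eye=(145.0, 100.0),
    nose=(130.0, 120.0),
    eye_state=EyeState.OPEN,
    gender=None,
    age_range=(6, 10),
    quality=2,
    reduced_quality_reasons=[QualityReason.BLUR],
)
print(example.interocular_distance())
```

Validating the quality rank and age range at construction time keeps malformed records out of later tallies; the example values are made up and do not come from any dataset.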
Quality of Face Presentation: Ranked 3-1
Reason for reduced Quality
Expression, Angle, Lighting, Occlusion, Resolution and Blur
Aging
CHEXIA data can span ages from infancy to adolescence for a single individual
Example of aging in an adult for reference only
Expression
Wide variations in expression from image to image greatly impact the
quality of the face presented.
Angle
CHEXIA data also include many images with combined yaw, pitch, and roll
Lighting
Reduced quality due to lighting can manifest as noise, low light, and extreme angles.
Occlusion
Occlusions from sunglasses, hats, hands, camera field of view (FOV), etc.
Resolution and Blur
Variations in blur and focus, and lower-resolution images
Summary
Testing algorithms against operationally relevant data is key
Results can help determine operational parameters in public venues
Light placement
Camera height, placement, and number of angles
Queue structure to increase high quality face presentations
Datasets must include large amounts of data for each confounder
Critical to know algorithm performance limits for angle, lighting, etc.
Critical to have industry addressing complex face presentations.
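The point about having enough data for each confounder can be checked with a simple tally over the annotations. The sketch below is illustrative only: the function names, the label strings, and the minimum-count threshold are assumptions, and the sample data is invented rather than drawn from FIVE or CHEXIA.

```python
from typing import Dict, Iterable, List

# Confounder labels taken from the "Reason for reduced Quality" list.
CONFOUNDERS = ("expression", "angle", "lighting", "occlusion", "resolution", "blur")


def confounder_counts(annotations: Iterable[Iterable[str]]) -> Dict[str, int]:
    """Count how many annotated faces cite each confounder."""
    counts = {name: 0 for name in CONFOUNDERS}
    for reasons in annotations:
        for reason in set(reasons):  # count a face at most once per confounder
            if reason not in counts:
                raise ValueError(f"unknown confounder: {reason}")
            counts[reason] += 1
    return counts


def under_represented(counts: Dict[str, int], minimum: int) -> List[str]:
    """Return confounders with fewer than `minimum` faces (hypothetical threshold)."""
    return [name for name in CONFOUNDERS if counts[name] < minimum]


# Invented sample: each inner list holds the reasons recorded for one face.
sample = [["blur"], ["angle", "lighting"], ["blur", "occlusion"], []]
counts = confounder_counts(sample)
print(counts)
print(under_represented(counts, minimum=2))
```

A tally like this makes gaps visible before evaluation; what counts as a large enough amount per confounder is left open here, since the presentation does not give a figure.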