Defense applications white paper

Intelligent Image and Video Search for Defense Applications

A piXlogic White Paper
Sponsored by Flex Analytics
Jul-2012

piXlogic: 4984 El Camino Real, Suite 205, Los Altos, CA 94022. T. 650-967-4067. [email protected] www.piXlogic.com

Flex Analytics (Government Reseller for piXlogic): 10314 Thornbush Lane, Bethesda, MD 20814. (301) 787-2989



Contents

Introduction
Problem Statement
Previous Options
The piXserve Solution
Key Features of piXserve
Security Applications
Implementation
Summary
About piXlogic

Introduction

Images and videos have always been key elements of intelligence and defense operations. In recent years, the scope and diversity of digital imagery have greatly increased in every area: ground, satellite, UAV, surveillance, broadcast, etc. The volume of material being acquired and stored is staggering, with no plateau in sight. Traditional methods of organizing, cataloguing, and distributing this material to analysts and the war-fighter are becoming impractical due to the scale involved. At the same time, timely access to the nuggets of vital information contained in images and videos is key to operational success. The ability to cross-correlate this information, whether it is obtained from live sources or from archived repositories, is more important than ever.

In this environment, image/video search and retrieval has become the new “must have” element of any comprehensive solution. Unfortunately, today’s image/video management systems are not well suited to help make sense of the data collected, and can only provide, at best, very limited search and retrieval capabilities.

Problem Statement

Most video management systems offer limited options for automating processes such as searching archived footage or generating alerts from live video. For the most part, these features are either not available or available only in a very limited sense. Often, a significant amount of manpower is required to carry out even simple search tasks. This is well known in the field. Correlating visual data from different sources is another very challenging task, mostly done manually today. Automated change detection is yet another largely elusive goal.

Industry and government efforts during the last few years have focused on building infrastructure. They have resulted in great improvements in the ability to acquire higher-resolution imagery and full-motion video, to move this material around the network efficiently, and to store it. These are great accomplishments, but by themselves they are not enough. Now is the time to leverage previous investments and provide a much-needed level of automation so that analysts can deal with the size and scale of the problems they face. However, for most solution providers, this remains a significant technical challenge.

Previous Options

When automated video analysis tools are available, they tend to be single-purpose, with limited scope of applicability and stringent operating requirements. Consider the following three examples:

Automated License Plate Recognition (ALPR): For most systems, the hurdle is knowing where the license plate is in the image being analyzed. To circumvent this problem, solution providers require either the use of specialized (infrared) cameras or that the cameras be placed such that the license plate to be recognized is generally in the same location in the image. Both of these requirements limit the scope of applications possible with such systems.

Face Recognition: Much as in the ALPR case, a big hurdle is knowing where the face to be measured is in the image. To solve this, typical systems require that the distance between the camera and the subject be within a predefined range. Lighting variations are also critical, which is why the more successful implementations are limited to indoor, entryway-type setups. Outdoor video in unconstrained environments presents a challenge that is outside the realm of most commercial solutions available today.

Object Detection: The ability to detect, recognize, or search for specific objects in a video or an image is not usually available. Some attempts have been made for video, but the methods used are overly simplistic and unreliable. A typical technique relies on “frame differencing” to separate moving things from a stationary background. The idea is simple, but unfortunately it only works in trivial situations. If the camera is moving, the background moves as well and frame differencing techniques won’t work. A light being turned off, a cloud passing in the sky, a moving shadow: all of these can yield undesired results. Even when the background and the camera are stationary, the amount of information obtained is limited. If the camera is calibrated, some guess about the size of the object can be made, and from this an inference can be derived about what is in the scene (perhaps an adult, maybe not a dog), but even this can be quite unreliable (is it a dog, a tumbleweed, or a far-away person?). Crowded environments present a critical challenge to today’s systems.
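The weakness of frame differencing described above can be illustrated with a minimal Python sketch. It is a toy example only: the 4x4 grayscale frames, the threshold of 30, and the function names are assumptions made for illustration, and the code shows the simple technique being criticized, not anything piXserve does.

```python
# Toy frame differencing on tiny grayscale frames (lists of rows, values 0-255).
# Illustrative only: frame sizes, pixel values and the threshold are assumptions.

def frame_difference(prev, curr, threshold=30):
    """Return a binary mask: 1 where a pixel changed by more than `threshold`."""
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev, curr)
    ]


def changed_fraction(mask):
    """Fraction of pixels flagged as 'moving'."""
    total = sum(len(row) for row in mask)
    return sum(map(sum, mask)) / total


# A static 4x4 background with uniform brightness 100.
background = [[100] * 4 for _ in range(4)]

# Case 1: a small bright object appears in one pixel.
with_object = [row[:] for row in background]
with_object[1][2] = 200

# Case 2: nothing moves, but the lights are turned off (every pixel darkens).
lights_off = [[40] * 4 for _ in range(4)]

object_mask = frame_difference(background, with_object)
lighting_mask = frame_difference(background, lights_off)

print(changed_fraction(object_mask))    # 0.0625: 1 of 16 pixels flagged
print(changed_fraction(lighting_mask))  # 1.0: whole frame flagged, no real motion
```

In the second case the entire frame is reported as "moving" even though no object entered the scene, which is the kind of undesired result the lighting examples refer to.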

The piXserve Solution

piXserve is a general-purpose image/video search and alerting solution. Breakthrough technology developed by piXlogic allows the software to automatically “see” the contents of an image or video frame, create a searchable index, and use this information so that users can search and create alerts in a very natural and logical way.

1. piXserve automatically “segments” an image in a way that discerns the individual objects in the image. It creates a mathematical description of the appearance of these objects "on the fly", and stores it as a searchable index in a database.
2. piXserve reasons about what it "saw" in the image and develops an initial level of "understanding" about content and context. Where it can, it automatically creates searchable "tags" for what it saw in the image (piXlogic calls these tags "Notions"). For example, it can detect the presence of things such as: sky, vegetation, flower, face, building, car, map, airplane, helicopter, etc.
3. piXserve uses all the information calculated from the image to make comparisons between a search image and previously indexed images/videos so that users can find results that most closely match what they are looking for.
4. piXserve can "see" not only visual objects but also text strings that may appear anywhere in the field of view of the image. This text is also indexed and made searchable. piXserve works with text from many languages (alphanumeric/Latin-character-based languages, Japanese, Korean, Chinese, etc.).
5. Depending on the quality of the imagery involved and the type of search being done, piXserve has been designed to achieve accuracies in excess of 85%.
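The general shape of the steps above (describe each object numerically, attach tags and text, then compare a query against the index) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not piXlogic's actual method: the three-number descriptors, the `IndexedObject` record, the file names, and the use of cosine similarity are all hypothetical stand-ins.

```python
import math
from dataclasses import dataclass, field


@dataclass
class IndexedObject:
    source: str        # image file or video frame the object came from
    descriptor: list   # hypothetical numeric description of the object's appearance
    notions: set = field(default_factory=set)  # automatic tags such as "car"
    text: str = ""     # any text string read from the field of view


def cosine_similarity(a, b):
    """Similarity of two descriptors: 1.0 means same direction, 0.0 unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(index, query_descriptor, top_k=2):
    """Rank indexed objects by similarity to the query, best match first."""
    scored = [(cosine_similarity(query_descriptor, obj.descriptor), obj) for obj in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


def find_text(index, needle):
    """Return sources whose recognized text contains `needle` (case-insensitive)."""
    return [obj.source for obj in index if needle.lower() in obj.text.lower()]


# Made-up index entries standing in for previously indexed images and frames.
index = [
    IndexedObject("frame_0012.jpg", [0.9, 0.1, 0.0], {"car"}, "ABC 123"),
    IndexedObject("frame_0340.jpg", [0.1, 0.9, 0.2], {"face"}),
    IndexedObject("photo_07.png", [0.8, 0.2, 0.1], {"car", "building"}),
]

results = search(index, [1.0, 0.0, 0.0])
for score, obj in results:
    print(f"{obj.source}: similarity {score:.2f}, notions {sorted(obj.notions)}")

print(find_text(index, "abc"))
```

The point of the sketch is that once every object has a stored description, a visual query, a tag query, and a text query all reduce to lookups against the same index.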

Key Features of piXserve

Automatic Indexing

Point piXserve to a repository of image/video files or to a live video feed, and it automatically indexes the content. No manual intervention or data entry is required.

Powerful Search

Through a web browser interface, users log in to piXserve, connect to available databases, and formulate search queries to retrieve desired images/video segments:

1. Use an arbitrary image to search for images/video segments that contain the same or similar items.
2. Use the mouse to point to an area of the query image to indicate which specific item(s) should be searched for.
3. Browse the contents of existing databases, grab a frame “on the fly” from a video that is playing, and use that frame to formulate a visual search query.
4. Search images and videos by object class ("Notion").
5. Type a text string to search for pictures/videos where that string appears in the field of view (a license plate, a street sign, a name tag, etc.).
6. Search for faces of specific individuals.
7. Perform not only simple but also complex multi-modal searches. (Example: find video sequences where something like the bag in this picture AND this face from this other picture AND this text string I just typed all appear in the field of view at the same time.) Use AND, OR, and NOT operators to combine up to 6 criteria in a single query.
8. Search by file name.
9. Search by keyword or other external metadata, if available.
10. Submit sample images of non-deformable objects of interest and automatically tag images/video frames when these items are visible.
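A multi-modal query of the kind described in item 7 can be modeled as a small tree of criteria joined by AND, OR, and NOT. The Python sketch below is an illustration under assumptions: the class names, the `frames` data, and exact-match comparison are hypothetical (a real visual search would score similarity rather than test set membership). Only the limit of 6 criteria comes from the text.

```python
from dataclasses import dataclass

MAX_CRITERIA = 6  # limit stated in the text


@dataclass(frozen=True)
class Criterion:
    kind: str   # "notion", "text", or "face" in this sketch
    value: str


@dataclass(frozen=True)
class And:
    parts: tuple


@dataclass(frozen=True)
class Or:
    parts: tuple


@dataclass(frozen=True)
class Not:
    part: object


def count_criteria(node):
    """Count the leaf criteria in a query tree."""
    if isinstance(node, Criterion):
        return 1
    if isinstance(node, Not):
        return count_criteria(node.part)
    return sum(count_criteria(p) for p in node.parts)


def matches(node, frame):
    """`frame` maps a criterion kind to the set of values found in that frame."""
    if isinstance(node, Criterion):
        return node.value in frame.get(node.kind, set())
    if isinstance(node, And):
        return all(matches(p, frame) for p in node.parts)
    if isinstance(node, Or):
        return any(matches(p, frame) for p in node.parts)
    if isinstance(node, Not):
        return not matches(node.part, frame)
    raise TypeError(f"unknown query node: {node!r}")


def run_query(query, frames):
    if count_criteria(query) > MAX_CRITERIA:
        raise ValueError("a query may combine at most 6 criteria")
    return [name for name, frame in frames.items() if matches(query, frame)]


# Made-up index contents: what was found in three video frames.
frames = {
    "clip1@00:12": {"notion": {"bag"}, "face": {"person_A"}, "text": {"GATE 4"}},
    "clip1@00:47": {"notion": {"bag", "car"}, "face": set(), "text": {"GATE 4"}},
    "clip2@03:05": {"notion": {"car"}, "face": {"person_A"}, "text": set()},
}

# "this bag AND this face AND this text string, all in view at the same time"
query = And((
    Criterion("notion", "bag"),
    Criterion("face", "person_A"),
    Criterion("text", "GATE 4"),
))
print(run_query(query, frames))  # ['clip1@00:12']
```

Evaluating every criterion against the same frame is what expresses "at the same time": a clip in which the bag and the face appear in different frames does not match.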

Powerful Automated Tagging

1. Automatically tag images/video frames with the name of recognized individuals that appear therein (automated face naming).
2. Suggest keywords to describe the contents of a picture/video frame (automated keyword recommendations).
3. Submit sample images of non-deformable objects of interest and automatically tag images/video frames when these items are visible (automated 2D-object detection and naming).

Powerful Alerts

Create alert criteria just as you would formulate a search query. piXserve-ALERT keeps track of what the piXserve machines on the network are indexing, and when a match is made that is consistent with what the user specified, it generates a signal. The user receives an e-mail with a link to the alert results. A JMS (Java Message Service) signal is also generated to pass the alert on to other systems and applications for further action.
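The alerting flow described above (standing criteria checked against each newly indexed frame, with a message emitted on a match) might be sketched as follows. This is a simplified Python stand-in under stated assumptions: the alert structure, the frame identifiers, and the in-memory queue are hypothetical, and the queue merely takes the place of the e-mail and JMS delivery that the product performs.

```python
import json
import queue


def make_alert(user, required_notions):
    """An alert fires when every required notion is present in one frame."""
    return {"user": user, "required": set(required_notions)}


def check_frame(alerts, frame_id, notions, outbox):
    """Compare one newly indexed frame against every alert criterion."""
    found = set(notions)
    for alert in alerts:
        if alert["required"] <= found:  # subset test: all required notions seen
            message = {
                "user": alert["user"],
                "frame": frame_id,
                "matched": sorted(alert["required"]),
            }
            outbox.put(json.dumps(message))


alerts = [
    make_alert("analyst1", ["helicopter"]),
    make_alert("analyst2", ["car", "face"]),
]
outbox = queue.Queue()  # stand-in for e-mail and JMS delivery

# Two frames arrive from the indexer (made-up identifiers and tags).
check_frame(alerts, "cam3/000151", ["sky", "helicopter"], outbox)
check_frame(alerts, "cam3/000152", ["car"], outbox)

# Drain the queue, as a downstream consumer would.
delivered = []
while not outbox.empty():
    delivered.append(json.loads(outbox.get()))
print(delivered)
```

Only the first frame triggers a message, for analyst1; the second frame contains a car but no face, so analyst2's two-part criterion is not satisfied.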

Powerful Metadata

The rich metadata that piXserve calculates for each image/video frame it processes (objects and tags) can be exploited to enable customized applications that are of high value in a variety of settings, such as:

1. Automatic change detection when videos taken at different times from different angles are compared.
2. Determining which portions of a video archive contain useful information, and which could be safely deleted to minimize storage requirements.
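Both applications can be approximated once per-frame object labels exist. The Python sketch below is illustrative only: it assumes each frame or capture has already been reduced to a set of labels (hypothetical data), treats a frame with no labels as "not useful", and ignores the harder problem of matching objects across different viewing angles.

```python
def object_changes(before, after):
    """Object-level change between two captures, each given as a set of labels."""
    return {
        "appeared": sorted(after - before),
        "disappeared": sorted(before - after),
    }


def empty_segments(frame_tags, min_length=3):
    """Return (start, end) frame ranges, end exclusive, where no tags were found.

    Runs shorter than `min_length` frames are ignored.
    """
    segments, start = [], None
    for i, tags in enumerate(frame_tags):
        if not tags and start is None:
            start = i  # an empty run begins
        elif tags and start is not None:
            if i - start >= min_length:
                segments.append((start, i))
            start = None  # the empty run ends
    if start is not None and len(frame_tags) - start >= min_length:
        segments.append((start, len(frame_tags)))
    return segments


# Change detection: the same location captured at two different times.
changes = object_changes({"building", "car"}, {"building", "truck"})
print(changes)  # {'appeared': ['truck'], 'disappeared': ['car']}

# Archive review: tags found in eight consecutive frames of a recording.
frame_tags = [{"car"}, set(), set(), set(), set(), {"face"}, set(), set()]
print(empty_segments(frame_tags))  # [(1, 5)]
```

Frames 1 through 4 form a run long enough to be flagged as a deletion candidate; the two empty frames at the end fall below the minimum length and are left alone.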

Scalable Architecture

piXserve is a multi-threaded, scalable J2EE application that is suitable for the most demanding implementations.

Web Services API

A REST-based API package is available to support integrations with third-party applications and workflow environments.

Security Applications

If you are concerned with the cost, speed, and accuracy of your video investigative work, whether it be forensic in nature or dealing with live situations, then you should consider piXserve as a “must-have” add-on to your current system.

Conventional systems focus on managing and manipulating cameras and storage devices. Unfortunately, they provide only limited capabilities for searching the captured video: time, date, motion, and transaction trigger are among the more common options available. While useful, these features alone are inadequate to support a productive workflow, and significant manpower is required even for the simpler tasks. Common situations involve several operators having to stare at a bank of monitors for hours on end in order to catch an event of interest, or having to wade through hundreds of hours of video from many cameras looking for a specific event or trying to correlate separate ones. These situations are labor intensive, error prone, and do not scale well.

piXserve extends the capabilities of today’s systems by adding the ability to automatically analyze the video that is being collected and stored. These video streams can be intercepted by piXserve and analyzed for alerting purposes. Similarly, recorded video can be analyzed, searched, and correlated using piXserve. The analytical capabilities in piXserve support facial recognition, general-purpose object detection and recognition, text recognition, license plate recognition, automatic tagging, and more. All the indexing work is done automatically, server side, in the background. Users are then free to create visually-based search criteria and navigate the body of accumulated material. They can do all of this “on the fly”, as they see fit at the moment, based on whatever problem or situation they are dealing with.

The piXserve search environment is intuitive and productive, and the user interface is a web browser (Internet Explorer, Mozilla Firefox, Safari, Google Chrome, or equivalent). Users can drag-and-drop a picture from anywhere to formulate a similarity search query, or pause a video while it’s playing and use that frame to create new search criteria or refine existing ones. This latter capability greatly simplifies the discovery process precisely in those situations when the user isn’t quite sure what they are looking for and is working in an investigative or exploratory mode.


Implementation

piXserve can process videos in a variety of formats (MPEG-1, MPEG-2, MPEG-4, H.263, H.264, etc.). piXserve can also process still images in over 90 different formats (jpeg, tiff, png, bmp, psd, etc.). piXserve can index both archived video and live video broadcast from multicast IP cameras. piXserve and piXserve-ALERT run on standard 2-CPU rack servers (multi-core Intel Xeon processors or equivalent), in a Windows Server 2003 or 2008 environment. Customers typically choose Dell or HP hardware for implementation. piXserve is available in both 32-bit and 64-bit versions.

In order to index archived video, piXserve requires that the storage device be accessible via a network share (Linux/Unix/Windows). Further, the stored video should not be in a proprietary, non-standard format.

A single server can process large amounts of archived material, or live video from multiple feeds/sources. The higher the number of cores on the server, the higher the number of hours of video per day that can be processed by a single machine. piXserve implementations can range in size from as little as a single server to scalable multi-server and distributed configurations. The architecture of the product is such that as the needs of the customer grow, hardware can be added to parallelize throughput and serve growing needs.
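For rough sizing, the relationship between cores and daily throughput can be written as simple arithmetic. The Python sketch below is a back-of-the-envelope illustration, not vendor guidance: it assumes throughput scales linearly with cores, and the rate of 0.5 hours of video indexed per core-hour is a made-up placeholder that would need to be measured for the actual imagery and hardware.

```python
import math


def video_hours_per_day(cores, hours_indexed_per_core_hour):
    """Hours of video one server can index per day, assuming linear scaling."""
    return cores * 24 * hours_indexed_per_core_hour


def servers_needed(live_feeds, cores_per_server, hours_indexed_per_core_hour):
    """Servers required to keep up with continuous live feeds."""
    demand = live_feeds * 24  # each live feed produces 24 hours of video per day
    capacity = video_hours_per_day(cores_per_server, hours_indexed_per_core_hour)
    return math.ceil(demand / capacity)


# Placeholder figures: an 8-core server and a hypothetical indexing rate of 0.5.
print(video_hours_per_day(8, 0.5))  # 96.0 hours of video per day
print(servers_needed(10, 8, 0.5))   # 3 servers for 10 continuous feeds
```

With these placeholder numbers, ten continuous feeds generate 240 hours of video per day against 96 hours of capacity per server, so three servers are required.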

The metadata created by piXserve is stored in an RDBMS (Oracle and MS-SQL are supported; PostgreSQL is bundled with piXserve). The data and the piXserve output can be integrated and correlated with data from other systems that the customer may be using. The alerting functionality is provided by piXserve-ALERT. A single instance of piXserve-ALERT can serve many users and monitor potentially thousands of alert criteria. Here too, scaling is achieved by adding ALERT servers. In configurations where several hundred or several thousand individuals will be searching piXserve-generated data, the use of piXserve-Enterprise Edition is recommended.

Summary

Images and videos are a critical element of defense and intelligence operations. It is very difficult to deal with an ever-growing amount of captured video without automation. The alternatives to automation are expensive, time consuming, and prone to errors. At the same time, there is a lack of suitable tools to provide a meaningful level of real-world automation.

piXserve provides an unparalleled level of image analysis and understanding. In a single tool it provides capabilities that span object detection and recognition, face recognition, license plate recognition, text recognition, automatic tagging, and more. In each of these areas, piXserve redefines the state of the art and can help you meet the efficiency and effectiveness goals that you have set for yourself.

About piXlogic

piXlogic is a privately held company located in Los Altos, CA, USA, in the heart of Silicon Valley. piXlogic is a portfolio company of In-Q-Tel, a venture capital organization that serves the needs of the US Intelligence Community. The company’s flagship products are piXserve and piXserve-ALERT. The software enables:

Content Discovery (find pictures/videos that contain specific objects, scenes, text, or people of interest)

Content Auto-tagging (automatically label an image/video)

Content Alerting (automatically inform users when items of interest appear in a live video stream or web crawl)

Content Change Detection (automatically compare images and video segments to detect changes at the object level)

piXlogic serves the needs of government and industrial customers. piXlogic sells its products directly and through a network of resellers in the US, the UK, Japan, Australia, Argentina, Israel, and Italy.

Corporate: piXlogic, Inc., 4984 El Camino Real, Suite 205, Los Altos, CA 94022
T. +1-650-967-4067 E. [email protected] W. www.piXlogic.com

Flex Analytics is a systems integrator and software reseller in the U.S. Intelligence Community. It supports the sale, implementation, and customization of piXserve in government installations.

Government Sales: Flex Analytics LLC, 10314 Thornbush Ln, Bethesda, MD 20814
+1-301-787-2989 [email protected] www.flexanalytics.com