PROTECT — A Deployed Game-Theoretic System for Strategic Security Allocation for the United States Coast Guard

Bo An, Eric Shieh, Rong Yang, Milind Tambe, Craig Baldwin, Joseph DiRenzo, Ben Maule, Garrett Meyer

Copyright © 2012, Association for the Advancement of Artificial Intelligence. All rights reserved. ISSN 0738-4602

While three deployed applications of game theory for security have recently been reported, we as a community of agents and AI researchers remain in the early stages of these deployments; there is a continuing need to understand the core principles for innovative security applications of game theory. Toward that end, this article presents PROTECT, a game-theoretic system deployed by the United States Coast Guard (USCG) in the Port of Boston for scheduling its patrols. USCG has termed the deployment of PROTECT in Boston a success; PROTECT is currently being tested in the Port of New York, with the potential for nationwide deployment.

PROTECT is premised on an attacker-defender Stackelberg game model and offers five key innovations. First, this system is a departure from the assumption of perfect adversary rationality noted in previous work, relying instead on a quantal response (QR) model of the adversary's behavior — to the best of our knowledge, this is the first real-world deployment of the QR model. Second, to improve PROTECT's efficiency, we generate a compact representation of the defender's strategy space, exploiting equivalence and dominance. Third, we show how to practically model a real maritime patrolling problem as a Stackelberg game. Fourth, our experimental results illustrate that PROTECT's QR model more robustly handles real-world uncertainties than a perfect rationality model. Finally, in evaluating PROTECT, this article for the first time provides real-world data: a comparison of human-generated versus PROTECT security schedules, and results from an Adversarial Perspective Team's (human mock attackers) analysis.

The last several years have witnessed the successful application of game theory in allocating limited resources to protect critical infrastructures. The real-world deployed systems include ARMOR, used by the Los Angeles International Airport (Jain et al. 2010) to randomize checkpoints of roadways and canine patrols; IRIS, which helps the U.S. Federal Air Marshal Service (Jain et al. 2010) in scheduling air marshals on international flights; and GUARDS (Pita et al. 2011), which is under evaluation by the U.S. Transportation Security Administration to allocate the resources available for airport protection. Yet we as a community of agents and AI researchers remain in the early stages of these deployments, and must continue to develop core principles of applying game theory for security.

To this end, this article presents a new game-theoretic security application to aid the United States Coast Guard (USCG), called Port Resilience Operational/Tactical Enforcement to Combat Terrorism (PROTECT). The USCG's mission includes maritime security of the U.S. coasts, ports, and inland waterways, a security domain that faces increased risks in the context of threats such as terrorism and drug trafficking. Given a particular port and the variety of critical infrastructure that an adversary may attack within the port, USCG conducts patrols to protect this infrastructure; however, while the adversary has the opportunity to observe patrol patterns, limited security resources imply that USCG patrols cannot be at every location 24 hours a day, 7 days a week.


To assist the USCG in allocating its patrolling resources, similar to previous applications (Jain et al. 2010; Pita et al. 2011), PROTECT uses an attacker-defender Stackelberg game framework, with USCG as the defender against terrorist adversaries that conduct surveillance before potentially launching an attack. PROTECT's solution is to provide a mixed strategy, that is, randomized patrol patterns taking into account the importance of different targets, and the adversary's surveillance and anticipated reaction to USCG patrols.

While PROTECT builds on previous work, this article highlights five key innovations. The first and most important is PROTECT's departure from the assumption of perfect rationality (which was embedded in previous applications) on the part of the human adversaries. The assumption of perfect rationality is well recognized as a limitation of classical game theory, and a number of approaches have been proposed to model bounded rationality in behavioral game-theoretic approaches (Camerer 2003). Within this behavioral framework, quantal response equilibrium has emerged as a promising approach to model human bounded rationality (Camerer 2003; McKelvey and Palfrey 1995; Wright and Leyton-Brown 2010), including recent results illustrating the benefits of the quantal response (QR) model in security games contexts (Yang et al. 2011). Therefore, PROTECT uses a novel algorithm called PASAQ (Yang, Tambe, and Ordóñez 2012) based on the QR model of a human adversary. To the best of our knowledge, this is the first time that the QR model has been used in a real-world security application.

Second, PROTECT improves PASAQ's efficiency through a compact representation of defender strategies, exploiting dominance and equivalence analysis. Experimental results show the significant benefits of this compact representation. Third, PROTECT addresses practical concerns (such as grouping patrol points into areas) of modeling a real-world maritime patrolling application in a Stackelberg framework. Fourth, this article presents a detailed simulation analysis of PROTECT's robustness to uncertainty that may arise in the real world. For various cases of added uncertainty, the article shows that PROTECT's quantal-response-based approach leads to significantly improved robustness when compared to an approach that assumes full attacker rationality.

PROTECT has been in use at the Port of Boston since April 2011 and has been evaluated by the USCG. This evaluation brings forth our final key contribution: for the first time, this article provides real-world data comparing human-generated and game-theoretic schedules. We also provide results from an Adversarial Perspective Team's (APT) analysis and comparison of patrols before and after the use of the PROTECT system from the viewpoint of an attacker. Given the success of PROTECT in Boston, PROTECT is currently being tested in the Port of New York (figure 1), and based on the outcome there, it may potentially be extended to other ports in the United States.

Figure 1. USCG Boats Patrolling the Ports of Boston and New York.

USCG and PROTECT's Goals

The USCG continues to face challenges from potential terrorists within the maritime environment, which includes both the Maritime Global Commons and the ports and waterways that make up the United States Maritime Transportation System. The former Director of National Intelligence, Dennis Blair, noted in 2010 a persistent threat "from al-Qa'ida and potentially others who share its anti-Western ideology. A major terrorist attack may emanate from either outside or inside the United States" (Blair 2010). This threat was reinforced in May of 2011 following the raid on Osama Bin Laden's home, where a large trove of material was uncovered, including plans to attack an oil tanker. "There is an indication of intent, with operatives seeking the size and construction of tankers, and concluding it's best to blow them up from the inside because of the strength of their hulls" (Dozier 2011). These oil tankers transit the U.S. Maritime Transportation System. The USCG plays a key role in the security of this system and the protection of seaports to support the economy, environment, and way of life in the United States (Young and Orchard 2011).

Coupled with challenging economic times, USCG must operate as effectively as possible, achieving maximum benefit from every hour spent on patrol. To that end, the goal of PROTECT is to use game theory to assist the USCG in maximizing its effectiveness in the Ports, Waterways, and Coastal Security (PWCS) Mission. The PROTECT system, focused on the PWCS patrols, addresses how the USCG should optimally patrol critical infrastructure in a port to maximize protection, knowing that the adversary may conduct surveillance and then launch an attack.

PWCS patrols are focused on protecting critical infrastructure; without the resources to provide 100 percent on-scene presence at any, let alone all, of the critical infrastructure, optimized use of security resources is critical. Toward that end, unpredictability creates situations of uncertainty for an enemy and can be enough to deem a target less appealing. While randomizing patrol patterns is key, PROTECT also addresses the fact that the targets are of unequal value, understanding that the adversary will adapt to whatever patrol patterns USCG conducts. The output of PROTECT is a schedule of patrols that includes when the patrols are to begin, what critical infrastructure to visit for each patrol, and what activities to perform at each critical infrastructure. While initially pilot tested in the Port of Boston, the solution technique is intended to be generalizable and applicable to other ports.

Key Innovations in PROTECT

Stackelberg games have been well established in the literature (Conitzer and Sandholm 2006; Korzhyk, Conitzer, and Parr 2011; Fudenberg and Tirole 1991), and PROTECT models the PWCS patrol problem as a Stackelberg game with USCG as the leader (defender) and the terrorist adversaries in the role of the follower. The choice of this framework was also supported by prior successful applications of Stackelberg games (Jain et al. 2010; Paruchuri et al. 2008; Pita et al. 2011). (For those unfamiliar with Stackelberg games, the sidebar provides an introduction.)

In this Stackelberg game framework, the defender commits to a mixed (randomized) strategy of patrols, which is known to the attacker. This is a reasonable approximation of the practice, since the attacker conducts surveillance to learn the mixed strategies that the defender carries out, and responds with a pure strategy of an attack on a target. The optimization objective is to find the optimal mixed strategy for the defender.

In the rest of this section, we begin by discussing how to practically cast this real-world maritime patrolling problem of PWCS patrols as a Stackelberg game. We also show how to reduce the number of defender strategies before addressing the most important innovation in PROTECT: its use of the quantal response model.

Game Modeling

To model the USCG patrolling domain as a Stackelberg game, we need to define the set of attacker strategies, the set of defender strategies, and the payoff function. These strategies and payoffs center on the targets in a port — ports, such as the Port of Boston, have a significant number of potential targets (critical infrastructure). As discussed above, our Stackelberg game formulation assumes that the attacker learns the defender's strategy through conducting surveillance, and can then launch an attack. Thus, the attacks an attacker can launch on different possible targets are considered as his pure strategies.


Stackelberg Game

A generic Stackelberg game has two players, a leader and a follower (Fudenberg and Tirole 1991). A leader commits to a strategy first, and then a follower optimizes her reward, considering the action chosen by the leader (von Stengel and Zamir 2004). The two players in a Stackelberg game need not represent individuals, but could also be groups that cooperate to execute a joint strategy, such as a police force or a terrorist organization. Each player has a set of possible pure strategies, or the actions that they can execute. A mixed strategy allows a player to play a probability distribution over pure strategies. Payoffs for each player are defined over all possible pure-strategy outcomes for both the players. The payoff functions are extended to mixed strategies by taking the expectation over pure-strategy outcomes. The follower can observe the leader's strategy, and then act in a way to optimize his own payoffs.

To see the advantage of being the leader in a Stackelberg game, consider the game with the payoffs shown in the sidebar table. The leader is the row player and the follower is the column player. The only pure-strategy Nash equilibrium of the simultaneous-move version of this game is for the leader to play a and the follower to play c, which gives the leader a payoff of 3; in fact, for the leader, playing b is strictly dominated.

However, if the leader can commit to playing b before the follower chooses his strategy, then the leader will obtain a payoff of 4, since the follower would then play d to ensure a higher payoff for himself. If the leader commits to a mixed strategy of playing a and b with equal (0.5) probability, then the follower will still play d, leading to an even higher expected payoff of 4.5 for the leader. As this example shows, the equilibrium strategy in the Stackelberg game can in fact differ from the Nash equilibrium.

Stackelberg games are used to model the attacker-defender strategic interaction in security domains, and this class of Stackelberg games (with certain restrictions on payoffs [Yin et al. 2010]) is called Stackelberg security games. In the Stackelberg security game framework, the security force (defender) is modeled as the leader and the terrorist adversary (attacker) is in the role of the follower. The defender commits to a mixed (randomized) strategy, whereas the attacker conducts surveillance of these mixed strategies and responds with a pure strategy of an attack on a target. Thus, the Stackelberg game framework is a natural approximation of real-world security scenarios. In contrast, the surveillance activity of the attacker cannot be modeled in simultaneous-move games with the Nash equilibrium solution concept. The objective is to find the optimal mixed strategy for the defender. See Tambe (2011) for a more detailed introduction to research on Stackelberg security games.

Payoff Table for Example Stackelberg Game.

        c       d
  a     3, 1    5, 0
  b     2, 0    4, 2
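To make the sidebar arithmetic concrete, here is a minimal Python sketch (ours, not part of PROTECT) that computes the follower's best response and the resulting leader payoff for any committed mixed strategy over the table above:

```python
# Payoffs from the sidebar table: (leader, follower) for each (row, column).
payoffs = {
    ("a", "c"): (3, 1), ("a", "d"): (5, 0),
    ("b", "c"): (2, 0), ("b", "d"): (4, 2),
}

def follower_best_response(p_a):
    """Follower's best column given the leader plays 'a' with probability p_a."""
    eu = {col: p_a * payoffs[("a", col)][1] + (1 - p_a) * payoffs[("b", col)][1]
          for col in ("c", "d")}
    return max(eu, key=eu.get)

def leader_expected_payoff(p_a):
    col = follower_best_response(p_a)
    return p_a * payoffs[("a", col)][0] + (1 - p_a) * payoffs[("b", col)][0]

print(leader_expected_payoff(1.0))  # pure a -> follower plays c -> leader gets 3
print(leader_expected_payoff(0.0))  # commit to b -> follower plays d -> leader gets 4
print(leader_expected_payoff(0.5))  # 50/50 mix -> follower plays d -> leader gets 4.5
```

Running it confirms the three cases discussed above: leader payoffs of 3, 4, and 4.5.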

However, the definition of defender strategies is not as straightforward. Patrols last for some fixed duration during the day as specified by USCG, for example, 4 hours. Our first attempt was to model each target as a node in a graph and allow patrol paths to go from each individual target to (almost all) other targets in the port, generating an almost complete graph on the targets. This method yields the most flexible set of patrol routes that would fit within the maximum duration, covering any permutation of targets within a single patrol. This method unfortunately faced significant challenges: it required determining the travel time of a patrol boat for each pair of targets, a daunting knowledge acquisition task given the hundreds of pairs of targets; it did not maximize the use of port geography, whereby boat crews could observe multiple targets at once; and it was perceived as micromanaging the activities of the USCG boat crews, which was undesirable.

Our improved approach to generating defender strategies therefore grouped nearby targets into patrol areas (in real-world scenarios such as the Port of Boston, some targets are very close to each other, and it is thus natural to group targets together according to their geographic locations). The presence of patrol areas led the USCG to redefine the set of defensive activities to be performed on patrol areas, providing a more accurate and expressive model of the patrols. Activities that take a longer time provide the defender a higher payoff compared to activities that take a shorter time to complete. This affects the final patrol schedule, as one patrol may visit fewer areas but conduct longer-duration defensive activities at those areas, while another patrol may cover more areas with shorter-duration activities.

To generate all the permutations of patrol schedules, a graph G = (V, E) is created with the patrol areas as vertices V and adjacent patrol areas as edges E. Using the graph of patrol areas, PROTECT generates all possible patrol schedules, each of which is a closed walk of G that starts and ends at the patrol area b ∈ V, the base patrol area for the USCG. Each patrol schedule is a sequence of patrol areas and associated defensive activities at each patrol area in the sequence, and schedules are constrained by a maximum patrol time. (Note that even when the defender just passes by a patrol area, this is treated as an activity.) The defender may visit a patrol area multiple times in a schedule due to geographic constraints and the fact that each patrol is a closed walk. For instance, the defender in each patrol should visit the base patrol area at least twice, since she needs to start the patrol from the base and finally come back to the base to finish the patrol.

The graph G, along with the base patrol area b and the maximum patrol time, is used to generate the defender strategies (patrol schedules). Given each patrol schedule, the total patrol schedule time is calculated (this also includes traversal time between areas, but we ignore it in the following for expository purposes); we then verify that the total time is less than or equal to the maximum patrol time. After generating all possible patrol schedules, a game is formed where the set of defender strategies is composed of patrol schedules and the set of attacker strategies is the set of targets. The attacker's strategy was based on targets instead of patrol areas because an attacker will choose to attack a single target.
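As an illustration of this schedule-generation step, the following sketch enumerates closed walks from the base under a patrol-time budget. The graph, activity names, and durations are invented for illustration, and traversal time is ignored, as in the exposition above:

```python
# Hypothetical patrol-area graph: adjacency list over area indices.
adj = {1: [2, 3], 2: [1, 3], 3: [1, 2]}
activity_time = {"k1": 10, "k2": 25}   # minutes per defensive activity (assumed)
base, max_time = 1, 60                 # base patrol area and patrol time bound

def closed_walks(area, elapsed, walk):
    """Enumerate (area, activity) sequences that start and end at the base."""
    for act, t in activity_time.items():
        if elapsed + t > max_time:
            continue
        step = walk + [(area, act)]
        if area == base and len(step) > 1:
            yield step                  # a complete patrol schedule (closed walk)
        for nxt in adj[area]:
            yield from closed_walks(nxt, elapsed + t, step)

schedules = list(closed_walks(base, 0, []))
print(len(schedules), schedules[0])
```

Because elapsed time strictly increases with each activity, the recursion terminates once the budget is exhausted; the deployed system additionally accounts for traversal times between areas.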

Table 1 gives an example, where the rows correspond to the defender's strategies and the columns correspond to the attacker's strategies. In this example, there are two possible defensive activities, k1 and k2, where k2 provides more effective protection (and also takes more time) for the defender than k1. Suppose that the time bound disallows more than two k2 activities (given the time required for k2) within a patrol. Patrol area 1 has two targets (targets 1 and 2) while patrol areas 2 and 3 each have one target (targets 3 and 4, respectively). In the table, a patrol schedule is composed of a sequence of patrol areas and a defensive activity in each area. The patrol schedules are ordered, so that the first patrol area in the schedule denotes which patrol area the defender needs to visit first. In this example, patrol area 1 is the base patrol area, and all of the patrol schedules begin and end at patrol area 1. For example, the patrol schedule in row 2 first visits patrol area 1 with activity k2, then travels to patrol area 2 with activity k1, and returns back to patrol area 1 with activity k1.

For the payoffs, if a target i is the attacker's choice and the attack fails, then the defender gains a reward R^d_i while the attacker receives a penalty P^a_i; otherwise the defender receives a penalty P^d_i and the attacker gains a reward R^a_i. Furthermore, let G^d_{ij} be the payoff for the defender if the defender chooses patrol j and the attacker chooses to attack target i. G^d_{ij} can be represented as a linear combination of the defender reward/penalty on target i and A_{ij}, the effectiveness probability of the defensive activity performed on target i for patrol j, as described by equation 1. A_{ij} depends on the most effective activity on target i in patrol j; the value of A_{ij} is 0 if target i is not in patrol j. If patrol j only includes one activity in a patrol area that covers target i, then we determine its payoff using the following equation (any additional activity may provide an additional incremental benefit in that area, as we discuss in the next section):

G^d_{ij} = A_{ij} R^d_i + (1 − A_{ij}) P^d_i    (1)

For instance, suppose target 1 is covered using k1 in strategy 5, and the value of A_{15} is 0.5. If R^d_1 = 150 and P^d_1 = −50, then G^d_{15} = 0.5(150) + (1 − 0.5)(−50) = 50. G^a_{ij} would be computed in a similar fashion.
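As a quick check of equation 1, a one-line helper (ours, not USCG code) reproduces the worked example:

```python
def defender_payoff(A_ij, reward, penalty):
    """Equation 1: G^d_ij = A_ij * R^d_i + (1 - A_ij) * P^d_i."""
    return A_ij * reward + (1 - A_ij) * penalty

print(defender_payoff(0.5, 150, -50))  # 50.0, matching the example above
```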

In the USCG problem, rewards and penalties are based on an analysis completed by a contracted company of risk analysts that looked at the targets in the Port of Boston and assigned corresponding values for each one. The types of factors taken into consideration for generating these values include economic damage and injury/loss of life. Meanwhile, the effectiveness probabilities, A_{ij}, for different defensive activities are decided based on the duration of the activities: longer activities lead to a higher probability of capturing the attackers.

While loss of life and property helps in assessing damage in case of a successful attack, assessing payoffs requires that we determine whether the loss is viewed symmetrically by the defender and attacker. The same question of symmetry arises for the scenario in which there is a failed attack. These questions go to the heart of determining whether security games should be modeled as zero-sum games (Tambe 2011).

Past work in security games (for example, ARMOR [Jain et al. 2010], IRIS [Jain et al. 2010], and GUARDS [Pita et al. 2011]) has used nonzero-sum game models; for example, one assumption made is that the attacker might view publicity of a failed attack as a positive outcome. However, nonzero-sum games require further knowledge acquisition efforts to model the asymmetry in payoffs. For simplicity, as a first step, PROTECT starts with the assumption of a zero-sum game. However, the algorithm used in PROTECT is not restricted to zero-sum games, and in the future USCG proposes to relax this assumption. It is also important to note that while table 1 shows point estimates of payoffs, we recognize that estimates may not be accurate. To that end, in the experimental results section, we evaluate the robustness of our approach when there is payoff noise, observation noise, and execution error.


Patrol Schedule                          Target 1   Target 2   Target 3   Target 4
(1: k1), (2: k1), (1: k1)                50, –50    …          15, –15    …
(1: k2), (2: k1), (1: k1)                …          60, –60    …          …
(1: k1), (2: k1), (1: k2)                …          …          …          –20, 20
(1: k1), (3: k1), (2: k1), (1: k1)       50, –50    …          …          …
(1: k1), (2: k1), (3: k1), (1: k1)       …          …          15, –15    …

Table 1. Portion of a Simplified Example of a Game Matrix.


Compact Representation

In our game, the number of defender strategies, that is, patrol schedules, grows combinatorially, generating a scale-up challenge. To achieve scale-up, PROTECT uses a compact representation of the patrol schedules based on two ideas: combining equivalent patrol schedules and removing dominated patrol schedules.

With respect to equivalence, different permutations of patrol schedules provide identical payoff results. Furthermore, if an area is visited multiple times with different activities in a schedule, we only consider the activity that provides the defender the highest payoff, not the incremental benefit due to additional activities. This decision is made in consideration of the trade-off between modeling accuracy and efficiency. On the one hand, the additional value of more activities is small: the patrol time of each schedule is relatively short (for example, 1 hour), and the defender may visit a patrol area more than once within this short period, conducting an activity each time. For instance, the defender may pass by a patrol area 10 minutes after conducting a more effective activity at the same patrol area; the additional value of the pass-by activity given the more effective activity is therefore very small. On the other hand, considering only the most effective activity in each patrol leads to significant computational benefits, which are described in this section.

Therefore, many patrol schedules are equivalent if the sets of patrol areas visited and the most effective defensive activities in each patrol area are the same, even if their order differs. Such equivalent patrol schedules are combined into a single compact defender strategy, represented as a set of patrol areas and defensive activities (minus any ordering information). The idea of combining equivalent actions is similar to action abstraction for solving large-scale dynamic games (Gilpin 2009). Table 2 presents a compact version of table 1, which shows how the game matrix is simplified by using equivalence to form compact defender strategies; for example, the patrol schedules in rows 2 and 3 of table 1 are represented as compact strategy 2 = {(1: k2), (2: k1)} in table 2.

Next, the idea of dominance is illustrated using table 2, noting that the difference between compact strategies 1 and 2 is the defensive activity on patrol area 1. Since activity k2 gives the defender a higher payoff than k1, strategy 1 can be removed from the set of defender strategies, because strategy 2 covers the same patrol areas while giving a higher payoff for patrol area 1. To generate the set of compact defender strategies, a naive approach would be first to generate the full set of patrol schedules and then prune the dominated and equivalent schedules (a code sketch of this reduction follows table 2). Instead, PROTECT generates compact strategies directly, in the following manner: we start with patrols that visit the most patrol areas with the least effective activities within the patrol time limit; these activities take a shorter amount of time, so we can cover more areas within the given time limit. Then we gradually consider patrols visiting fewer patrol areas but with increasingly effective activities. This process stops when we have considered all patrols in which all patrol areas are covered with the most effective activities and no additional patrol area can be included.

Figure 2 shows a high-level view of the steps of the algorithm using the compact representation. The compact strategies are used instead of full patrol schedules to generate the game matrix. Once the optimal probability distribution is calculated for the compact strategies, the strategies with a probability greater than 0 are expanded to a complete set of patrol schedules.

In this expansion from a compact strategy to a full set of patrol schedules, we need to determine the probability of choosing each patrol schedule, since a compact strategy may correspond to multiple patrol schedules. The focus here is to increase the difficulty for the attacker to conduct surveillance by increasing unpredictability (see note 1), which we achieve by randomizing uniformly over all expansions of the compact defender strategies. The uniform distribution provides the maximum entropy (greatest unpredictability). Thus, all the patrol schedules generated from a single compact strategy i are assigned a probability of v_i / w_i, where v_i is the probability of choosing compact strategy i and w_i is the total number of expanded patrol schedules for i. The complete set of patrol schedules and the associated probabilities are then sampled and provided to the USCG, along with the start time of the patrol, generated through uniform random sampling.

Compact Strategy                      Target 1    Target 2   Target 3   Target 4
1 = {(1: k1), (2: k1)}                50, –50     30, –30    15, –15    –20, 20
2 = {(1: k2), (2: k1)}                100, –100   60, –60    15, –15    –20, 20
3 = {(1: k1), (2: k1), (3: k1)}       50, –50     30, –30    15, –15    10, –10

Table 2. Example Compact Strategies and Game Matrix.
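To make the equivalence and dominance ideas concrete, here is a minimal Python sketch (ours, with k2 simply assumed more effective than k1) that canonicalizes schedules into compact strategies and prunes dominated ones, reproducing the merging of rows 2 and 3 of table 1 and the removal of compact strategy 1:

```python
# A schedule is a sequence of (area, activity) pairs; rank activities by
# effectiveness (k2 is assumed more effective than k1).
rank = {"k1": 1, "k2": 2}

def compact(schedule):
    """Equivalence: keep only the most effective activity seen in each area,
    and drop all ordering information."""
    best = {}
    for area, act in schedule:
        if area not in best or rank[act] > rank[best[area]]:
            best[area] = act
    return frozenset(best.items())

def dominates(s, t):
    """s dominates t if s covers every area of t with an activity at least
    as effective, and the two strategies are not identical."""
    s_map = dict(s)
    return s != t and all(
        a in s_map and rank[s_map[a]] >= rank[act] for a, act in dict(t).items())

schedules = [
    [(1, "k1"), (2, "k1"), (1, "k1")],
    [(1, "k2"), (2, "k1"), (1, "k1")],
    [(1, "k1"), (2, "k1"), (1, "k2")],   # rows 2 and 3 merge into one strategy
]
strategies = {compact(s) for s in schedules}
strategies = {s for s in strategies if not any(dominates(t, s) for t in strategies)}
print(strategies)   # only {(1, k2), (2, k1)} survives; it dominates {(1, k1), (2, k1)}
```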


Human Adversary Modeling

While previous game-theoretic security applications have assumed a perfectly rational attacker, PROTECT takes a step forward by addressing this limitation of classical game theory. Instead, PROTECT uses a model of a boundedly rational adversary, the quantal response (QR) model, which has been shown to be a promising model of human decision making (McKelvey and Palfrey 1995; Rogers, Palfrey, and Camerer 2009; Yang et al. 2011). A recent study demonstrated the use of QR as an effective prediction model of humans (Wright and Leyton-Brown 2010). An even more relevant study of the QR model was conducted by Yang et al. (2011) in the context of security games, where the model was shown to outperform competitors in modeling human subjects. Based on this evidence, PROTECT uses a QR model of a human adversary; that is, in the Stackelberg game model, the attacker best-responds according to a QR model, and the defender computes her optimal mixed patrolling strategy with this knowledge.

To apply the QR model in a Stackelberg framework, PROTECT employs an algorithm known as PASAQ (Yang, Tambe, and Ordóñez 2012). PASAQ computes the optimal defender strategy (within a guaranteed error bound) given a QR model of the adversary by solving the following nonlinear and nonconvex optimization problem P, with table 3 listing the notation:

P:   max_a   Σ_{i=1}^{T} QR(i | λ) ((R^d_i − P^d_i) x_i + P^d_i)

     subject to   x_i = Σ_{j=1}^{J} a_j A_{ij},  ∀i
                  Σ_{j=1}^{J} a_j = 1
                  0 ≤ a_j ≤ 1,  ∀j

The first line of problem P corresponds to the computation of the defender's expected utility, resulting from a combination of equation 1 and equation 2 (in the sidebar). QR(i|λ) is the probability that an attacker using the QR model will attack target i. Unlike in previous applications (Jain et al. 2010; Kiekintveld, Marecki, and Tambe 2011; Paruchuri et al. 2008), x_i in this case summarizes not just presence or absence on a target, but also the effectiveness probability A_{ij} on the target. That is, the second line computes the marginal coverage on the targets based on the effectiveness factor A_{ij} and the probability a_j of choosing compact strategy j.

As with all QR models, a value for λ is needed to represent the noise in the attacker's strategy. Clearly, a λ value of 0 (uniformly random) or ∞ (fully rational) is not reasonable. Given the payoff data for Boston, an attacker's strategy with λ = 4 starts approaching that of a fully rational attacker — the probability of attack focuses on a single target. In addition, an attacker's strategy with λ = 0.5 is similar to a fully random strategy that uniformly chooses a target to attack.

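PASAQ itself is a tailored approximation algorithm (Yang, Tambe, and Ordóñez 2012); purely to make problem P concrete, the following numpy sketch evaluates its objective for a candidate mixed strategy a over compact strategies and improves it by crude random search over the simplex. All payoff and effectiveness numbers are invented:

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 1.5                                  # QR parameter lambda
A = np.array([[0.5, 0.3], [0.9, 0.1]])     # A[j, i]: effectiveness of strategy j on target i
Rd, Pd = np.array([150.0, 60.0]), np.array([-50.0, -60.0])
Ra, Pa = -Pd, -Rd                          # zero-sum attacker payoffs

def objective(a):
    x = a @ A                              # marginal coverage x_i = sum_j a_j A_ij
    attacker_eu = x * Pa + (1 - x) * Ra    # G^a_i(x_i)
    qr = np.exp(lam * attacker_eu)
    qr /= qr.sum()                         # QR(i | lambda), a softmax over targets
    return float(qr @ ((Rd - Pd) * x + Pd))

best_a, best_val = None, -np.inf
for _ in range(20000):                     # crude random search over the simplex
    a = rng.dirichlet(np.ones(A.shape[0]))
    val = objective(a)
    if val > best_val:
        best_a, best_val = a, val
print(best_a, best_val)
```

A real solver exploits the problem's structure as PASAQ does; the sketch only shows how the objective ties QR(i|λ) to the marginal coverage x.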


Figure 2. Flow Chart of the PROTECT System. [Graph → Compact Strategies → Game Matrix → (PASAQ) → Probability Distribution over Compact Strategies → (Expand Compact Strategies with Probability > 0) → (Sampling) → Final Patrol Schedule for USCG.]

t_i      Target i
R^d_i    Defender reward on covering t_i if it is attacked
P^d_i    Defender penalty on not covering t_i if it is attacked
R^a_i    Attacker reward on attacking t_i if it is not covered
P^a_i    Attacker penalty on attacking t_i if it is covered
A_{ij}   Effectiveness probability of compact strategy j on t_i
a_j      Probability of choosing compact strategy j
J        Total number of compact strategies
x_i      Marginal coverage on t_i

Table 3. PASAQ Notation as Applied to PROTECT.


Quantal Response

Quantal response equilibrium is an important model in behavioral game theory that has received widespread support in the literature in terms of its superior ability to model human behavior in simultaneous-move games (McKelvey and Palfrey 1995) and is the baseline model of many studies (Wright and Leyton-Brown 2010). It suggests that instead of strictly maximizing utility, individuals respond stochastically in games: the chance of selecting a nonoptimal strategy increases as the cost of such an error decreases. However, the applicability of these models to Stackelberg security games had not been explored previously. In Stackelberg security games, we assume that the attacker acts with bounded rationality; the defender is assisted by software, and thus we compute the defender's optimal rational strategy (Yang et al. 2011).

Given the strategy of the defender, the quantal best response of the attacker is defined as

q_i = e^{λ G^a_i(x_i)} / Σ_{j=1}^{T} e^{λ G^a_j(x_j)}    (2)

The parameter λ represents the amount of noise in the attacker's strategy. λ can range from 0 to ∞, with a value of 0 representing a uniform random probability over attacker strategies and a value of ∞ representing a perfectly rational attacker. q_i corresponds to the probability that the attacker chooses target i; G^a_i(x_i) corresponds to the attacker's expected utility of attacking target i given x_i, the probability that the defender covers target i; and T is the total number of targets.

Consider the Stackelberg game with the payoffs shown earlier in the sidebar table. Assume that the leader commits to playing b. The follower obtains a payoff of 0 by playing c and a payoff of 2 by playing d. A rational attacker will play d to maximize his payoff. The quantal best response of the attacker, however, is to play c with probability e^0 / (e^0 + e^{2λ}) and d with probability e^{2λ} / (e^0 + e^{2λ}).
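A few lines of Python (ours) reproduce the sidebar's numbers; the quantal response is simply a softmax over the attacker's expected utilities, scaled by λ:

```python
import math

def quantal_response(utilities, lam):
    """Equation 2: q_i proportional to exp(lambda * G^a_i(x_i))."""
    weights = [math.exp(lam * u) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]

# Leader commits to b: the follower gets 0 for c and 2 for d (sidebar table).
for lam in (0.0, 1.0, 10.0):
    print(lam, quantal_response([0, 2], lam))
# lam = 0 gives the uniform [0.5, 0.5]; large lam approaches the perfectly
# rational response [0, 1] (always play d).
```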

USCG experts (with expertise in terrorist behavior modeling) suggested that we could use a broad range for representing the possible λ values used by the attacker. Combining the above observations, it was determined that the attacker's strategy is best modeled with a λ value in the range [0.5, 4], rather than a single point estimate. A discrete sampling approach was then used to select λ = 1.5: the defender considers different assumptions about the attacker's λ value, and for each assumption computes her expected utility against attackers with different λ values within the range [0.5, 4]. We find that when the defender assumes the attacker is using the QR model with λ = 1.5, the defender's strategy leads to the highest defender expected utility when the attacker follows the QR model with a λ value uniformly randomly chosen from the range [0.5, 4]. Selecting an appropriate value for λ remains a complex issue, however, and it is a key agenda item for future work.
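The selection procedure just described can be sketched as a small grid search, reusing the toy model from the earlier PASAQ sketch (again, invented numbers, not PROTECT's payoff data):

```python
import numpy as np

rng = np.random.default_rng(1)
A = np.array([[0.5, 0.3], [0.9, 0.1]])     # toy effectiveness matrix (assumed)
Rd, Pd = np.array([150.0, 60.0]), np.array([-50.0, -60.0])
Ra, Pa = -Pd, -Rd

def defender_eu(a, lam_attacker):
    """Defender expected utility when the attacker quantal-responds with lam_attacker."""
    x = a @ A
    qr = np.exp(lam_attacker * (x * Pa + (1 - x) * Ra))
    qr /= qr.sum()
    return float(qr @ ((Rd - Pd) * x + Pd))

def best_strategy(lam_assumed, trials=5000):
    """Crudely optimize the defender strategy under an assumed attacker lambda."""
    cands = rng.dirichlet(np.ones(A.shape[0]), size=trials)
    return max(cands, key=lambda a: defender_eu(a, lam_assumed))

attacker_lams = rng.uniform(0.5, 4.0, size=200)    # uncertainty over the attacker's lambda
for lam_assumed in (0.5, 1.0, 1.5, 2.0, 4.0):      # discrete candidate grid
    a = best_strategy(lam_assumed)
    avg = np.mean([defender_eu(a, l) for l in attacker_lams])
    print(lam_assumed, round(avg, 2))
```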

Evaluation

This section presents evaluations based on experiments completed through simulations and real-world patrol data, along with USCG analysis. All scenarios and experiments, including the payoff values and the graph (composed of nine patrol areas), were based on the Port of Boston. The defender's payoff values have a range of [–10, 5] while the attacker's payoff values have a range of [–5, 10]. The game was modeled as a zero-sum game in which the attacker's loss or gain is balanced precisely by the defender's gain or loss. For PASAQ, the defender's strategy is computed assuming that the attacker follows the QR model with λ = 1.5, as justified in the previous section. All experiments were run on a machine with an Intel Dual Core 1.4 GHz processor and 2 GB of RAM.

Memory and Run-Time Analysis

This section presents simulation results showing the efficiency in memory and run time of the compact representation versus the full representation. In figure 3a, the x-axis is the maximum patrol time allowed and the y-axis is the memory needed to run PROTECT. In figure 3b, the x-axis is the maximum patrol time allowed and the y-axis is the run time of PROTECT. The maximum patrol time allowed determines the number of combinations of patrol areas that can be visited, so the x-axis indicates a scale-up in the number of defender strategies. When the maximum patrol time is set to 90 minutes, the full representation takes 30 seconds to run and uses 540 MB of memory, while the compact representation takes 11 seconds to run and requires 20 MB of memory. Due to the exponential increase in the memory and run time needed for the full representation, it cannot be scaled up beyond 90 minutes.


Utility Analysis

It is useful to understand whether PROTECT using PASAQ with λ = 1.5 provides an advantage when compared to: (1) a uniformly random defender strategy; (2) a mixed strategy computed under the assumption that the attacker attacks any target uniformly at random (λ = 0); or (3) a mixed strategy assuming a fully rational attacker (λ = ∞). The previously existing DOBSS algorithm was used for λ = ∞ (Paruchuri et al. 2008). Comparison with the λ = ∞ approach is particularly important because of the extensive use of this assumption in previous applications (for our zero-sum case, DOBSS is equivalent to minimax, and the utility does not change). Typically, we may not have an estimate of the exact value of the attacker's λ, only a possible range. Therefore, ideally we would wish to show that PROTECT (using λ = 1.5 in computing the optimal defender strategy) provides an advantage over a range of λ values assumed for the attacker (not just over a point estimate) in his best response, justifying our use of the PASAQ algorithm. In other words, we are distinguishing between (1) the actual λ value employed by the attacker in best-responding, and (2) the λ assumed by PASAQ in computing the defender's optimal mixed strategy. The point is to see how sensitive the choice of (2) is with respect to prevailing uncertainty about (1).

To achieve this, we compute the average defender utility of the four approaches above as the λ value of the attacker's strategy varies over [0, 6], which subsumes the range [0.5, 4] of reasonable attacker strategies. In figure 4, the y-axis represents the defender's expected utility and the x-axis is the λ value used for the attacker's strategy. Both uniform random strategies perform well when the attacker's strategy is based on λ = 0. However, as λ increases, both strategies quickly drop to a very low defender expected utility. In contrast, the PASAQ strategy with λ = 1.5 provides a higher expected utility than the strategy assuming a fully rational attacker over a range of attacker λ values (and indeed over the range of interest), not just at λ = 1.5.

Robustness Analysis

In the real world, observation, execution, and payoffs are not always perfect, due to the following: noise in the attacker's surveillance of the defender's patrols, the many tasks and responsibilities of the USCG, where the crew may be pulled off a patrol, and limited knowledge of the attacker's payoff values. Our hypothesis is that PASAQ with λ = 1.5 is more robust to such noise than a defender strategy that assumes full rationality of the attacker, such as DOBSS (Paruchuri et al. 2008); that is, PASAQ's expected defender utility will not degrade as much as DOBSS's over the range of attacker λ of interest. This is illustrated by comparing both PASAQ and DOBSS under observation, execution, and payoff noise (Kiekintveld, Marecki, and Tambe 2011; Korzhyk, Conitzer, and Parr 2011; Yin et al. 2011). Intuitively, the QR model is more robust than models assuming perfect rationality, since the QR model assumes that the attacker may attack multiple targets with positive probability, rather than attacking the single best target as in a perfect-rationality model. Such intuition has been verified in other contexts (Rogers, Palfrey, and Camerer 2009). All experiments were run by generating 200 samples with added noise and averaging over all the samples.


Figure 3. Comparison of Full Versus Compact Representation. [Panel A: memory (MB) versus maximum patrol time (minutes); Panel B: run time (seconds) versus maximum patrol time (minutes); each panel compares the full and compact representations.]


Figure 4. Defender's Expected Utility When Varying λ for the Attacker's Strategy. [Defender expected utility versus attacker λ value, comparing PASAQ (λ = 1.5), DOBSS, Uniform Random (Attacker), and Uniform Random (Defender).]


Figure 5 shows the performance of the different strategies under execution noise. The y-axis represents the defender's expected utility and the x-axis is the attacker's λ value. If the defender covered a target with probability p, this probability now changes to lie in [p – x, p + x], where x is the noise. Low execution error corresponds to x = 0.1, whereas high error corresponds to x = 0.2. In the experiments, the attacker best-responds to the mixed strategy with added noise. The key takeaway is that execution error leads to PASAQ dominating DOBSS over all tested values of λ, further strengthening the case for using PASAQ rather than a full-rationality model. For both algorithms, the defender's expected utility decreases as more execution error is added, because the defender's strategy is affected by the additional error. When execution error is added, PASAQ dominates DOBSS because the latter seeks to maximize the minimum defender expected utility, so multiple targets will have the same minimum defender utility. For DOBSS, when execution error is added, there is a greater probability that one of these targets will have less coverage, resulting in a lower defender expected utility. For PASAQ, typically only one target has the minimum defender expected utility; as a result, changes in coverage do not affect it as much as DOBSS. As execution error increases, the advantage in defender expected utility of PASAQ over DOBSS grows even larger. This section only shows the execution noise results; the details of the observation and payoff noise results can be found in Shieh et al. (2012).
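As an illustration of the execution-noise experiment, the sketch below (toy numbers; a fully rational attacker stands in for the best-responder) perturbs each marginal coverage within ±x and averages the defender's utility over noise samples:

```python
import numpy as np

rng = np.random.default_rng(2)
Rd, Pd = np.array([150.0, 60.0, 10.0]), np.array([-50.0, -60.0, -5.0])
Ra, Pa = -Pd, -Rd                          # zero-sum attacker payoffs

def noisy_defender_eu(x, noise, samples=200):
    """Perturb coverage x_i within [x_i - noise, x_i + noise], let a fully
    rational attacker pick the best target, and average the defender's utility."""
    total = 0.0
    for _ in range(samples):
        xn = np.clip(x + rng.uniform(-noise, noise, size=x.shape), 0.0, 1.0)
        attacker_eu = xn * Pa + (1 - xn) * Ra
        i = int(np.argmax(attacker_eu))    # best response to the perturbed coverage
        total += xn[i] * Rd[i] + (1 - xn[i]) * Pd[i]
    return total / samples

x = np.array([0.6, 0.5, 0.2])              # some mixed-strategy marginal coverage
for noise in (0.0, 0.1, 0.2):              # no, low, and high execution error
    print(noise, round(noisy_defender_eu(x, noise), 2))
```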

USCG Real-World Evaluation

In addition to the data made available from simulations, the USCG conducted its own real-world evaluation of PROTECT. With permission, some aspects of the evaluation are presented in this article.

Real-World Scheduling Data

Unlike prior publications on real-world applications of game theory for security, a key novelty of this article is the inclusion of actual data from USCG patrols before and after the deployment of PROTECT at the Port of Boston. Figures 6 and 7 show the frequency of visits by USCG to different patrol areas over a number of weeks: figure 6 shows pre-PROTECT patrol visits per day by area, and figure 7 shows post-PROTECT patrol visits per day by area. The x-axis is the day of the week, and the y-axis is the number of times a patrol area is visited on a given day of the week. The y-axis is intentionally blurred for security reasons, as this is real data from Boston. There are more lines in figure 6 than in figure 7 because during the implementation of PROTECT, new patrol areas were formed that contained more targets, leaving fewer patrol areas in the post-PROTECT figure. Figure 6 depicts a definite pattern in the patrols: while there is a spike in patrols executed on day 5, there is a dearth of patrols on day 2. Beyond this pattern, the lines in figure 6 intersect, indicating that on some days a higher-value target was visited more often while on other days it was visited less often. This means that there was not a consistently high frequency of coverage of higher-value targets before PROTECT.

In figure 7, we notice that the pattern of low patrols on day 2 (from figure 6) disappears. Furthermore, the lines do not frequently intersect; that is, higher-valued targets are visited consistently across the week. The top line in figure 7 is the base patrol area, which is visited at a higher rate than all other patrol areas.

Adversarial Perspective Team (APT)

To obtain a better understanding of how the adversary views the potential targets in the port, the USCG created the Adversarial Perspective Team (APT), a mock attacker team. The APT provides assessments from the terrorist perspective and, as a secondary function, assesses the effectiveness of the patrol activities before and after deployment of PROTECT. In its evaluation, the APT incorporates the adversary's known intent, capabilities, skills, commitment, resources, and cultural influences. In addition, it screens attack possibilities and assists in identifying the level of deterrence projected at and perceived by the adversary. For the purposes of this research, the adversary is defined as an individual(s) with ties to al-Qa'ida or its affiliates.

The APT conducted a pre- and post-PROTECT assessment of the system's impact on an adversary's deterrence at the Port of Boston.


Figure 5. Defender's Expected Utility: Execution Noise. [Defender expected utility versus attacker λ value, comparing PASAQ (λ = 1.5) and DOBSS with no, low, and high execution error.]

Figure 6. Patrol Visits per Day by Area — Pre-PROTECT, One Line per Area. [Visit count versus day of week, day 1 through day 7.]


This analysis uncovered a positive trend: the effectiveness of deterrence increased from the pre- to post-PROTECT observations.

Additional Real-World Indicators

The use of PROTECT, together with the APT's improved guidance to boat crews on how to conduct patrols, provided a noticeable increase in the quality and effectiveness of the patrols. Prior to implementing PROTECT, there were no documented reports of illicit activity. After implementation, USCG crews reported more illicit activities within the port (supporting the effectiveness of the PROTECT model) and provided a noticeable "on the water" presence, with industry port partners commenting, "the Coast Guard seems to be everywhere, all the time." With no actual increase in the number of resources applied, and therefore no increase in capital or operating costs, these outcomes support the practical application of game theory in the maritime security environment.

Outcomes After Boston Implementation

After evaluating the performance and impact of PROTECT at Boston, the USCG viewed the system as a success. As a result, PROTECT is now being deployed in the Port of New York. We were presented an award for our work on the PROTECT system for Boston Harbor, reflecting USCG's recognition of the impact and value of PROTECT.

Lessons Learned: Putting Theory into Practice

Developing the PROTECT model was a collaborative effort involving university researchers and USCG personnel representing decision makers, planners, and operators. Building on the lessons reported in Pita et al. (2011) for working with security organizations, we informed the USCG of (1) the assumptions underlying the game-theoretic approaches, for example, full adversary rationality, and the strengths and limitations of different algorithms, rather than preselecting a simple heuristic approach; (2) the need to define and collect correct inputs for model development; and (3) a fundamental understanding of how the inputs affect the results. We gained three new insights involving real-world applied research: unforeseen positive benefits arise because security agencies are compelled to reexamine their assumptions; applied research requires working with multiple teams in a security organization at multiple levels of the hierarchy; and researchers need to prepare answers to end-user practical questions not always directly related to the "meaty" research problems.

The first insight came about when USCG was compelled to reassess its operational assumptions as a result of working through the research problem. A positive result of this reexamination prompted USCG to develop new PWCS mission tactics, techniques, and procedures.

Figure 7. Patrol Visits per Day by Area — Post-PROTECT, One Line per Area. [Visit count versus day of week; the top line is the base patrol area.]


Through the iterative development process, USCG reassessed the reasons why boat crews performed certain activities and whether they were sufficient. For example, instead of "covered" versus "not covered" as the only two possibilities at a patrol point, there are now multiple sets of activities at each patrol point.

The second insight is that applied research requires the research team to collaborate with planners and operators at multiple levels of a security organization to ensure the model accounts for all aspects of a complex real-world environment. Initially, when we started working on PROTECT, the focus was on patrolling each individual target. This appeared to micromanage the activities of boat crews, and it was through their input that individual targets were grouped into patrol areas associated with a PWCS patrol. Input from USCG headquarters and the APT mentioned earlier led to other changes in PROTECT, for example, departing from a fully rational model of an adversary to a QR model.

The third insight is the need to develop answers to end-user questions that are not always related to the "meaty" research question but are related to the larger knowledge domain on which the research depends. One example of the need to explain results involved a user citing that one patrol area was being repeated and that, hence, randomization did not seem to be occurring. After assessing this concern, we determined that the cause of the repeated visits to a patrol area was its high reward, an order of magnitude greater than those of the rarely visited patrol areas. PROTECT correctly assigned patrol schedules that covered the more "important" patrol areas more frequently. In another example, the user noted that PROTECT did not assign any patrols to start at 4:00 AM or 4:00 PM over a 60-day test period. They expected patrols to be scheduled to start at any hour of the day, leading them to ask if there was a problem with the program. This required us to develop a layman's briefing on probabilities, randomness, and sampling: with 60 patrol schedules, a few start hours may not be chosen, given our uniform random sampling of the start time. These practitioner-based issues demonstrate the need for researchers not only to be conversant in the algorithms and math behind the research, but also to be able to explain from a user's perspective why the solutions are correct. An inability to address these issues would result in a lack of real-world user confidence in the model.
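A back-of-the-envelope calculation (ours) makes the start-hour point concrete. If each of 60 patrols independently draws one of 24 start hours uniformly at random, then:

```python
p_missing = (23 / 24) ** 60      # probability a given start hour is never sampled in 60 draws
print(round(p_missing, 3))       # ~0.078
print(round(24 * p_missing, 1))  # expected number of start hours that never appear: ~1.9
```

So on average about two of the 24 possible start hours never appear in a 60-day sample, which is exactly the behavior the users observed.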

Summary and Related Work

This article reports on PROTECT, a game-theoretic system deployed by the USCG in the Port of Boston since April 2011 for scheduling its patrols. USCG has deemed the deployment of PROTECT in Boston a success, and efforts are underway to deploy PROTECT in the Port of New York and at other ports in the United States. PROTECT uses an attacker-defender Stackelberg game model and includes five key innovations.

First, PROTECT moves away from the assumption of perfect adversary rationality seen in previous work, relying instead on a quantal response (QR) model of the adversary's behavior. While the QR model has been extensively studied in the realm of behavioral game theory, to the best of our knowledge, this is its first real-world deployment. Second, to improve PROTECT's efficiency, we generate a novel compact representation of the defender's strategy space, exploiting equivalence and dominance. Third, the article shows how to practically model a real-world (maritime) patrolling problem as a Stackelberg game. Fourth, we provide experimental results illustrating that PROTECT's QR model of the adversary is better able to handle real-world uncertainties than a perfect-rationality model. Finally, for the first time in a security application evaluation, we use real-world data, providing a comparison of human-generated security schedules versus those generated through a game-theoretic algorithm, as well as results from an APT's analysis of the impact of the PROTECT system. The article also outlined the insights gained from the project, which include the ancillary benefits of a review of assumptions made by security agencies, and the need for knowledge to answer questions not directly related to the research problem.

As a result, PROTECT has advanced the state of the art beyond previous applications of game theory for security. Prior applications mentioned earlier, including ARMOR, IRIS, and GUARDS (Jain et al. 2010; Pita et al. 2011), have each provided unique contributions in applying novel game-theoretic algorithms and techniques. Interestingly, these applications have revolved around airport and air-transportation security. PROTECT's novelty lies not only in its application domain of maritime patrolling, but also in the five key innovations mentioned above, particularly its emphasis on moving away from the assumption of perfect rationality by using the QR model.

In addition to game-theoretic applications, the issue of patrolling has received significant attention in the multiagent systems literature. This includes work on robotic perimeter patrols in arbitrary topologies (Basilico, Gatti, and Amigoni 2009), simulated maritime patrols for deterring pirate attacks (Vanek et al. 2011), and research on the impact of uncertainty in adversarial behavior (Agmon et al. 2009). PROTECT differs from these approaches in its use of a QR model of a human adversary in a game-theoretic setting, and in being a deployed application. Building on this initial success of PROTECT, we hope to deploy it at more and much larger ports. In so doing, we will in the future consider significantly more complex attacker strategies, including potential real-time surveillance and coordinated attacks.

Acknowledgments

We thank the USCG offices, and particularly Sector Boston, for their exceptional collaboration. The views expressed herein are those of the authors and are not to be construed as official or reflecting the views of the Commandant or of the U.S. Coast Guard. This research was supported by the United States Department of Homeland Security through the National Center for Risk and Economic Analysis of Terrorism Events (CREATE) under award number 2010-ST-061-RE0001. This article extends and modifies our previous conference publication (Shieh et al. 2012).

Note

1. Creating optimal Stackelberg defender strategies that increase the attacker's difficulty of surveillance is an open research issue in the literature; here we choose to maximize unpredictability as the first step.
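As one deliberately simplified reading of "maximize unpredictability," the Shannon entropy of the defender's mixed strategy is a standard measure: the higher the entropy, the less an observing adversary learns from each patrol. The sketch below only illustrates the measure on hypothetical strategies; it is not the system's actual objective function:

```python
import math

def entropy(strategy):
    """Shannon entropy (in bits) of a mixed strategy over patrol schedules."""
    return -sum(p * math.log2(p) for p in strategy if p > 0)

# Two hypothetical mixed strategies over four patrol schedules.
predictable = [0.85, 0.05, 0.05, 0.05]   # easy for an observer to anticipate
well_mixed  = [0.40, 0.30, 0.20, 0.10]   # harder to anticipate

print(entropy(predictable))  # about 0.85 bits
print(entropy(well_mixed))   # about 1.85 bits
# The second strategy is more unpredictable: each observed patrol
# reveals less about which schedule will be run next.
```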

References

Agmon, N.; Kraus, S.; Kaminka, G. A.; and Sadov, V. 2009. Adversarial Uncertainty in Multi-Robot Patrol. In Proceedings of the 21st International Joint Conference on Artificial Intelligence, 1811–1817. Menlo Park, CA: AAAI Press.

Basilico, N.; Gatti, N.; and Amigoni, F. 2009. Leader-Follower Strategies for Robotic Patrolling in Environments with Arbitrary Topologies. In Proceedings of the 8th International Conference on Autonomous Agents and Multiagent Systems, 500–503. Richland, SC: International Foundation for Autonomous Agents and Multiagent Systems.

Blair, D. 2010. Annual Threat Assessment of the US Intelligence Community for the Senate Select Committee on Intelligence. Washington, DC: Senate Select Committee on Intelligence (www.dni.gov/testimonies/20100202 testimony.pdf).

Camerer, C. F. 2003. Behavioral Game Theory: Experiments in Strategic Interaction. Princeton, NJ: Princeton University Press.

Conitzer, V., and Sandholm, T. 2006. Computing the Optimal Strategy to Commit To. In Proceedings of the 7th ACM Conference on Electronic Commerce, 82–90. New York: Association for Computing Machinery.

Dozier, K. 2011. Bin Laden Trove of Documents Sharpen US Aim. New York: NBC News (www.msnbc.msn.com/id/43331634/ns/us news-security/t/bin-laden-trove-documents-sharpen-us-aim).

Fudenberg, D., and Tirole, J. 1991. Game Theory. Cambridge, MA: The MIT Press.

Gilpin, A. 2009. Algorithms for Abstracting and Solving Imperfect Information Games. Ph.D. Dissertation, Carnegie Mellon University, Pittsburgh, PA.

Jain, M.; Tsai, J.; Pita, J.; Kiekintveld, C.; Rathi, S.; Tambe, M.; and Ordóñez, F. 2010. Software Assistants for Randomized Patrol Planning for the LAX Airport Police and the Federal Air Marshal Service. Interfaces 40(4): 267–290.

Kiekintveld, C.; Marecki, J.; and Tambe, M. 2011. Approximation Methods for Infinite Bayesian Stackelberg Games: Modeling Distributional Uncertainty. In Proceedings of the 10th International Conference on Autonomous Agents and Multiagent Systems, 1005–1012. Richland, SC: International Foundation for Autonomous Agents and Multiagent Systems.

Korzhyk, D.; Conitzer, V.; and Parr, R. 2011. Solving Stackelberg Games with Uncertain Observability. In Proceedings of the 10th International Conference on Autonomous Agents and Multiagent Systems, 1013–1020. Richland, SC: International Foundation for Autonomous Agents and Multiagent Systems.

McKelvey, R. D., and Palfrey, T. R. 1995. Quantal Response Equilibria for Normal Form Games. Games and Economic Behavior 10(1): 6–38.

Paruchuri, P.; Pearce, J. P.; Marecki, J.; Tambe, M.; Ordóñez, F.; and Kraus, S. 2008. Playing Games with Security: An Efficient Exact Algorithm for Bayesian Stackelberg Games. In Proceedings of the 7th International Conference on Autonomous Agents and Multiagent Systems, 895–902. Richland, SC: International Foundation for Autonomous Agents and Multiagent Systems.

Pita, J.; Tambe, M.; Kiekintveld, C.; Cullen, S.; and Steigerwald, E. 2011. GUARDS — Game Theoretic Security Allocation on a National Scale. In Proceedings of the 10th International Conference on Autonomous Agents and Multiagent Systems, 37–44. Richland, SC: International Foundation for Autonomous Agents and Multiagent Systems.

Rogers, B. W.; Palfrey, T. R.; and Camerer, C. F. 2009. Heterogeneous Quantal Response Equilibrium and Cognitive Hierarchies. Journal of Economic Theory 144(4): 1440–1467.

Shieh, E.; An, B.; Yang, R.; Tambe, M.; Baldwin, C.; DiRenzo, J.; Maule, B.; and Meyer, G. 2012. PROTECT: A Deployed Game Theoretic System to Protect the Ports of the United States. In Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems. Richland, SC: International Foundation for Autonomous Agents and Multiagent Systems.

Tambe, M. 2011. Security and Game Theory: Algorithms, Deployed Systems, Lessons Learned. New York: Cambridge University Press.

Vanek, O.; Jakob, M.; Hrstka, O.; and Pechoucek, M. 2011. Using Multi-Agent Simulation to Improve the Security of Maritime Transit. Paper presented at the 12th International Workshop on Multi-Agent-Based Simulation (MABS), Taipei, Taiwan, 26 May.

von Stengel, B., and Zamir, S. 2004. Leadership with Commitment to Mixed Strategies. Technical Report LSE-CDAM-2004-01, CDAM Research Report. London: London School of Economics.

Wright, J., and Leyton-Brown, K. 2010. Beyond Equilibrium: Predicting Human Behavior in Normal Form Games. In Proceedings of the 24th AAAI Conference on Artificial Intelligence, 901–907. Menlo Park, CA: AAAI Press.

Yang, R.; Kiekintveld, C.; Ordóñez, F.; Tambe, M.; and John, R. 2011. Improving Resource Allocation Strategy Against Human Adversaries in Security Games. In Proceedings of the 22nd International Joint Conference on Artificial Intelligence, 458–464. Menlo Park, CA: AAAI Press.

Yang, R.; Tambe, M.; and Ordóñez, F. 2012. Computing Optimal Strategy Against Quantal Response in Security Games. In Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems. Richland, SC: International Foundation for Autonomous Agents and Multiagent Systems.

Yin, Z.; Korzhyk, D.; Kiekintveld, C.; Conitzer, V.; and Tambe, M. 2010. Stackelberg Versus Nash in Security Games: Interchangeability, Equivalence, and Uniqueness. In Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems, 1139–1146. Richland, SC: International Foundation for Autonomous Agents and Multiagent Systems.

Yin, Z.; Jain, M.; Tambe, M.; and Ordóñez, F. 2011. Risk-Averse Strategies for Security Games with Execution and Observational Uncertainty. In Proceedings of the 25th AAAI Conference on Artificial Intelligence, 758–763. Menlo Park, CA: AAAI Press.

Young, S., and Orchard, D. 2011. Remembering 9/11: Protecting America's Ports. Coast Guard Compass: Official Blog of the U.S. Coast Guard (6 September) (coastguard.dodlive.mil/2011/09/remembering-911-protecting-americas-ports).

Bo An is a postdoctoral researcher at the University of Southern California. His research interests include artificial intelligence, multiagent systems, game theory, and automated negotiation. He is the recipient of the 2010 IFAAMAS Victor Lesser Distinguished Dissertation Award. He also won an Operational Excellence Award from the Commander, First Coast Guard District, and the AAMAS 2012 Best Innovative Application Paper Award. He received a Ph.D. degree in computer science from the University of Massachusetts, Amherst.

Eric Shieh is a Ph.D. student at the University of Southern California. His research focuses on multiagent systems and game theory in real-world applications. Prior to joining the doctoral program, he worked as a software engineer for Northrop Grumman. He obtained his M.S. in computer science from the University of Southern California (2005) and B.S. in computer science from Carnegie Mellon University (2003).

Rong Yang is a Ph.D. student in the Department of Computer Science at the University of Southern California. Her research interests include artificial intelligence, game theory, and human decision making. She received her M.S. degree from the Department of Electrical and Computer Engineering at the Ohio State University and B.S. degree from Tsinghua University.

Milind Tambe is a professor of computer science at the University of Southern California (USC). He leads the TEAMCORE Research Group at USC, with research focused on agent-based and multiagent systems. He is a fellow of AAAI, recipient of the ACM/SIGART Autonomous Agents Research Award, as well as recipient of the Christopher Columbus Fellowship Foundation Homeland Security Award and the influential paper award from the International Foundation for Autonomous Agents and Multiagent Systems. He received his Ph.D. from the School of Computer Science at Carnegie Mellon University.

Craig Baldwin is a research scientist and senior program manager at the USCG Research and Development Center, where he manages a portfolio of research projects involving the application of game theory to improving port security; analysis and modeling of Coast Guard command, control, communications, computers, intelligence, surveillance, and reconnaissance (C4ISR) systems; and major systems acquisition alternatives analyses, such as the Offshore Patrol Cutter C4ISR systems alternative analysis.

Joe DiRenzo III is the chief of the Operations Analysis Division at Coast Guard Atlantic Area, Portsmouth, Virginia. In this role he directs all risk, operations analysis, modeling, and innovation engagement for the command. DiRenzo is a retired Coast Guard officer and former cutter commanding officer. He also serves as the Coast Guard Chair at the Joint Forces Staff College in Norfolk, VA. The author of 300 published articles, DiRenzo was the runner-up for the Naval Institute's Author of the Year in 2010. Additionally, he is the national codirector of the Joint-University of Southern California National Maritime Risk Symposium. Finally, DiRenzo is a professor of graduate intelligence studies at American Military University.

Lieutenant Commander Ben Maule is the lead operations research analyst for the Operations Analysis Division at Coast Guard Atlantic Area in Portsmouth, Virginia. Maule directs all research projects for the operations analysis division. He received his B.S. in mechanical engineering from the Coast Guard Academy and his M.S. in industrial engineering from Purdue University. Maule is a Coast Guard aviator and has flown Coast Guard helicopters in Kodiak, Alaska, and Traverse City, Michigan.

Lieutenant Garrett Meyer is the chief of the Incident Management Division at U.S. Coast Guard Sector Boston, where he is responsible for managing the Coast Guard's Ports and Waterways Coastal Security, Law Enforcement, Maritime Environmental Response, Search and Rescue, and All-Hazards Response missions in the Port of Boston and along the shores of Massachusetts. He has a B.S. and a master's degree in management from the United States Coast Guard Academy and the University of Phoenix, respectively.

