Randomized Testing for Robotic Plan Execution for Autonomous Systems

Zeyn Saigol*, Frederic Py†, Kanna Rajan†, Conor McGann‡, Jeremy Wyatt* and Richard Dearden*

* School of Computer Science, University of Birmingham, UK

Email: {zas,jlw,rwd}@cs.bham.ac.uk

† Monterey Bay Aquarium Research Institute, Moss Landing, California

‡ Wizbots, LLC, San Carlos, California

Email: [email protected] Email: {fpy,kanna.rajan}@mbari.org

Abstract: Autonomous underwater vehicles (AUVs) are commonly used for carrying out pre-planned oceanographic surveys, but there is increasing interest in optimizing these surveys by performing onboard replanning. MBARI has developed an advanced AUV control system, the Teleo-Reactive EXecutive (T-REX), that enables the vehicle to survey areas in more detail if biogeochemical markers indicate the presence of a target feature, and even to follow dynamic ocean phenomena such as fronts. T-REX uses artificial intelligence (AI) techniques in constraint-based temporal planning together with a layered control architecture that allows plans to be generated and executed onboard.

One challenge of onboard plan synthesis and execution is that the power of the system to generate different behaviors makes it hard to test in simulation, and failures at sea are costly. We introduce a randomized Monte-Carlo method based test approach that executes hundreds of simulated missions with each mission presenting different inputs to the planner, and checks each output plan for validity. The approach sets environmental parameters to exercise T-REX's domain model, and it is fully configurable. We describe how the Monte-Carlo tester integrates with T-REX, how we have incorporated it into our testing process, and the benefits for system reliability that have resulted. We also highlight our experiences in discovering bugs both in simulation and for science surveys in waters off Northern California.

I. INTRODUCTION

In oceanography, autonomous underwater vehicles (AUVs) have emerged recently as cost-effective and capable robotic vehicles. They have sufficient power and payload capacity to support the diverse suite of advanced sensors required to resolve interacting physical, chemical, biological and geological phenomena. They are being used to study transient and rapidly evolving events in coastal waters that are spatially and temporally unpredictable. Fig. 1 shows our AUV platform, which can operate to depths of 1500 m.

Until recently, most AUV control systems [1] were a variant of the reactive Subsumption-based architecture [2], relying on manually scripted plans generated a priori. The controller is responsive to its immediate environment (e.g., passing through a front with a temperature gradient, detecting an obstacle in the vehicle's path), but generates commands disregarding impacts to future actions or state. This prevents the substantial adaptation of mission structure essential to improving operation in a dynamic environment and to pursuing unanticipated science opportunities.

Fig. 1. The MBARI Dorado AUV being deployed from its support vessel, the R/V Zephyr.

Fig. 2. Coastal ocean phenomena targeted for studies using adaptive control on a robust operational AUV. Along the black surface tracks from a Sept. '07 mission, the AUV executed a vertical Yo-Yo to map the water column in high resolution for key phenomena such as fronts, intermediate nepheloid layers (INLs), and phytoplankton blooms and patches. Image courtesy: John Ryan, MBARI.

Many of the complex multi-disciplinary phenomena we seek to understand in coastal waters have unpredictable spatial and temporal expressions. For example, Fig. 2 shows a representation of three selected phenomena of interest observed simultaneously in the region of observation: fronts, Intermediate Nepheloid Layers and blooms. Each of these phenomena can occur over a wide range of scales, from thousands of kilometers (fronts and blooms) down to tens/hundreds of meters, and each is characterized by strong spatio-temporal dependence.

To mitigate the technical shortcomings of previous AUV controllers, and to study such dynamic processes efficiently and cost-effectively, we have developed and deployed an onboard adaptive control system that integrates artificial intelligence (AI) based planning and probabilistic state estimation in a hybrid executive. Previously reported work presented the core principles underlying our system, the Teleo-Reactive EXecutive (T-REX) [3], and the integration of this system with probabilistic state estimation [4]. Using such a robotic device, scientists have been able to sample precisely within features of interest in the coastal ocean for the first time, and do so fully autonomously.

Testing the entirety of an autonomous system which deals with dynamic and uncertain environmental conditions is challenging. Plan synthesis can fail, incorrect plans can be generated, or execution-time failure can occur, either due to an incorrect model or due to unexpected exogenous observations. Because our plans represent a multitude of execution traces, comparing plans textually is conceptually infeasible, if not incorrect. Variations in a [illegible], for instance, can lead to a qualitatively different plan. Secondly, for AI planners a driving factor in how to evaluate robustness of execution is coverage of possible values of all variables which represent valid plans [5]. However, such metrics are hard to determine given the flexibility in our plans. An additional complication can arise when planning and execution are intertwined, such as in T-REX: because the system can recover from off-nominal conditions with dynamic replanning, situations can arise whereby repeated replanning occurs to compensate for shortcomings in model coverage. Finally, since we are dealing with environmental response to plan execution by a robotic [illegible], the stochastic nature of the [illegible] is often hard to predict.

In such cases, model checking [6] as a means to test plan and execution variations for coverage cannot tackle the complexity of real-world applications [7]. Random testing [8], on the other hand, has been shown to perform as well as, and frequently better than, structured testing for many types of programs [9], [10]. It has been applied to testing software for autonomous [illegible] by [11], [12].

We have developed a technique in which the test harness [illegible] possible trajectories off-line to generate a variety of states which the vehicle can enter. In some ways our technique is similar to a Monte-Carlo simulation, as we are producing a series of random environmental observations, which are used as input to the planner, and producing an aggregated result from the series of runs. Our solution is novel in that we apply the methodology to an AI system with a continuous input space, and hard-to-specify [illegible] behavior for a given sequence of inputs. Since the planner we use allows a variety of specific trajectories to be executed, the AUV can end up in a variety of states.

The rest of this paper is organized as follows: in Section II, we provide more background on our system to motivate our testing methodology. Section III covers the testing solution we developed, giving details of the harness and the process of testing. Section IV describes the validation methods for the testing software, including field results at sea. Finally, we conclude in Section V and discuss future work.

II. BACKGROUND

T-REX is an adaptive, artificial intelligence based controller that provides a [illegible] framework for building reasoning systems for real-world autonomous vehicles. Its development has been targeted at surveying oceanographic features which are dynamic and spatio-temporally unpredictable.

To enable the necessary responsiveness to a changing environment, the T-REX agent synthesizes plans in-situ. Planning and execution are interleaved. For plan synthesis we use the [illegible] constraint-based EUROPA2 planner, which has a demonstrated NASA space mission legacy [13], [14], most recently for the Mars Exploration Rovers [15], [16]. Our autonomy architecture brings three key innovations for AUV adaptation: the use of flexible plan representations, [illegible] control, and off-line learning to inform on-board state estimation. In this section we detail some of the key concepts relevant to flexible plan representation which motivate our approach to testing. Details of [illegible] control and state estimation are outside the scope of this paper and can be found in [3] and [4].

A T-REX agent uses the Sense-Plan-Act (SPA) paradigm to convert partial plans describing a set of desired states into actions. Embedded within the agent is a EUROPA2 planner, used to resolve flaws that have no causal support in the partial plan structure (for example, "The AUV has to be at coordinate X but it is currently at Y"). The task of the agent is to ensure full support for the [illegible] plan it has generated and to dispatch this plan incrementally for execution.

The EUROPA2 planner uses a domain model written in a declarative language (called New Domain Description Language, or NDDL), together with initial conditions and goals, to construct a set of temporal relations that must be true at the start time to constitute a plan [17]. These NDDL models include [illegible] about the physics of the vehicle, i.e. how it responds to external stimulus and internally driven goals. By propagating these relations forward using Simple Temporal Networks [18] and applying goal constraints, EUROPA2 can select a set of conditions that should be true in the future, where some of these conditions will correspond to actions the agent must take. The planner can backtrack and try another path during search if a goal cannot be reached. It is also capable of discarding unachievable goals. Fig. 3 gives an abstract view of how model-based planners are constructed. Our emphasis in the paper is focused on testing the domain model.

Fig. 3. Architectural block diagram of most model-based planners. Our emphasis in this paper is on testing the domain model.

Fig. 4. Tokens with flexible temporal intervals and constraints between tokens on the same or other timelines. Token durations can be [illegible], such as shown for the action to [illegible], to represent uncontrollability of state achievement.

EUROPA2 uses timelines to represent the evolution of the world state; atomic entities that are instantiated on these timelines to represent current state are called tokens. The evolution of state is described by time point variables within tokens which encode flexible token start and end. Fig. 4 shows a partial plan with tokens and constraints.

Fig. 5 shows an illustrative example of the development of a plan using four timelines: Goal, Path, Navigation and Command. Tokens on the Goal timeline represent mission goals, and are decomposed recursively into subgoals which are instantiated by tokens on lower timelines. A representative example of a model for AUV volume surveys follows:

    VolumeSurvey(X1, Y1, Xn, Yn, MinDepth, MaxDepth)

    met_by Go(X1, Y1, MinDepth);

    starts [illegible] Go(X2, Y2, MinDepth, MaxDepth);

    [illegible] Go(Xn-1, Yn-1, MinDepth, MaxDepth);

    ends Go(Xn, Yn, MinDepth, MaxDepth);

Unlike traditional time-tagged command sequences, such flexible plans leave room for adaptation at execution time. When the executive considers when to start a task, it propagates information through the constraint network, computes a time bound for variables representing start-times, selects an actual execution time within the bound, and starts the task at that time. Such flexible plans therefore express a range of possible outcomes of the robot's interaction with the environment, within which the executive can elect at run time the most appropriate one for the actual execution conditions. The fact that constraints are explicitly represented ensures that through constraint propagation the executive will respect global limits expressed in the plan (e.g. "don't start a task until a certain condition has been satisfied").
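As a rough illustration of how a time bound for a start-time variable can be derived from a constraint network, the sketch below treats a small plan as a Simple Temporal Network and tightens it with the Floyd-Warshall algorithm. This is a generic textbook formulation, not the EUROPA2 or T-REX implementation; the function names, the time-point numbering and the example constraints are all hypothetical.

```python
import math


def stn_distances(n, constraints):
    """Tighten a Simple Temporal Network with Floyd-Warshall.

    constraints: list of (i, j, lo, hi), meaning lo <= t_j - t_i <= hi.
    Returns the all-pairs distance matrix, or None if the constraints
    are inconsistent (a negative cycle exists).
    """
    dist = [[0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for i, j, lo, hi in constraints:
        dist[i][j] = min(dist[i][j], hi)    # t_j - t_i <= hi
        dist[j][i] = min(dist[j][i], -lo)   # t_i - t_j <= -lo
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    if any(dist[i][i] < 0 for i in range(n)):
        return None
    return dist


def time_window(dist, origin, timepoint):
    """Earliest and latest value of `timepoint`, measured from `origin`."""
    return -dist[timepoint][origin], dist[origin][timepoint]


# Hypothetical plan: time point 0 is the mission origin, 1 and 2 are the
# start and end of task A, and 3 is the start of task B.
EXAMPLE_CONSTRAINTS = [
    (0, 1, 0, 10),  # A starts within 10 time units of the origin
    (1, 2, 5, 8),   # A lasts between 5 and 8 time units
    (2, 3, 0, 0),   # B starts exactly when A ends
    (0, 3, 0, 12),  # B must start no later than time 12
]

dist = stn_distances(4, EXAMPLE_CONSTRAINTS)
earliest, latest = time_window(dist, 0, 1)
chosen_start = earliest  # one possible policy: dispatch as early as allowed
print(earliest, latest)
```

In this example the deadline on task B propagates backwards: task A could nominally start as late as time 10, but the tightened window for its start is [0, 7]. Any start time chosen inside that window keeps the rest of the plan satisfiable, which is the property a flexible plan relies on.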

This flexibility at plan and execution time has motivated the development of a more [illegible] testing solution, to thoroughly exercise the possible paths the software can take both during plan synthesis as well as plan execution.

III. TEST SYSTEM DESIGN

Our testing infrastructure is built around simulators with increasing degrees of fidelity. At the lower end we have what we call a pseudo-simulator (or P-Sim), which has an approximation of vehicle dynamics within the T-REX domain model. The P-Sim contains several parameters that can be changed for each test, to capture external conditions such as the [illegible] ocean current and sea state. Being faster than real-time, this test environment allows the test analyst a first cut at testing the full system; bug fixing can be iterative and fast.

At the next level, a Q-Sim involves running a physics-based simulator based on [20] which accurately mimics the vehicle dynamics. T-REX runs on a client and connects to the Q-Sim running on an embedded QNX platform via a TCP socket; this configuration is similar to the [illegible] hardware. Even though the Q-Sim provides accurate [illegible] of the oceanic and vehicular environment, because it runs in real-time, our long-[illegible] scientific surveys (in excess of 6 hours) require dedicated overnight runs. So tests in this environment are run [illegible].

Finally, we use the AUV on the bench in our lab (or V-Sim) in a configuration that allows us to use the high-fidelity simulator running on the [illegible] platform and running in concert with a selective set of sensors turned on. This is done to test data pathways from the actual sensors as well as to ensure system performance in the native hardware configuration. It is run prior to our at-sea tests. The existing T-REX testing infrastructure and the extension being discussed in this paper are shown in Fig. 6.

Domain models change substantially between applications; the model for chasing fronts is different from volume surveys even if they both rely on a substantially large common code-base. Each science mission that we use T-REX for will have its own set of goals, geographic starting point, desired sensor actions and time-frame. These settings and goals are developed and integrated into the NDDL domain model over the course of several weeks, and the mission is subjected to extensive on-shore testing. However, missions at sea do occasionally fail, either because of a hardware or interface configuration problem, or because a condition is observed that leads to T-REX being unable to create a plan for the remainder of the mission. Our focus is on the latter, so our current [illegible], the subject of this paper, has been to augment [illegible] using the P-Sim.

Fig. 5. An illustrative plan synthesis example with concurrent timelines. Abstract [illegible] are decomposed into successively less abstract tokens and instantiated in co-temporal timelines using Allen Algebra [19] relations. All tokens represent flexible start/end times. [illegible] shows an initial state evolving into [illegible] and [illegible].

Fig. 6. T-REX test architecture. The randomized test controller is shown at the bottom in [illegible] and is the subject of this paper.

This random testing approach runs multiple missions using only the P-Sim, which was augmented to allow parameters to be set at run-time. P-Sim parameters are part of the domain model written in NDDL, so we created a new NDDL constraint (sampledParameter) that allows us to set NDDL variables using C++ code that interacts with the EUROPA2 engine. Fig. 7 shows a section of NDDL code, where the sampledParameter constraint is used to set one of the key parameters, the GPS hit rate.

A. Key Parameters

In our domain, there are six key parameters representing system inputs that model the dynamic environment and its uncertainty, towards which T-REX has to be responsive. Fig. 8 visualizes AUV transects (as seen from above the sea surface), showing the impact these parameters have on vehicle [illegible]. The differing trajectories (indicative of different real-world conditions) are a result of qualitatively different plans generated and executed by T-REX. It is also an indication that our random testing approach can be effective for testing our system.

    GPS::Active {
        float gpsHitRate, expectedDuration;

        gpsHitRate = sampledParameter(GPS_HIT_RATE);
        expectedDuration == gpsHitRate + minHits;
        duration <= expectedDuration;

        met_by(Inactive p);
        ends(MotionSimulator.Holds m);
    }

Fig. 7. An NDDL code fragment, taken from the [illegible] predicate that simulates obtaining a GPS fix. Note the declarative temporal relations: for example, met_by(Inactive p) will instantiate an Inactive token that [illegible] the parent predicate.

The parameters and their ranges are:

• The GPS hit rate (gpsHitRate), which determines the variability of obtaining a GPS fix to localize the vehicle on the sea surface. Lower hit-rates [illegible] rougher sea states, therefore requiring the plan to be adjusted for longer periods on the surface. Since this parameter does not affect vehicle trajectory, only the time to complete a mission, it is not shown in Fig. 8. It ranges in value from 0.1 to 4.0 hits per second.

• clusterId is the summary statistic used to determine if the vehicle is inside a feature of interest [21]. Variations in this variable can cause a Hidden Markov Model to [illegible] dynamic replanning [4]. The scientific sensor readings were grouped into 39 clusters, where each cluster was associated with a specific probability of being inside a feature. For testing we a priori assign values of sensor readings to one of 39 clusters [illegible] towards INLs.

• errorRateX and errorRateY delineate the vehicle navigation error in meters per second, in northings and eastings (using standard navigation terminology). As the AUV is able to [illegible] only when on the surface, this error models factors such as [illegible] error and drift when running open loop in the water. This error will manifest only when a GPS fix is obtained, and ranges from -0.9 to 0.9 meters per second.

• dxNoise and dyNoise are used to model instantaneous large corrections due to uneven currents, which often cause the vehicle to be pushed off course. We use a range from -0.6 to 0.6 meters per second.

Fig. 8. Plots of the vehicle path for a specific mission, showing the effect that altering each of the parameters has using the P-Sim. It also shows the difference between inter-mission (left-side figures) and intra-mission (right-side) sampling for the errorRateX/Y (8b) and dx/dyNoise (8c) parameters. Note that the bottom-right diagram with intra-mission dx/dyNoise sampling is similar to the standard diagram (top-left), albeit with different transects.

The ranges for sampling these parameters were chosen to cover the most extreme values found at sea.
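To make the sampling ranges concrete, the following minimal sketch draws one value for each key parameter. The numeric ranges are the ones quoted in this section; the dictionary layout, the function name and the 0-based cluster numbering are assumptions made for illustration, and the uniform draw over clusters is a simplification rather than the assignment scheme used by the authors.

```python
import random

# Ranges quoted in the text. The data layout is a hypothetical choice.
PARAMETER_RANGES = {
    "gpsHitRate": (0.1, 4.0),    # hits per second
    "errorRateX": (-0.9, 0.9),   # meters per second
    "errorRateY": (-0.9, 0.9),   # meters per second
    "dxNoise": (-0.6, 0.6),      # meters per second
    "dyNoise": (-0.6, 0.6),      # meters per second
}
CLUSTER_COUNT = 39  # clusterId is discrete; numbering 0..38 is assumed


def sample_parameters(rng):
    """Draw one value per parameter, uniformly within its range."""
    values = {
        name: rng.uniform(low, high)
        for name, (low, high) in PARAMETER_RANGES.items()
    }
    values["clusterId"] = rng.randrange(CLUSTER_COUNT)
    return values


rng = random.Random(2010)  # fixed seed so the draw is repeatable
print(sample_parameters(rng))
```

Passing an explicit `random.Random` instance, rather than using the module-level generator, keeps each draw reproducible from its seed.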

B. System Details

The test system has two primary components, a batch runner and a Monte-Carlo controller, both driven by configuration files. The batch runner reads a configuration file that specifies how each parameter should be varied and then writes a separate configuration file for each mission it spawns. It then runs several missions simultaneously on our multi-processor Linux test servers. Each Monte-Carlo controller in turn reads its configuration file, and then launches a P-Sim. Fig. 9 shows a high-level schematic of the system.

Fig. 9. High-level operation of the test controller.

The test controller allows a parameter to be set in two different ways: inter-mission, where the parameter is changed between missions but kept the same throughout individual missions, or intra-mission, where the parameter is altered during a mission (see Fig. 11). For the intra-mission case, the batch runner uses a random number generator (RNG) to create a different random seed for each mission. This seed initializes the RNG that sets parameter values during the mission.
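The two-level seeding scheme can be sketched as follows. This is only an illustration of the idea of a batch-level RNG that seeds per-mission RNGs; the function names, the 32-bit seed width and the use of uniform draws are assumptions, not details taken from the test controller itself.

```python
import random


def mission_seeds(batch_seed, mission_count):
    """Batch-level RNG: derive one seed per mission from a single seed."""
    batch_rng = random.Random(batch_seed)
    return [batch_rng.randrange(2**32) for _ in range(mission_count)]


def inter_mission_values(batch_seed, mission_count, low, high):
    """Inter-mission control: one fixed value for each whole mission."""
    batch_rng = random.Random(batch_seed)
    return [batch_rng.uniform(low, high) for _ in range(mission_count)]


def intra_mission_values(mission_seed, step_count, low, high):
    """Intra-mission control: a fresh value at every step of one mission."""
    mission_rng = random.Random(mission_seed)
    return [mission_rng.uniform(low, high) for _ in range(step_count)]


seeds = mission_seeds(batch_seed=42, mission_count=3)
for seed in seeds:
    # Each mission replays exactly the same sequence if given the same seed.
    print(seed, intra_mission_values(seed, step_count=4, low=-0.6, high=0.6))
```

Because every mission's sequence is a function of its seed, and every seed is a function of the batch seed, fixing the batch seed makes an entire batch repeatable, which is convenient when a failing mission has to be rerun for debugging.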

Fig. 10 shows a partial example of a batch-runner configuration file, in which the <BatchSampler> element specifies an RNG that generates a seed for each mission, and the <Sampler> element is a template for the RNG used to actually generate parameter values. The seed for the batch-level RNG can either be specified in the configuration file (to create a completely deterministic system, which may be useful in debugging), or generated automatically. Note that each parameter is configured independently. A second configuration choice is what kind of random distribution to sample the parameter from: either a normal or a uniform distribution can be used.

As a typical run of the test controller executes several hundred missions, manual analysis of the output is infeasible. We have implemented a simple and reliable oracle for the presence of a bug: whether or not any missions terminate prematurely. This is a good indicator because when an execution-time failure occurs, T-REX attempts first to replan, and then to remove or rearrange mission goals. If this too is infeasible, it usually indicates an incorrect or inadequate domain model, or over- or under-constraining of this model. In such circumstances the system gives up and aborts the mission, and log files then provide a reliable oracle for success or failure. If the mission completes on time and T-REX exits cleanly, it is very rare for the behavior generated to be incorrect (but we have regression tests independent of the random test controller that we use to check for such errors, over a limited set of fixed conditions).
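The oracle described here reduces to a simple predicate over per-mission outcomes. The sketch below assumes those outcomes have already been extracted from the log files into a small record; the `MissionResult` type and its field names are hypothetical, since the log format is not described in the text.

```python
from dataclasses import dataclass


@dataclass
class MissionResult:
    """Hypothetical summary of one simulated mission, parsed from its logs."""
    mission_id: int
    completed_on_time: bool  # mission ran to its planned end
    exited_cleanly: bool     # the executive shut down without aborting


def is_failure(result):
    """Oracle: a mission that ends prematurely or aborts counts as a failure."""
    return not (result.completed_on_time and result.exited_cleanly)


def failure_percentage(results):
    """Percentage of missions in a batch that the oracle flags as failed."""
    if not results:
        return 0.0
    failed = sum(1 for result in results if is_failure(result))
    return 100.0 * failed / len(results)


batch = [
    MissionResult(0, True, True),
    MissionResult(1, True, True),
    MissionResult(2, False, False),  # aborted part-way through
    MissionResult(3, True, True),
]
print(failure_percentage(batch))
```

A binary oracle of this kind cannot tell whether a completed mission behaved correctly, which is why the text pairs it with separate regression tests over fixed conditions.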


Fig. 10. Section from a batch configuration XML file, showing the complete configuration for one variable (errorRateX).

C. Testing Process

The new test controller substantially augments the test procedures we had initiated when designing T-REX. We now use the following methodological steps for the overall test process.

1) The P-Sim is used by the developer to interrogate the domain model's correctness as a first cut. T-REX is deterministic: given identical goals, initial states and execution time [illegible] from the P-Sim, it will produce the same plan and therefore execution trace every time.

2) If model testing clears the initial hurdle, the test analyst manually checks the system logs and output files that encode the plan that was generated. While laborious, the validation provides an authenticated trajectory in execution space. Prior to running our test controller, we compare an execution trace against this "gold standard". A difference could potentially flag the presence of a bug or indicate a model change necessitating a re-validation.

3) Randomized testing using our test controller is then initiated to explore the parameter space using the key parameters. Typically we run these tests unsupervised overnight.

4) In addition to the overnight tests, we also run hand-crafted Q-Sim tests as a validation of how the vehicle will respond to [illegible] commands. Unlike the randomized tests, this effort authenticates the model with a specific plan-execution trajectory.

5) Prior to a sea trial, we increase the fidelity of testing by running on the vehicle with the V-Sim as previously noted.

IV. VALIDATION AND RESULTS

A. Bug Injection

The fundamental question we posed for testing our planning and execution system was [illegible]. To answer this, we looked at two issues in our test controller in particular:

1) Does a particular [illegible] distribution for the key parameters come close to estimating the stochastic nature of the ocean environment and therefore directly affect testing?

2) Does using inter or intra-mission parameter control work best for exposing bugs?

Fig. 11. Inter-mission parameter control (left) vs. intra-mission parameter control (right). For inter-mission control, for each parameter (e.g. sub-sea current) a fixed value is provided. With intra-mission control, values are generated from a random sequence, and each mission uses a different seed for the sequence.

To address these, we injected bugs into the T-REX code and ran the test controller under various configurations to see if the bug was detected. For the first experimental question, choosing between a uniform distribution (with minimum α and maximum β) and a normal distribution, we set the parameters μ and σ of the normal distribution to match those calculated for the uniform distribution:

$$\mu = \frac{\alpha + \beta}{2}, \qquad \sigma^2 = \frac{(\beta - \alpha)^2}{12}. \tag{1}$$
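As a worked illustration of Eq. (1), the sketch below computes the matched normal parameters for a given range and compares how much probability each distribution places near, and beyond, the upper end of that range. The function names and the choice of the top 10% of the range as the "edge" are assumptions made for this example; they are not taken from the test controller.

```python
import math


def matched_normal(alpha, beta):
    """Return (mu, sigma) of the normal that matches Uniform(alpha, beta)
    in mean and variance, as in Eq. (1)."""
    mu = (alpha + beta) / 2.0
    sigma = (beta - alpha) / math.sqrt(12.0)
    return mu, sigma


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of Normal(mu, sigma)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def upper_edge_mass(alpha, beta, fraction=0.1):
    """Probability mass in the top `fraction` of [alpha, beta] for the uniform
    and the matched normal, plus the normal's mass beyond beta."""
    mu, sigma = matched_normal(alpha, beta)
    edge = beta - fraction * (beta - alpha)
    uniform_in_edge = fraction
    normal_in_edge = normal_cdf(beta, mu, sigma) - normal_cdf(edge, mu, sigma)
    normal_beyond = 1.0 - normal_cdf(beta, mu, sigma)
    return uniform_in_edge, normal_in_edge, normal_beyond


if __name__ == "__main__":
    print(matched_normal(0.0, 1.0))
    print(upper_edge_mass(0.0, 1.0))
```

Because β is always √3 standard deviations above μ under this matching, the matched normal places roughly 4% of its mass beyond each end of the range regardless of α and β, while placing less mass than the uniform just inside the ends.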

All tests were run using the full mission scenario that T-REX used in a recent sea trial. A view of the vehicle's path for this mission using the P-Sim is shown in the left plot of Fig. 8a.

It was decided to vary every available parameter for each experimental run of the test controller. This meant we did not address the issue of which parameters were best able to reveal bugs in the system; however, this was felt to be very dependent on the nature of the bugs we were testing, so any results from the bugs we examined would not generalize to other potential bugs. Further, we did not address the issue of interactions between parameters (for example, errorRateX and dxNoise are not independent) since our primary aim was to find conditions that cause missions to fail. In addition, such parameter interactions do occur in the real world and therefore our tests are representative.

The bug injection tests were performed with five known bugs that had previously been fixed. Each run was for a total of 200 missions, and the comparison between normal and uniform distributions was made using inter-mission parameter control. The percentage of missions that failed from each run is shown in Fig. 12.

The results indicated that sampling from a normal distribution was more likely to expose some bugs, but other bugs were seen more often when sampling from a uniform distribution. Fig. 12a shows the percentage of failures for the five bugs


Fig. 12. Results of the experiments comparing different sampling strategies. The y-axis represents the percentage of missions that failed due to the injected bug.

examined. Some of the bugs were triggered by a value for a single parameter near the edge of its range, and for these a normal distribution was better at detecting them, as the tail of a normal distribution extends past the range of the equivalent uniform distribution. Our intuition for the bugs that produced more failures using a uniform distribution is that they manifest when two or more parameters in conjunction have values near the edge of their ranges. This is statistically more likely to be the case when these two variables are sampled from uniform distributions, rather than from normal distributions where both variables are likely to be near the mean.

Fig. 12b shows that for most of the bugs, inter-mission sampling produced a similar or larger number of failures than intra-mission sampling. This is because for most parameters, when varied continuously around a mean the longer-term effects cancel out. For example in Fig. 8c, the path is almost unaffected when dxNoise and dxNoise are sampled intra-mission. For the one bug where intra-mission sampling produced far more failures, our intuition is that the bug affects tokens which have short durations and are generated frequently, so frequent parameter re-sampling provides more opportunities for the bug to manifest.
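The two parameter-control modes, and the cancelling-out effect described for intra-mission sampling, can be illustrated with a small sketch. Everything here is a toy assumption made for the example: the parameter is treated as a velocity disturbance that is symmetric around zero, the vehicle model is a simple running sum, and the function names are hypothetical rather than part of T-REX or the test controller.

```python
import random


def inter_mission_values(n_steps, low, high, rng):
    """Inter-mission control: one value is drawn per mission and held fixed."""
    value = rng.uniform(low, high)
    return [value] * n_steps


def intra_mission_values(n_steps, low, high, rng):
    """Intra-mission control: a fresh value is drawn at every time step."""
    return [rng.uniform(low, high) for _ in range(n_steps)]


def net_drift(values, dt=1.0):
    """Toy model: accumulated displacement when each value is a velocity
    disturbance applied for dt seconds."""
    return sum(v * dt for v in values)


def mean_abs_drift(make_values, n_missions=200, n_steps=1000,
                   low=-0.5, high=0.5, base_seed=0):
    """Average absolute drift over a campaign of simulated missions."""
    total = 0.0
    for mission in range(n_missions):
        rng = random.Random(base_seed + mission)  # a different seed per mission
        total += abs(net_drift(make_values(n_steps, low, high, rng)))
    return total / n_missions


if __name__ == "__main__":
    print("inter-mission:", mean_abs_drift(inter_mission_values))
    print("intra-mission:", mean_abs_drift(intra_mission_values))
```

In this toy model a disturbance that is held fixed pushes the vehicle consistently in one direction, whereas one that is re-sampled at every step largely averages out, so the inter-mission drift is far larger. This is consistent with the explanation given above, but it says nothing about bugs tied to short-lived tokens, where frequent re-sampling is the advantage.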

B. Field Results

Fig. 13. Visualization of a mission over the Monterey Canyon in November 2008. Red indicates high probability of INL presence as detected by on-board sensors. S1-S5 indicate triggering of 10 water samplers two at a time.

The first extended run of the test controller detected three previously unknown T-REX bugs. These were:

1) A temporal constraint violation when scheduling data messages which are sent to the shore when the AUV surfaces. These messages are sent regularly every 300 seconds, and additionally in response to an external event (with an average frequency of less than one every 300 seconds). The bug manifested when two messages were scheduled within one second of each other.

2) A precision bug in the C++ code implementing a custom relation for use in NDDL. The bug was caused by problems converting floating-point numbers to integers for use.

3) A goal inconsistency in the NDDL model, which arose when a waypoint command was issued close to the time a check-in was due.
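To give a feel for how rarely the timing coincidence behind the first bug arises in a single mission, the sketch below estimates the chance that an external event falls within one second of a periodic 300-second message. The 300-second period and one-second window come from the description above; the Poisson arrival model, the six-hour mission length and the event rate of one per 600 seconds are assumptions chosen purely for illustration.

```python
import math
import random


def collision_probability(mission_s, period_s, event_rate_hz, window_s):
    """Probability that at least one external event lands within window_s of a
    periodic message, assuming Poisson event arrivals and a mission length that
    is a whole number of periods."""
    p_near_tick = min(1.0, 2.0 * window_s / period_s)
    return 1.0 - math.exp(-event_rate_hz * mission_s * p_near_tick)


def simulate_mission(mission_s, period_s, event_rate_hz, window_s, rng):
    """Return True if any simulated external event is within window_s of a
    periodic message time (0, period_s, 2 * period_s, ...)."""
    t = 0.0
    while True:
        t += rng.expovariate(event_rate_hz)  # gap to the next external event
        if t >= mission_s:
            return False
        offset = t % period_s
        if min(offset, period_s - offset) <= window_s:
            return True


if __name__ == "__main__":
    args = (6 * 3600.0, 300.0, 1.0 / 600.0, 1.0)  # assumed mission settings
    rng = random.Random(1)
    n = 5000
    hits = sum(simulate_mission(*args, rng) for _ in range(n))
    print("analytic :", collision_probability(*args))
    print("simulated:", hits / n)
```

Under these assumed settings only a minority of missions would contain such a coincidence at all, which is one way to see why a campaign of many randomized missions can surface a fault that a single simulated run is likely to miss.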

These bugs demonstrated that our test controller was able to find very specific temporal inconsistencies where one event occurs at almost exactly the same time as another. They also showed that the controller is capable of finding bugs in the C++ parts of the system; however, T-REX and EUROPA2 have been stable, and the above bug from the first run is the only such C++ bug that has been found to date. All subsequent bugs identified by the test controller have been traced to the domain model.

Even though the test controller relies on the P-Sim's low-fidelity vehicle dynamics model, this is sufficient to uncover most model-related bugs. Further, it is complementary to the existing test methods; it tests against a variety of inputs, which the P-Sim and Q-Sim do not do individually. For example, since we started using the test controller, whenever we have introduced significant functionality changes in T-REX, we have consistently seen the test controller find bugs that single runs of the P-Sim have not detected. However, there are many aspects of the final deployed system which are not tested by our test controller (such as components interfacing to the main vehicle computer that controls the AUV, and the real-time/synchronization issues such interfacing induces). This is where the Q-Sim and V-Sim provide utility.


Fig. 14. A flattened visualization of a front tracking mission. This mixes mapping and search phases driven by science intent from shore to re-target a new profile with specific operational parameters. 14b shows the location and bathymetry in the Monterey Bay. 14c shows the compact thermocline data sampled every 5m in the water column.

Finally, two of the three sea trials after random testing was introduced have been successful, whereas four of the previous five missions had to be terminated early due to unanticipated modeling errors.

Fig. 13 shows an example of such a successful at-sea mission, a volume survey of an INL conducted in 2008. It shows the AUV's transect within the context of the INL in the water-column, which was used to understand coastal larval ecology [22]. Fig. 14 shows results from a 2009 front-following mission, to detect and sample fast-moving temperature gradients in the water-column. All missions have been conducted in the Monterey Bay in Northern California.

V. CONCLUSIONS AND FUTURE WORK

Our randomized testing approach has had a substantial impact in making T-REX robust. We use a Monte-Carlo controller to sample, from an a priori distribution, a set of parameters known to impact planning and execution. By perturbing the autonomous system with randomized values of the key parameters, our test controller is able to expose fragile constructs in the domain model, the primary cause of errors seen to date. And by subjecting the system to such randomized testing, we suitably replicate uncertainty in our real-world domain. Finding errors at sea is tantamount to a loss of a day's worth of ship time (approximately $11,000/day) and countless hours of preparation; finding and fixing bugs at sea in dynamic coastal conditions is often nonviable.

Our Monte-Carlo test methodology is not domain specific. It is novel in approaching testing in a methodologically systematic manner where planning and execution results impact overall system performance. To the best of our knowledge this is the first such system to target a constraint-based planning and execution system.

Overall, the random testing approach works well for heterogeneous, real-world systems such as T-REX. The combination of NDDL and C++ code makes effective end-to-end white-box testing very hard: previous testing has instead tended to focus on running a handful of simulated missions. Results to date therefore show extensive random testing seems well suited to finding bugs in temporal planning software.

Our future effort involves analysis of execution traces which would point to domain model errors. However, this is non-trivial. One reason is that the planner enforces a causal chain of inter-connected tokens across and within timeline boundaries; often an activity in the temporally distant past has an impact on the current activity being executed. Establishing such causality (for causal explanations for mixed-initiative systems in addition to testing) is an ongoing research effort within the constraint-based planning community [23]. Another related idea is for the system to automatically re-run failures, using various subsets of the sampled parameters, so as to identify parameter(s) and root causes for the failures. Finally, it could be useful to implement an adaptive sampler, so that regions of parameter space that have produced failures previously in a run could be explored more thoroughly. A similar enhancement would be to implement a form of adaptive random testing [24], [25], where new inputs are sampled from regions of the input space furthest away from those already tested.
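One common way to realize the "furthest away from those already tested" idea is to draw a handful of random candidates and keep the one whose nearest previously tested input is most distant. The sketch below shows that candidate-set variant under stated assumptions: the parameter bounds, the candidate count and the function names are illustrative only, and this is not a description of anything implemented in the test controller.

```python
import math
import random


def farthest_candidate(candidates, tested):
    """Return the candidate whose nearest already-tested input is farthest away."""
    return max(candidates, key=lambda c: min(math.dist(c, t) for t in tested))


def next_art_input(tested, bounds, rng, n_candidates=10):
    """Draw n_candidates uniformly within bounds and pick the one farthest
    from all previously tested inputs (a random one if nothing was tested)."""
    candidates = [
        tuple(rng.uniform(lo, hi) for lo, hi in bounds)
        for _ in range(n_candidates)
    ]
    if not tested:
        return candidates[0]
    return farthest_candidate(candidates, tested)


if __name__ == "__main__":
    rng = random.Random(0)
    bounds = [(0.0, 1.0), (-2.0, 2.0)]  # assumed ranges for two parameters
    tested = []
    for _ in range(20):
        tested.append(next_art_input(tested, bounds, rng))
    print(tested[:3])
```

Since Euclidean distance is used directly, parameters with very different ranges would normally be rescaled to a common range first so that no single parameter dominates the spacing.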

ACKNOWLEDGMENTS

This research was supported by the David and Lucile Packard Foundation. We thank our collaborators at MBARI: Tom O'Reilly, Hans Thomas, John Ryan, Thom Maughan, Brent Roman, Rob McEwen, Rich Henthorn, Chris Scholin, Bob Vrijenhoek, Larry Bird and Alana Sherman. We thank NASA Ames Research Center for making the EUROPA Planner available, Willow Garage for supporting McGann's collaboration, Juhan Ernits at Birmingham for valuable comments, and George Matsumoto of MBARI who coordinated the summer internship program under which this work was initiated.

REFERENCES

[1] J. Bellingham and J. Leonard, "Task Configuration with Layered Control," in IARP 2nd Workshop on Mobile Robots for Subsea Environments, May 1994.

[2] R. Brooks, "A Robust Layered Control System for a Mobile Robot," IEEE Journal of Robotics and Automation, vol. RA-2, pp. 14-23, 1986.

[3] C. McGann, F. Py, K. Rajan, H. Thomas, R. Henthorn, and R. McEwen, "A Deliberative Architecture for AUV Control," in Proc. ICRA, 2008.

[4] C. McGann, F. Py, K. Rajan, J. Ryan, and R. Henthorn, "Adaptive Control for Autonomous Underwater Vehicles," in Proc. AAAI, Chicago, 2008.

[5] B. Smith, M. S. Feather, and N. Muscettola, "Challenges and Methods in Testing the Remote Agent Planner," in Proc. AIPS, Toulouse, France, 2000, pp. 254-263.

[6] W. Visser, K. Havelund, G. Brat, and S. Park, "Model Checking Programs," Automated Software Engineering, Jan. 2003.


[7] T. A. Henzinger and J. Sifakis, "The Discipline of Embedded Systems Design," Computer, vol. 40, no. 10, pp. 32-40, 2007.

[8] R. Hamlet, "Random Testing," in Encyclopedia of Software Engineering. Wiley, 1994, pp. 970-978.

[9] J. Duran and S. Ntafos, "An Evaluation of Random Testing," IEEE Transactions on Software Engineering, vol. SE-10, no. 4, pp. 438-444, July 1984.

[10] T. Y. Chen, T. H. Tse, and Y. T. Yu, "Proportional Sampling Strategy: A Compendium and Some Insights," Journal of Systems and Software, vol. 58, no. 1, pp. 65-81, Aug. 2001.

[11] A. Groce, G. Holzmann, and R. Joshi, "Randomized Differential Testing as a Prelude to Formal Verification," in Proc. ICSE, Minneapolis, 2007.

[12] A. Groce and R. Joshi, "Random Testing and Model Checking: Building a Common Framework for Nondeterministic Exploration," in Workshop on Dynamic Analysis (WODA), Seattle, Washington, 2008, pp. 22-28.

[13] N. Muscettola, P. Nayak, B. Pell, and B. Williams, "Remote Agent: To Boldly Go Where No AI System Has Gone Before," Artificial Intelligence, vol. 103, 1998.

[14] A. Jonsson, P. Morris, N. Muscettola, K. Rajan, and B. Smith, "Planning in Interplanetary Space: Theory and Practice," in Proc. AIPS, Breckenridge, Colorado, 2000, pp. 177-186.

[15] M. Ai-Chang, J. Bresina, L. Charest, A. Chase, J. Hsu, A. Jonsson, B. Kanefsky, P. Morris, K. Rajan, J. Yglesias, B. Chafin, W. Dias, and P. Maldague, "MAPGEN: Mixed-Initiative Planning and Scheduling for the Mars Exploration Rover Mission," IEEE Intelligent Systems, vol. 19, no. 1, pp. 8-12, 2004.

[16] J. Bresina, A. Jonsson, P. Morris, and K. Rajan, "Activity Planning for the Mars Exploration Rovers," in Proc. ICAPS, Monterey, California, 2005.

[17] J. Frank and A. Jonsson, "Constraint-based Attribute and Interval Planning," Constraints, vol. 8, no. 4, pp. 339-364, Oct. 2003.

[18] R. Dechter, I. Meiri, and J. Pearl, "Temporal Constraint Networks," Artificial Intelligence, vol. 49, no. 1-3, pp. 61-95, May 1991.

[19] J. Allen, "Towards a General Theory of Action and Time," Artificial Intelligence, vol. 23, no. 2, pp. 123-154, 1984.

[20] M. Gertler and G. Hagen, "Standard Equations of Motion for Submarine Simulation," Naval Ship Research and Development Center Report 2510, June 1967.

[21] M. Fox, D. Long, F. Py, K. Rajan, and J. P. Ryan, "In Situ Analysis for Intelligent Control," in Proc. of IEEE/OES OCEANS Conference. IEEE, 2007.

[22] S. B. Johnson, A. Sherman, R. Marin, J. Ryan, and R. C. Vrijenhoek, "Detection of Marine Larvae using the AUV Gulper and Bench-top SHA," in 8th Larval Biology Symposium, Lisbon, Portugal, July 2008.

[23] J. Bresina and P. Morris, "Mission Operations Planning: Beyond MAPGEN," Space Mission Challenges for Information Technology (SMC-IT), Jan. 2006.

[24] T. Chen, H. Leung, and I. Mak, "Adaptive Random Testing," in Proceedings of the 9th Asian Computing Science Conference, ASIAN 2004. Springer Berlin / Heidelberg, 2005, pp. 320-329.

[25] T. Chen, R. Merkel, P. Wong, and G. Eddy, "Adaptive Random Testing through Dynamic Partitioning," in Proceedings of the Fourth International Conference on Quality Software, 2004 (QSIC '04), pp. 79-86, Sept. 2004.