GPU based generation of state transition models using simulations for unmanned surface vehicle...

Robotics and Autonomous Systems 60 (2012) 1457–1471

Contents lists available at SciVerse ScienceDirect

Robotics and Autonomous Systems

journal homepage: www.elsevier.com/locate/robot

GPU based generation of state transition models using simulations for unmannedsurface vehicle trajectory planningAtul Thakur a, Petr Svec a, Satyandra K. Gupta b,∗

a Department of Mechanical Engineering, University of Maryland, College Park, MD 20742, USAb Department of Mechanical Engineering and Institute for Systems Research, University of Maryland, College Park, MD 20742, USA

a r t i c l e i n f o

Article history:Received 7 October 2011Accepted 17 July 2012Available online 24 July 2012

Keywords:USVVehicle simulationFluid–rigid interactionGP-GPUTrajectory planningMDPState transition mapStochastic dynamic programmingMotion uncertainty

a b s t r a c t

This paper describes GPU based algorithms to compute state transition models for unmanned surfacevehicles (USVs) using 6 degree of freedom (DOF) dynamics simulations of vehicle–wave interaction. Astate transition model is a key component of the Markov Decision Process (MDP), which is a naturalframework to formulate the problem of trajectory planning undermotion uncertainty. The USV trajectoryplanning problem is characterized by the presence of large and somewhat stochastic forces due to oceanwaves, which can cause significant deviations in their motion. Feedback controllers are often employedto reject disturbances and get back on the desired trajectory. However, the motion uncertainty can besignificant and must be considered in the trajectory planning to avoid collisions with the surroundingobstacles. In case of USV missions, state transition probabilities need to be generated on-board, tocompute trajectory plans that can handle dynamically changing USV parameters and environment (e.g.,changing boat inertia tensor due to fuel consumption, variations in damping due to changes in waterdensity, variations in sea-state, etc.). The 6 DOF dynamics simulations reported in this paper are basedon potential flow theory. We also present a model simplification algorithm based on temporal coherenceand its GPU implementation to accelerate simulation computation performance. Using the techniquesdiscussed in this paper we were able to compute state transition probabilities in less than 10 min.Computed transition probabilities are subsequently used in a stochastic dynamic programming basedapproach to solve the MDP to obtain trajectory plan. Using this approach, we are able to generatedynamically feasible trajectories for USVs that exhibit safe behaviors in high sea-states in the vicinityof static obstacles.

© 2012 Elsevier B.V. All rights reserved.

1. Introduction

USVs operate in ocean environments with disturbances causedby waves, currents, the wake of other ships, etc. The disturbancesimpart significant uncertainty in the vehicle’s motion. Due to thepresence of high motion uncertainty, the action of a USV maynot lead to the exact desired motion despite using a feedbackcontroller. Vehicle dynamics and motion uncertainty togethermake the task of trajectory planning very challenging, particularlyin highly cluttered environments. Consider Fig. 1, which showsthe influence of vehicle dynamics and motion uncertainty on theplanned trajectories in a USV mission. The circles M , P , and Qdenote three consecutive waypoints of a mission. When the USVreaches P , it needs to find the optimum trajectory (with minimumlength and risk of collision) between P and Q . A way to solve this

∗ Corresponding author. Tel.: +1 301 405 5306.E-mail addresses: [email protected] (A. Thakur), [email protected] (P. Svec),

[email protected] (S.K. Gupta).

0921-8890/$ – see front matter© 2012 Elsevier B.V. All rights reserved.doi:10.1016/j.robot.2012.07.009

problemwould be to find the optimal trajectory without explicitlyconsidering the ocean disturbances (shown as the trajectory A).Disturbances may be insignificant in the case when the sea-stateis calm (e.g., sea-states 1 to 3) or the USV is heavy enough notto get deviated. If sea-state is very rough (e.g., sea-state 4 andhigher) or the USV is lighter, then the ocean disturbances maynot allow the vehicle to closely follow the intended trajectory (A)as the disturbances may lead to collisions with an obstacle in anarrow region. A trajectory,which is saferwith respect to the oceandisturbances but longer in length is shown in Fig. 1 as trajectory B.It is thus evident that the physics of interaction between the USVand ocean waves greatly influences the choice of trajectory plan.The variations take place in dynamical parameters of USV–oceaninteraction such as inertia tensor of USV due to fuel loss, dampingparameters due to change in water density, etc., during a mission.In addition to that the sea-state changes during the execution ofmission due to change in weather. Based upon dynamics basedinteraction between USV and ocean waves and changing sea-state,a suitable trajectory, which is safe and still not overly conservativeamong A and B, needs to be planned.

http://dx.doi.org/10.1016/j.robot.2012.07.009

http://www.elsevier.com/locate/robot

http://www.elsevier.com/locate/robot

mailto:[email protected]



http://dx.doi.org/10.1016/j.robot.2012.07.009

1458 A. Thakur et al. / Robotics and Autonomous Systems 60 (2012) 1457–1471

Fig. 1. Trajectory plans for different oceanwave disturbance conditions. TrajectoryA is shorter but riskier and may lead to collisions in the event of high sea-statewhereas B is longer but more conservative to minimize the risk of collision. Ifsea-state is calm (e.g., sea-states 1–3) or the USV is heavy enough so that thedisturbances are insignificant then trajectory A should be chosen. If sea-state isrough (e.g., sea-state 4 and higher) or the USV is light then trajectory B should bechosen.

The physics-aware trajectory planning problem in highlyuncertain ocean environments outlined above can be solved bycombining the MDP [1–3] framework and a dynamically feasiblemotion primitive based state space representation [4,5]. MDPis a natural framework to formulate the problem of trajectoryplanning under motion uncertainties [1,3]. The use of motionprimitives in an MDP based framework, thus, allows generatingtrajectories that explicitly consider the constraints imposed bythe vehicle dynamics. This is unlike planning on a rectangulargrid that might yield a dynamically infeasible trajectory. In thispaper we incorporate vehicle dynamics and motion uncertaintyinto motion-primitives using Monte Carlo runs of high-fidelity6 DOF dynamics simulation of interaction between a USV andocean waves. We represent uncertainty in motion primitives byusing a state transition function. This function maps each possiblediscretized state of the vehicle and a given action to a list of possibleresulting states with respective transition probabilities.

The state transition function can be obtained by either runningfield experiments or by using computer simulations. Performingfield experiments is the most accurate method, but is expensiveand may be infeasible when needed to be performed duringa mission. Moreover, the number of experiments may be verylarge when multiple sea-states and vehicle dynamics parametersneed to be taken into account, which is the case during longmissions. Computer simulations are inexpensive and can beimproved, by incorporating experimental data. The complexityof a mission and an environment require generating the statetransition map on-line, based on the information gathered bythe sensors during vehicle operation. It may be infeasible torun all possible simulations off-line (i.e. before the mission) togenerate the state transition map. This is because, the sea-stateis decided by several factors, namely, the amplitudes, frequencies,wave directions, and the wave numbers of the wave componentsforming the ocean wave [6,7]. Each of these factors is continuousand has a non-linear influence on the ocean wave. In additionto this, the ocean interacts with USVs in a non-linear fashion.The initial conditions for the simulations can thus, becomecombinatorially prohibiting for an off-line estimation of a statetransition map. A potential flow theory based high fidelity 6DOF dynamics simulator can generate a fairly accurate statetransition map and aid in generating both safe and low costtrajectories [8–10]. The major problem with using such a highfidelity simulator is the slow computational performance of thesimulation. This is mainly because of the fluid to rigid bodyinteraction computation. One way to accelerate the computationsis by using parallelization. Performing parallel computationswould require computing clusters to be placed on the base and

communicating with the vehicle over a network, which might begenerally unreliable due to possible communication disturbances.The alternative of placing clusters directly on-board might add tothe weight of the payload, which may not be desired or generallynot possible. Another way to perform on-board computing is byusing GPUs. Recently, expensive scientific computations in variousrobotics problems are performed using GPUs, which are verypowerful and lightweight as well [11–14]. In addition to using GPUcomputing, we also employ model simplification techniques [9] tofurther improve computational performance of the simulation. Afaster computation of state transition maps may be required forsome parts of the mission, where computing speed-up availableusing GPU alone may not be enough. The accelerated dynamicssimulation is then used to establish connectivity among thevehicle’s discretized states to develop a state transition maprepresented using a connectivity graph. The trajectory planningproblem is then solved using value iteration of the stochasticdynamic programming (SDP) [1].

The key contributions of this paper are: (1) incorporation of6 DOF dynamics simulation in state transition map estimation,(2) use of GPU based parallelization schemes to make the sim-ulation faster for on-line computation of state transition map,(3) incorporation of temporal coherence based model simplifica-tion techniques with GPU acceleration for computing state tran-sition maps even faster, and (4) solution of trajectory planningproblem using developed state transition map data structure, sothat the computed trajectory plan satisfies the dynamics con-straints as well as handles motion uncertainties.

2. Literature review

In this paper we focus on the following issues related tothe trajectory planning problem of the USVs under motionuncertainty, namely, (i) simulation of the USVs, and (ii) physics-aware trajectory planning algorithm. We assume that the state ofthe vehicle can be estimated perfectly at all times. We discuss therelated work in both the areas in this section.

2.1. Fluid–rigid body interaction simulation

Fluid–rigid body interaction simulations are computationallyexpensive because of the coupling between the fluid flow andthe rigid body motion namely (1) influence of rigid body motionon the flow of fluid in which it moves, and (2) the influence ofthe fluid motion on the rigid body motion. A simulation approachin which both the couplings are considered explicitly in eachtime step are called a two-way coupling solution; whereas, ifone of the couplings is replaced with some faster model then itis called a one-way coupling solution. In this section we shallreview some common representative techniques for fluid–rigidbody interaction simulation. Two-way coupling based approachesnot necessarily developed for vehicle and ocean interactionsimulation are: Euler’s momentum equation, Navier–Stokes’ law,the Smoothed Particle Hydrodynamics (SPH) technique, and theLattice Boltzmann Method (LBM). In Euler’s momentum equationbased technique, the momentum equation is solved for fluidsnumerically. Batty et al. reported a computation time of 25 s perframe using a grid size of 60 × 90 [15]. Carlson et al. reported anumerical solution of Navier–Stokes equations with computationtime of 27.5 s per frame for a domain of size 64 × 64 [16].In the SPH technique, the fluid is assumed as a collection ofparticles and the motion of fluid particles and their effects ona floating rigid body is modeled based on a kernel functionweighted by the distance of the particle from the floating rigidbody. Becker et al. reported a computation time of 3.47 s persimulation step for simulating fluid with 850,000 particles [17]. In

A. Thakur et al. / Robotics and Autonomous Systems 60 (2012) 1457–1471 1459

LBM, fluid flow is represented as motion of fluid particles, whereeach particle follows a velocity distribution function and movesin discrete time steps and can collide with other particles (whichbehave in the same way). The collision rules are such that thestatistical particle motion (or fluid flow) obtained is consistentwith the continuity conditions. Garcia et al. developed LBM basedfluid–structure interaction approach [18]. Geist et al. developeda real-time approach for wave surface generation and attained25 Hz for grid of size 10242 [19]. Geveler et al. developed LBMbased approach for simulating laminar flow with free fluid surfaceon multi-core CPU and many-core GPU processors and reported afactor of 8 speed-up on GPU code with respect to multi-threadedCPU code [20]. Gladkov et al. reported direct simulation MonteCarlo (DSMC) for solving Boltzmann equations on per particle basis(to solve rarefied fluid flow problems) by using GPUs to obtainspeed-up by a factor of 65 over serial implementation.

A survey on boat simulation was reported by Beck andReed [21]. Craighead et al. reported another recent survey onopen source boat simulators [22]. Some of the key USV simulationtechniques are RANS based techniques, strip theory basedtechniques, kinematic model based techniques, and potential flowtheory based techniques. In recent years, RANS based techniquesfor fluid flow around boats for simulating boat motion arebecoming popular because of their accuracy in the problemsinvolving boundary layer effects, turbulence, wake, etc., [23].Kim [24] reported computation time of 24 h using 84 processorson Mauis IBM-SP3 computer for 360 simulation time steps. Someof the other implementations of RANS code can be found in[25–27]. There are many research papers reported in the area ofthe underactuated controller design for the USVs that utilize 3 DOFsimplified models which neglect the rolling, pitching, and heavingmotions [28–33]. Strip theory is mainly used for slowly movingslender geometries [34–36]. In potential flow theory, fluid flow isassumed to be irrotational and inviscid [37]. Potential flow theorybased techniques are used by several researchers to perform themotion simulation of USVs [8–10]. Thakur and Gupta reportedreal time computational performance of USV simulation usingclustering, temporal coherence, and multi-core parallelizationbased model simplification techniques in potential flow theorybased simulation framework [9].

In nutshell:

• Euler’s equation, Navier–Stokes, andRANS equation based tech-niques yield highly accurate results but one of their limitationsis the dependence of the computation time on the domain sizeand the slow speed of computation.

• The SPH technique results in good quality animations but theproblem with the approach is the requirement of a large num-ber of particles to simulate the fluid which in turn increases thecomputational time.

• The 3 DOF simulations are computationally very fast since theyneglect the effect of fluid flow on the roll, pitch, and yaw and asa result are not accurate enough.

• Strip theory based techniques are computationally very fast butare not suitable for taking into account the variations in hull ge-ometry and wave interactions. This is because, in strip theory,the hull geometry is approximated to the nearest ideal shape(such as ellipsoids, spheres, etc.). This idealization might yieldsignificant errors in hydrodynamic and hydrostatic force esti-mations.

• The accuracy obtained by the potential flow based techniqueare not as good as RANS but are computationally faster andmuch easily amenable to the 6 DOF computations and hence,much more accurate compared to the simplified 3 DOF mod-els. We thus, use the potential flow theory based model in thispaper.

2.2. Physics-aware trajectory planning

There is a large body of literature [1] in robot motion planningand we are presenting here only a review of representativeresearch papers.We shallmainly focus on research related to robottrajectory planning considering differential constraints or in otherwords physics-aware robot trajectory planning. The literaturefor trajectory planning under differential constraints falls intothe following categories [38] namely, (1) state space samplingbased trajectory planning, (2) decoupled trajectory planningwith minimum distance path, (3) finite-state motion model ormaneuver automaton (MA), (4) mathematical programming, and(5) model predictive control (MPC). In state space samplingtechniques, the robot state space is discretized and then searchedfor the low cost collision free trajectory. Several schemes forstate space discretization have been reported. In a simple gridbased approach, the state space is discretized into regular cellsand a trajectory is searched in that space [39]. In navigationfunction based approach, a navigation function is defined overthe discretized state space and determined using algorithms suchas value iteration followed by determination of a trajectory.Interpolation is used [40] tomake theplanningdomain continuous.In rapidly exploring random trees (RRT), a biased stochastic searchis performed in configuration space to generate a search tree[41–43]. In the decoupled approach, planning is executed in twophases. In the first phase, a discrete path or set of waypoints [44]is determined using a graph search technique such as A* [45,46]on a discretized representation of the state space by consideringonly kinematic constraints of the robot. In the second phase, adynamically feasible trajectory is determined by solving a two-point boundary value problem between consecutive waypointsusing gradient based optimization approaches. Suzuki et al. usedan A* based approach for waypoint generation and an RTABUsearch based optimization approach for trajectory generation [45].Scherer et al. reported an evidence grid with a Laplacian-based potential method for path planning, an obstacle avoidancebased on reactive planning, and velocity controller for trajectorygeneration [47]. In MA, the action space is discretized into actionautomatons to reduce the search from infinite dimensional spaceto finite dimensional space. Frazzoli et al. present a rigorousdefinition of MA in [48]. Some other related works can be foundin Refs. [4,49,50]. In the mathematical programming approach, thetrajectory planning problem is posed as a numerical optimizationproblem with the robot dynamics as constraints and solved usingtechniques such as mixed integer linear programming, nonlinearprogramming, and other constrained optimization techniques[51–54]. In model predictive control, the trajectory planningproblem is posed again as an optimization problem, but optimizedover a finite horizon. This way the solution obtained is suboptimalbut takes lesser computation time than optimizing over an infinitehorizon [55–57].

Under motion uncertainty, the MDP framework is used toexpress the trajectory planning problem and solved using dynamicprogramming (DP) algorithms [58]. However, since the state spaceof a planning problem under motion uncertainty is usually verylarge, most of the practical algorithms have been developed[59–61] to compute an optimal or close-to-optimal solution to theproblem by running value iteration over a carefully chosen subsetof the state space.

In the USV trajectory planning domain a three layered architec-ture forDijkstra algorithmbased global planning andA* based localplanning is presented by Casalino et al. [62]. They used a simplekinematic model with no environmental disturbances. Benjaminet al. developed a technique for collision avoidance and navigationof marine vehicles respecting the rules of roads [63]. Soltan et al.developed a nonlinear sliding mode control based trajectory plan-ner for a 3 DOF dynamics model [64]. Xu et al. reported a receding


horizon control based trajectory replanning approach where theglobal plan is determined using a predetermined level sets fromexperimental runs [65]. Autonomous guidance based on feedbackcontrol is developed by Sandler et al. [66].

In nutshell:• Most of the trajectory planning algorithms described above as-

sumedeterministic environmental conditions or conservativelyapproximated uncertainties. A conservative approximation ofmotion uncertainty due to environmental disturbances inter-acting with vehicle dynamics might lead to sub-optimal plans.

• In order to incorporate motion uncertainty into the trajectoryplanning problem, anMDPbased framework is often used. Statetransition probability encodes the vehicle dynamics and envi-ronmental disturbance information into an MDP formulation.

• In order to do physics-aware trajectory planning, one way toincorporate the physics information into the problem formula-tion is to use Maneuver Automatons or motion primitives andemploy simulations to estimate state transition probabilities.

One of the challenges in incorporating a simulation based statetransition map is slower computational speed. Dynamics simula-tion based state transitionmap estimation is computationally slowdue to the fluid–rigid body interaction computations. This can bealleviated using model simplification techniques. In this paper, wefocus on trajectory planning based on the Maneuver Automatonframework from which we use only maneuvers, not trims, similarto the lattice based planning in Ref. [4]. We extend this frameworkto consider the motion uncertainty due to ocean waves using theMDP framework. A similar approach for a trajectory planning al-gorithm for unmanned balloons under the influence of stochasticwinds (by estimating transition probabilities) has been developedby Wolf et al. [67].

3. Problem statement and solution approach

3.1. Problem statement

Given,(i) a finite non-empty state space X ,(ii) a finite non-empty action space u(x) for each state x ∈ X ,(iii) a dynamics motion model x = f (x, u, w) of the USV, where

w is a nondeterministic noise term and fluid flow is based onpotential flow theory,

(iv) goal state xG, and(v) obstacle mapΩ such that,

Ω(x) = 1, if x lies on obstacle= 0, if x is on free space,

compute the following:(i) State transition map over X and u: The state transition map

should represent the motion uncertainty exhibited by thegiven motion model under each given action in u in the formof associated probability of transition for the correspondingstate transition probability p(xk|xi, ui), ∀xk, xi ∈ X, and ui ∈

u. Perform GPU based computing acceleration and developmodel simplification techniques for on-line estimation of statetransition maps.

(ii) Trajectory plan: Using the state transition map computed instep (i), determine a trajectory plan to generate a dynamicallyfeasible trajectory in each planning cycle to reach targetlocation xG from any given starting location of the USV.The computed trajectory plan ensures that the generatedtrajectory is updated in every planning cycle to recover fromthe pose errors introduced due to the influence of the oceanenvironment. This kind of trajectory plan is also referred to asa feedback plan in Chapter 8 of Ref. [1].We assume perfect stateinformation is available at all times.

Fig. 2. Description of coordinate systems used in the presentedmodel: Inertial andbody coordinate systems are shown.

3.2. Approach

The approach is enumerated in the following steps.

(i) Enhance the givenUSVmotionmodel to suit the requirementsfor GPU implementation. Implement the motion model on aGPU and develop simplification algorithms to enable fastersimulation.

(ii) Model the trajectory planning problem as MDP by represent-ing state–action space in a lattice data structure and computethe state transition map for the discretized action space.

(iii) Apply value iteration of stochastic dynamic programming todetermine the trajectory plan. The generated trajectory planenables the USV to find the optimal trajectory from eachdiscretized state x ∈ X .

In the following sections, we discuss the above steps in detail.

4. Dynamics simulation of USVs

In this section we present the governing equations of the im-plemented dynamics model [9]. We extend the equations to han-dle the arbitrary number of wave components and to incorporateuncertainty into the system.

4.1. Motion equations: Interaction between USVs and ocean waves

We implemented the 6 DOF dynamics model for vehicles givenby Fossen [7]. In this model, a vehicle is assumed to be a rigidbody. The coordinate system used in the model is shown in Fig. 2.The origin of the inertial frame of reference is set at the nominalwater level with the Z-axis being vertical and pointing upwards.The body coordinate system for representing the hull geometryand the velocity directions of the USV is attached to boat’s centerof gravity (CG).

The governing dynamics equation of a boat’s motion in oceanwaves is given in Eq. (1).

MH v + CH(v)v + DH(v)v + g(p) = FE + FPp = Jp(v)

(1)

where, p = [x, y, z, θx, θy, θz]T is a pose vector expressed inthe inertial frame, [x, y, z]T is the Cartesian position vector inm and θ ’s are Euler angles about subscript axes in rad, v =

[vx, vy, vz, αx, αy, αz]T is the velocity vector expressed in the body

frame relative to the inertial frame, vt = [vx, vy, vz]T is the linear

velocity in ms−1 and vr = [αx, αy, αz]T ’s is the angular velocity in

rad s−1,

R =

cycz sxsycz − cxsz cxsycz + sxszcysz sxsysz + cxcz cxsysz − sxcz−sy sxcy cxcy

is rotation matrix rotating a vector expressed in the body frame to


the inertial frame, cx means cos θx, J =

R 03×3

03×3 Jr

is a Jacobian

matrix,

Jr =

1 sxty cxty0 cx −sx0

sxcy

cxcy

,x× ≡ S(x) =

0 −x3 x2x3 0 −x1

−x2 x1 0

is a matrix dual (for cross product) of vector x = [x1, x2, x3]T ,

pG,B is a vector representing the position of CG in the body frameof reference,

m is the mass of the USV in kg,Ib is 3 × 3 matrix representing the inertia tensor of the USV in

kg m2, MRB =

mI3×3 −mS(pG,B)

mS(pG,B) Ib

is a matrix representing the

inertia tensor of the USV,MA is a (6× 6) diagonal matrix representing the added mass of

the USV,

MA,11 = 0.1 m

MA,22 = 4.75ρa2

MA,33 = 4.75ρa2

MA,44 = 4.75ρa2

MA,55 = 0.396ρa2L2x + 0.08330.1 mLX

L3z

MA,66 = 0.08330.1 mLX

L3y + 0.396ρa2L3x ,

where ρ is the density of water in kg m−3, Lx, Ly, and Lz are length,width and height of the bounding box of the hull in m respectively,

MH = MRB+MA is a (6×6)matrix representing the total inertia,

CR,B =

mS(vr)3×3 −mS(vr)3×3S(pG,B)

mS(vr)3×3S(pG,B)3×3 −S(Ibvr)

is the Coriolis and centripetal matrix,

CA =

03×3 −S(MA,11vt + MA,12vr)

−S(MA,11vt + MA,12vr) −S(MA,21vt + MA,22vr)

is the matrix representing the effect of added mass,

MA,ij represents the (i, j) sub-matrix ofMA of size 3 × 3,CH = CRB + CA is a 6 × 6 matrix representing the total effect of

the Coriolis and added mass term,DH is a 6 × 6 damping matrix,g(p) is a 6×1 vector representing the restoring force expressed

in the body frame in N ,FE is a 6 × 1 vector representing the environment force vector

expressed in the body frame in N ,FW is a 6×1 vector representing the oceanwave and the vehicle

interaction force expressed in the body frame in N , andm is the mass of the boat in kg,FP is a 6× 1 actuation force vector expressed in the body frame

in N .The fluid flow is computed based on the potential flow

theory [37]. Based on the potential flow theory, the ocean waveis represented using a spatio-temporally varying height field ηcomposed of Q wave components and given by the followingequation.

η(x, y, t) =

Qj=1

Aj cos(kjx cos θw,j + kjy sin θw,j − ωjt + ψj)

+ 0.5A2j kj cos(2kjx cos θw,j + 2kjy sin θw,j

− 2ωjt + 2ψj), (2)

where, Aj, ωj, kj, θw,j are the amplitude, frequency, wave number,and wave direction respectively for jth wave component, andψj ∈

[0, 2π) uniformly random phase lag term.The velocity potential φ is computed using the following

equation.

φ =

Qj=1

gAj

ωjexp(kjz) sin(kjx cos θw,j

+ ky sin θw,j − ωjt + ψj). (3)

The velocity potential is then used to compute force acting onUSV due to ocean wave using the following equation.

FW =

ρ

SB

∂φ

∂t+ 0.5∇φ · ∇φ

dS

ρ

SB

∂φ

∂t+ 0.5∇φ · ∇φ

(r × dS)

, (4)

where SB is instantaneous wet region of the USV. We assume thatforce acting on boat is only due to ocean waves (FE = FW ) andignore forces due to currents, wakes, etc. However, given a suitablemodel, a simulation framework is capable of taking other types offorces into account.

The term FP on the right hand side of Eq. (1) is the force dueto the actuation (i.e., the thrust and the rudder angle). There aremany actuator models available for USVs and can be plugged intoEq. (1) [7,8]. We used a model given in Eq. (5).

FP =K1 ∥ nprop ∥ nprop, 0, 0, 0, 0, K2K1 ∥ nprop ∥ npropθrud

T, (5)

where K1 is a constant and we chose it to be 1000, nprop is thepropeller’s rpm, K2 is a constant and we chose it to be 10, and θrudis the rudder angle.

We can express the parametric form [1] of the 6 DOF model asfollows:

x = f (x, u), (6)

where x =pT , vT

T is the state of the USV,

u =vf ,Θ

T is the commanded control action to go withforward velocity of vf at a heading angle ofΘ , and

f =

M−1

H (−CH(v)v − DH(v)v − g(p)+ FE + FP)T, (Jv)T

T.

We specify the desired control action using the desired forwardvelocity and the heading angle. Any other kind of desired controlactions such as the desired angular velocity can also be chosenbased on the available actuation models. For the purpose of thispaper we assume that the USVs are controllable using the controlactions specified by vf andΘ .

The thrust nprop and the rudder angle θrud can be computed usinga PID controller.

We chose a PID controller because of its widespread use andease of implementation, however, in order to execute the com-manded control actions any other controller such as the backstep-ping [68], sliding mode, etc. can be used [28].

4.2. Uncertainty in the motion model

In Section 4.1, the ocean wave (the ocean wave height andthe velocity field) was initialized using given wave amplitudes,frequencies, and directions. An ocean wave, once initializedevolves deterministically with time and can be predicted exactlyusing the solution of the Laplace equation (see Eq. (2)) [37].However, the ocean waves initialized with the same parametersmight look different due to the presence of the uniformly randomphase lags ψj between each wave component. This leads to


(a) Sample trajectories. (b) Frequency distribution of the USV’s ending positions forsample trajectories.

(c) Frequency distribution of the USV’s ending orientations forsample trajectories.

Fig. 3. Uncertainty in a USV’s motion model for action along the X axis generated using a sample size of 256 (computed for sea-state 4 with average ocean wave height of1.8 m and boat moving at the velocity of 3 ms−1).

prediction of slightly different trajectories in each simulation runof theUSVs operating under the oceanwaveswith exactly identicalocean wave parameters (initialized with uniform random phaselags) and an action. This effect is shown in Fig. 3(a), in which theUSV is acted upon by a PID controller to move along a straightline for 256 different simulation runs in an ocean wave built upof identical wave components. The variation in the trajectories oftheUSV in each simulation run is due to the uncertainty introducedby random phase lags (see Eq. (2)) in the ocean wave componentsdespite the other ocean parameters and the PID control objectivebeing exactly identical. Fig. 3(b) shows the histogram of finalpositions reached by the USVwhen commanded to reach at (30, 0)with an orientation of 00 due to the disturbances caused by theocean waves. Fig. 3(c) shows the variation in the final orientationwhile the commanded action was 00. It should be noted that thevariation in the ending pose is one of themain cause of randomnessin the motion model, which accumulates over the trajectory.

Formally, we can express the parametric form of the dynamicsmodel (with uniformly random initial phase lag parameters) asfollows:

x = f (x, u, w), (7)

wherew is the noise introduced due to the uniform random phaselags ψj.

4.3. Simulator implementation on GPU

As described in Section 4.2, and shown in Fig. 3, the USV ends upin different poses for exactly identical action objectives and initialstates for different phase lag initializations. This means that for a

sufficiently large number of simulation runs with uniform phaselag initializations for a given set of initial conditions and actiongoals, the distribution of final states will represent the influenceof the ocean on the USV’s motion. In this section, we describe aMonte Carlo simulation based approach to estimate the influenceof nondeterministic effects of the ocean on the USV’s motion for agiven set of action goals. TheMonte Carlo sampling based approachrequires running numerous dynamics simulations with uniformlyrandom initial phase lags among wave components and hence,real-time performance of the simulator may not be enough [9].We describe the enhancements made in the potential flow theorybased simulation model [8–10] for suitability of implementationon GPU.

We define state vector tuple X = (x1, x2, . . . , xM), where xk =

[vTk , xTk ]

T is the state vector of the kth simulation run and M is thenumber of Monte Carlo simulation runs.

Let U = (u1, u2, . . . , uM) be the action vector tuple witheach element as a vector specifying action for the correspondingsimulation run.We denote an action set for the entire sample usingthe tupleΥ = (U1,U2, . . . ,UP)where Uj’s are action vector tuplesand P is the number of action goals.

Also, let W = (w1, w2, . . . , wM) be the phase lag vector tuplewhere wk = [ψ1,k, ψ2,k, . . . , ψQ ,k]

T is the phase lag vector forthe kth simulation run and ψq,k is the phase lag of the qth wavecomponent of the kth simulation run.

Thus, the augmented dynamics equation can be written bygeneralizing Eq. (7) forM Monte Carlo simulation runs as follows:

X = F(X,U,W ), (8)

where F is the modified dynamics function representing simulta-neous simulation runs.


Fig. 4. Implementation of USV simulator on GPU.

Let FWT = (F TW ,1, F

TW ,2, . . . , F

TW ,M) be the oceanwave force tuple

where FW ,k is the ocean wave force vector for the kth simulationrun.

The computation steps of the simulation are enumerated below(see Fig. 4).

Algorithm 1. GPU based Monte Carlo Simulation of USV’s dynam-icsInput

(a) Initial state vector tuple of USV X0,(b) number of Monte Carlo runsM ,(c) desired target state vector tuple Xt ,(d) desired trajectory length l,(e) radius of acceptance r ,(f) number of ocean wave components Q ,(g) time step size∆t ,(h) action set Υ ,(i) polygonal geometry of USV, and(j) sets of amplitudes Aq, frequenciesωq, directions θq correspond-

ing to each wave component, where index q varies from 0 toQ − 1.

Output Set ofM trajectoriesSteps

(i) Initialize the state vector tuple of USV X = X0, phase lagtuple W , time t = 0, and trajectory length vector L =

[0, 0, . . . , 0]T . Copy all configuration variables such as oceanwave parameters, dynamics parameters and geometry pa-rameters to a constantmemory cache of the GPU so that datais not required to be transferred in each simulation time step.

(ii) Transform the USV geometry to M states represented by X .Each transformation is performed by a separate GPU thread.In this case, the same instruction of transformation needsto operate on MN similar data, where N is the number ofpolygonal facets representing the USV’s geometry. We per-form computations of this step on the GPU.

(iii) Determine the instantaneous wet surfaces (SB,j) of the USVby finding out the facets lying beneath and on the wave sur-face (computed by superimposing given Q ocean wave com-ponents using Eq. (2)) corresponding to jth phase lag vectorwj and use Eq. (4) to compute the wave force tuple FWT . Inthis case computation of the intersection of each polygonalfacet with instantaneous ocean wave and force computationis performed by separate GPU threads. The number of inde-pendent operations required is again MN . We perform com-putations of this step on the GPU.

(iv) Determine the required control force vector tuple corre-sponding to the action set Υ using Eq. (5).

(v) Determine the Coriolis matrix Ck(v) and the damping matrixDk(v) corresponding to eachMonte Carlo simulation run. Thenumber of independent operations required in this step isM .We perform computations of this step on the GPU.

(vi) Use Euler integration to solve Eq. (8) by using the wave forcetuple, Coriolis matrix, and damping matrix. Update time t tot + ∆t . The number of operations needed in this step is M .We perform computations of this step on the GPU.

(vii) Find Euclidean distance ∆X between state tuple obtainedfrom step (vi) and X and update trajectory length vector Lwith L + ∆X . Compare each element of L with the desiredtrajectory length l. The Monte Carlo runs, for which trajec-tory length exceeds the set trajectory length l, do not updatecorresponding elements of X whereas for other runs updatethe elements in X with the solution found in step (vi). Sincethis is a stepwith logical branchingwe perform it on the CPU.

(viii) If trajectory lengths for all the runs exceeds l or all simulatedinstances of USV are within the radius of acceptance r fromthe respective target positions then returnM trajectories elsego to step (ii).

4.4. Simulation results of GPU based parallelization

We used NVIDIA’s CUDA software development kit version3.2 with the Microsoft Visual Studio 2008 software developmentplatform on Microsoft Windows 7 operating system for theimplementation of Algorithm 1. The graphics hardware used wasNVIDIA GeForce GT 540M mounted on Dell XPS with Intel(R)CoreTM i7-2620M CPU with 2.7 GHz speed and 4 GB RAM. Wechose the number of CUDA threads per block to be 256 for eachkernel function. For the CUDA kernel function needed to work onJ data members, we chose the number of computing blocks to beJ+T−1

T . The number of triangular facets in the USV model used inthe simulations was 11158 and the bounding box dimensions ofthe model was 12 × 4 × 4 m. We chose an ocean wave composedof Q = 20 components with 6 components having an amplitudeof 0.2 m while the remaining 14 had an amplitude of 0.1 m. 16 ofthe ocean wave components had a frequency of 1 Hz, while four ofthem had a frequency of 2 Hz, and the direction θw ’s were evenlydistributed in the range 0 to 2π rad.We chose simulation time stepof size 0.05 s and ran the simulation for 200 time steps for 256random phase lag initializations. Table 1 shows the comparison ofthe computational performance on GPU as compared to the CPUbased computation. OpenMP based multi-threading enabled 85%average CPU usage, while running the baseline simulations. TheGPU based approach resulted in a speed-up by a factor rangingfrom 3.8 to 14.0, for the presented test case. Table 1 also showsthat the speed-up factor increases with an increase in the numberof Monte Carlo runs, because of the highly data parallel natureof the computations. It should be noted that we exploit parallelnature of the problemboth over themodel facets aswell as over theMonte Carlo runs. For a larger number of simulation runs, the costof memory transfer is appropriated and hence, speed-up is largercompared to a smaller number of runs.

4.5. Model simplification on GPU

More than 99% of the computation time in the USV simulationis spent in computing the forces acting on the USV due to the oceanwaves [9]. The ocean is represented as a spatio-temporally varyingheightfield in this paper. One of themajor factors influencingwaveforces is the variation in the wave heightfield in addition to thefluid velocity around the USV. The oceanwave heightfield does not


Table 1Comparison of the computation gain due to GPU over the baseline computationsperformed on the CPU.

M Baseline computation timeon CPU (s)

GPU computationtime (s)

Speed-up

1 13.7 3.6 3.82 27.1 5.3 5.24 54.2 8.0 6.88 107.4 13.0 8.3

16 214.4 22.0 9.832 425.6 36.9 11.564 846.6 65.4 12.9

128 1691.2 123.8 13.7256 3386.3 241.9 14.0

change significantly with each simulation time step. For example,for a simulation time step of length 0.05 s, the possibility of oceanwave heightfield around the USV changing significantly is verylow. In such a situation, one can utilize the force computed in theprevious time step in the current time step of the simulation tosave some computational effort. This is the underlying idea behindtemporal coherence. In order to explain the idea of temporalcoherence in a more concrete way, we define the instantaneousocean heightfield and heightfield distance vector as follows.

Definition 1. Let the ocean wave be specified by Q componentsof given amplitudes, frequencies, and directions. Let state vectortuple X denote the instantaneous states of the USV for each MonteCarlo simulation run, Bj be the bounding boxes of the USV locatedat the poses given by X and rectangles RB,j be the projections of Bjon the XY plane. LetΛj denote uniform grid of sizem × n on RB,j.

Wedefine instantaneous oceanwaveheightfieldG as a (Q×mn)sized matrix G, such that the rows of G are the vectors made up ofordered elevations of the ocean wave at themn grid points onΛj.

Definition 2. For a pair of ocean wave heightfields G1 and G2, wedefine the heightfield distance vector hd between G1 and G2 as thefollowing row-wise second order norm.

hd = ∥G1,j − G2,j∥, (9)

where G1,j is the jth row of G1, and index j denotesMonte Carlo runfrom 1 to M .

The force need not be computed in a simulation time step,if the ocean wave heightfield distance around the USV from theprevious simulation time step is not significant. The temporalcoherence test is performed as an additional operation in step (ii) ofAlgorithm 1 (described in Section 4.3). If it is found that the oceanwave heightfield distance corresponding to at least one MonteCarlo run has changed significantly then step (iii) of Algorithm 1is performed, else step (iii) is skipped and step (iv) is directlyexecuted. By this, the execution of step (iii) in Algorithm 1 isavoided sometimes, which introduces some simplification error,but reduces computation time.

The steps for performing temporal coherence based modelsimplification on the GPU are described below.

Algorithm 2. Temporal coherence based Model SimplificationInput(a) State vector tuple X ,(b) number of Monte Carlo runsM ,(c) number of rows (m) and columns (n) of grid,(d) simulation time t and time step size∆t ,(e) threshold τ for heightfield, and(f) threshold dτ for differential of heightfield.

Output Decision about whether to perform force computation inthe next time step or reuse force computed in the earlier time step

Steps

(i) If t = 0 return decision to perform force computation.(ii) If t = ∆t then initialize heightfield Gp and differential height-

field dGp to a null matrix of size M × mn and store in globalmemory.

(iii) Compute oceanwave heightfieldG at time t and then computedifferential heightfield dG = G − Gp.

(iv) Compute heightfield distance hd vector between G and Gp.(v) Compute differential heightfield distance dhd between dG and

dGp.(vi) If all the elements of hd are less than τ and if all the elements of

dhd are less than dτ , return the decision to reuse the previousvalue of force else update Gp = G and dGp = dG and returnthe decision to recompute force.

4.6. Results of model simplification on GPU

We chose m = 2, n = 5,M = 256, and dτ = 0.1,and performed the simulations under identical ocean and USVdynamics parameter settings and varied τ from 0.00 to 0.10in the increments of 0.025. The computation speed-up factorover the GPU baseline computation time (when τ = 0.00)varies from 1.04 to 3.61 depending on the set threshold τ asshown in Fig. 5(a). The mean square error in force computationintroduced due to increasing threshold τ is shown in Fig. 5(b).The temporal coherence based simplification introduces errorsin the computation of the final pose of the USV in Monte Carloruns. Fig. 5(c) shows the variation in the Euclidean distanceof final positions of the USV from the nominal position andthe difference of each final USV orientation from the nominalorientation obtained by the Monte Carlo simulation runs. Thenominal pose is the commanded pose to which the USV shouldreach if there are no disturbances. In the test case, the nominalposition is (30, 0) and the nominal orientation is 0 rad. Thevariation in distance and orientation difference increases withincreasing threshold. This is because, greater the threshold, fewerthe number of times force is computed and hence, morewill be theinaccuracy.

We chose fixed randomization of the oceanwave for evaluatingthe plots, in order to prevent the influence of randomization on thecomputing time and the variation of the pose errors.

It should be noted in Fig. 5, that the computing gains increaseslowly in the range 0 < τ < 0.025, because, for smaller threshold,the algorithm is unable to reuse force values computed in theprevious steps and owing to the same reason, the variation in thefinal pose and nominal pose is also comparatively less pronounced.For larger values of threshold τ > 0.075, the variation in thedistance and orientation increases rapidly. It can thus be concludedthat τ can be varied in the window of 0.025 < τ < 0.075, forobtaining computational gain at the expense of acceptable errors.

Fig. 6 compares the computational gains obtained using tem-poral coherence on the GPU and CPU based simplification [9] ap-proaches with the baseline computed on the CPU (see Table 1).The main observations and respective analysis from the Fig. 6 areexplained below.

(i) The computational speed-up factor obtained using temporalcoherence on the GPU is in the range of 4.7 to 43.1 andincreases with the number of Monte Carlo runs. The increasein the speed-up can be attributed to the fact that the maincomputational cost of GPU operations is the data transferbetween GPU and CPU. In the USV simulation, the data ofstate variables and the systemmatrices need to be transferredfrom GPU to CPU which is a constant time operation. Whenthe number of runs is less, the appropriated compute time islarger whereas for large number of Monte Carlo runs the cost


(a) Computing speed-up over GPU baseline vs threshold. (b) Variation of mean square force error with threshold.

(c) Variation of distance from nominal point vs threshold. (d) Variation of orientation from nominal point vs threshold.

Fig. 5. Model simplification results of GPU based temporal coherence.

Fig. 6. Results of GPU acceleration.

of memory transfer reduces and hence, the speed-up factorincreases.

(ii) The computational speed-up factor due to temporal coherenceover GPU baseline ranges from 1.2 to 3.1.

(iii) The error associated with the model simplification performedon GPU and CPU is computed by taking themean squared per-centage error between the time series of the computed forces

using the simplified method and the baseline method [9].Themodel simplification based on temporal coherence imple-mented on the GPU led to an error of 1.04% for the thresholdτ = 0.075 and dτ = 0.1. Also, the figure shows variationof computational speed-up using model simplification algo-rithms based on clustering and temporal coherence on CPU [9]for simplification parameters C = 60, τ = 0.070, and dτ =

0.1. The approximation parameters C, τ , and dτ are chosensuch that the errors due to GPU and CPU based simplificationare similar, for fair comparison of associated speed-up in eachcase.Model simplification performed on the CPU lead to an av-erage factor of speed-up of 6.4 and average force error of 1.25%over the CPU baseline. It can be seen in Fig. 6 that for single runthe CPUbased simplification approach outperforms the purelyGPUbased approach by a factor of 1.5 and for two runs the CPUbased simplification approach is better than purely GPU basedapproach by a factor of 1.1. Again the reason is the appropria-tion of computing time spent on the constant time data trans-fer operations over larger number of runs. It is thus evidentthat using a purely GPU based approach may not be enoughin applications in which a single USV needs to be run in a VE,as themodel can becomemore complex, stretching the GPU toits limits. In applications, where some error is tolerable,modelsimplification can significantly speed-up the application at thecost of small errors.

(iv) For a larger number of runs, which are pertinent to applica-tions such as transition probability estimation, the GPU based


(a) Typical pose of USV in XYplane.

(b) Location of USV in state space. (c) Action discretization.

Fig. 7. State–action space of MDP.

approach gives a very high speed-up factor (in case of Fig. 6,about 43.1 with average mean square force error of 1.04%).

(v) Fig. 6 also shows that the computing speed-up due to tempo-ral coherence, gradually saturates as the number of simultane-ous runs increases. The reason is that the possibility of manyheightfields being less than the threshold simultaneouslyreduces as the number of heightfields increase.

5. Application of generated state transition map in trajectoryplanning of USVs

In this section we present the application of state transitionprobabilities obtained from model simplification techniques, de-veloped in Section 4 in the area of trajectory planning of USVs.

5.1. MDP formulation

In this section we present the MDP formulation for the tra-jectory planning problem and the algorithms to compute variouscomponents of the MDP.

5.1.1. State–action space representationFor the dynamics computations, the state of the USV is defined

as an augmented vector of pose and velocity according to Eq. (7).The sizes of the vectors representing pose and velocity are 6 each,making the size of the state vector as 12. In addition, the USVstate–action space is continuous and it is very difficult to search anoptimal policy, in such a high dimensional and continuous space. Inthis sectionwepresent the state–action space dimension reductionand a suitable discretization to pose the problem of trajectoryplanning of USVs as a MDP.

The motion goals for the USVs are usually specified in termsof the target pose [x, y, θ ]T and the target velocity

vx, vy, ωz

Ton the ocean’s nominal water plane. This means that the statespace for theMDP can be reduced to

x, y, θ, vx, vy, ωz

T . Duringoperation the velocity of USVs do not change significantly duringthe operations and that justifies the choice of the constant desiredvelocity of the boat. Also we tuned the PID controller such thatthe angular velocity is kept within a set bound. The sway velocityand the angular velocities in the roll and the pitch directions aregenerally very small and can be ignored in the MDP state space.Nevertheless, the transition model computation is performedusing all the 12 DOFs. The state space for the planning purposesin this paper is thus, reduced to a 3-tuple given by Eq. (10).

s = [x, y, θ ]T , (10)

where (x, y) are the coordinates of the USV’s CG in the XY plane,and θ is the orientation of the boat in the XY plane.

We chose the grid dimension to be 15.0 m (USV length is nearlyequal to 12 m). The orientation discretization is chosen to be0.524 rad. The state space is depicted in Fig. 7, in which the XYplane contains the location of the CG and the θ axis denotes theorientation of the boat. Pose of the boat in the XY plane is shownin Fig. 7(a) and the corresponding location in the 3-D state space isshown in Fig. 7(b).

The action space is discretized as a set of relative posecommands from an initial state. We chose the set of relative finalpose as 7 radial pose vectors having radial distance of 30.0 m withfinal desired steering angles varying from −2.142 to 2.142 rad inthe increments of 0.428 rad. We chose the desired path lengthsand the steering angles by first running the boat for 10 s alongpolar directions. Using this technique we generated a set of 7waypoints which sufficiently cover the space around the boat andare dynamically reachable in 10 s.

5.1.2. State transition map computationThe state transition map for the continuous space was

expressed in the formof Eq. (7). In this sectionwe shall describe thecomputation scheme to determine the state transition map for thegivenUSV in thediscrete state space.We first present thedefinitionof the state transition probability.

Definition 3. Given an initial state xt and an action ut at time t , theprobability p(xt+∆t |xt , ut) of ending up in the state xt+∆t is calledthe state transition probability. We assume that the time taken toexecute the action ut is∆t .

Fig. 8 shows a USV situated initially in the state xt . When anaction ut is applied to it for 256 sample runs, the trajectories tracedby the USVs are shown in the figure. The USVs trace a slightlydifferent trajectory for each sample run due to the ocean waveand USVs interaction force as explained before (see Eq. (4)). Thevariation in the resulting states for a given initial state and anaction yields a probability distribution over the resulting states.

Weuse a data structure similar to state lattice [4] to consider thedynamics but enhance it by embedding the information of the statetransition probabilities in the arcs of the state lattice and thenmakeuse of the stochastic dynamic programming to solve the problemof trajectory planning under the motion uncertainty. The steps tocompute the state transition map are described below.


Fig. 8. State transition map computation.

Algorithm 3. State transition map computationInput

(a) Set of trajectories T = (T1, T2, . . . , TP) for a given action setΥ = (U1,U2, . . . ,UP) using Algorithms 1 and 2.

(b) List of discretized states S = [s1, s2, . . . , sL]T representing theregion of USV operation.

Output State transition map.Steps

(i) Perform geometric transformation of each trajectory in T .Fig. 8 shows a portion of the state space with the rectangulargrids representing region on the nominal water plane, whilethe layers representing the orientation of the vehicle thatvary from 0 to 2π rad. For each simulation run (beginningfrom si and taking action uj), the USV ends up in differentstates, shown by crosses in Fig. 8. State se represents thenominal ending state under the action uj when there is noenvironmental disturbance.

(ii) Construct a graph by connecting the states (nodes) reachablefrom another states (nodes) using the sample actions (arcs)in T .

(iii) Determine the transition probabilities by finding out the ratioof the number of connections between the two connectedstates and the total sample count of the actions taken. In Fig. 8,let the state sj be connected to the state si for n(sj) times out ofN sample runs. Compute the probability pij of transition fromstate si to state sj using pij =

n(sj)N .

(iv) Return the state transition map.

In this way, the state transition map is obtained, whose nodesare the states and the arcs are the dynamics constraints. Eachconnection has a probability associated with it, due to the systemdynamics and the presence of uncertainty due to ocean waves. Itshould be noted that the maximum number of children nodes ofa given parent node, for a state space with L nodes can be up toL based on the variations in the samples of the action and level ofuncertainty in the environment. Thismakes the data structure veryflexible in the sense of capturing the dynamics constraints and theenvironmental uncertainty for extreme situations such as roughsea-state.

5.1.3. Reward functionThe final element of MDP is the immediate reward for transi-

tioning from a given state to another state by taking an action.The time spent by the USV to perform an action, can be deter-mined by the length of the trajectory traversed by it. This entails

that larger the length of the trajectory, smaller should be the re-ward for the action, generating the trajectory, and thus, we shouldconsider the negative value of the trajectory length as the reward.We chose the negative of the average length of the S trajectoriestraversed by the USV for each action in the control set to determinethe reward.

5.2. Results of trajectory planning

For the given action set Υ described in Section 5.1.1, wedetermine the set of trajectories T , using Algorithms 1 and 2, forthe sea-states 3 and 4. The averagewave height, for the sea-state 3,ranges from 0.5 to 1.25m, whereas, for the sea-state 4, the averagewave height ranges from 1.25 to 2.5 m [7]. The resulting set oftrajectories, for 256Monte Carlo simulation runs is shown in Fig. 9.The variation in the resulting trajectories due to different actionsin the action set is shown in Fig. 9(a) and (b) for sea-state 3 and4 respectively. Variations in the final orientation in each MonteCarlo run due to ocean uncertainty for various actions are shown inFig. 9(c) and (d). The variations in the trajectories for the sea-state4 is larger compared to that for the sea-state 3 due to the higheraverage ocean wave height. We used τ = 0.075 and dτ = 0.1 toobtain Fig. 9. The time taken to compute the set of trajectories foreach sea-state was 9.1 min. We chose sample size of 256 becausethe variability of the pose data can be captured using a sample sizeof around 256. This is illustrated in Fig. 10.

The MDP formulated in the Section 5.1 can be solved usingnumerous algorithms including the value iteration, policy itera-tion, hierarchical techniques, approximate techniques, etc. [69].We chose the value iteration for computation of the solution. Thedetails of the value iteration algorithm could be found inmany ref-erences including [1,3,69]. The policies for each sea-state are com-puted using the value iteration over the obtained respective statetransition maps. The optimal policies are then used for determin-ing the trajectory to the target. Even if the USV is deviated fromthe planned trajectory (mainly, the orientation is changed signifi-cantly in high sea-state) due to forces caused by ocean waves, thecomputed policy can be used to regenerate an optimal trajectory tothe target state. The obtained trajectory is optimal given the uncer-tainties (caused by oceanwaves and unmodeled effects such as thevariations in the angular velocities and the linear velocity due tothe implemented PID controller limitations). The dynamics basedmotion model developed in this paper includes these variations inthe transition probability. The transition probability is then used tocompute the optimal policy in theMDP framework,which has beenproved by the researchers to yield global optima [2,69]. It shouldbe noted that the optimality is achieved in the resolution sense,meaning that the chosen resolution of the state–action space in-fluences the achieved optimality. Finer the resolution better willbe the optimal solution, but greater will be the computationalcomplexity.

The resulting trajectories for sea-states 3 and 4 are shown inFig. 11. The origin and target orientation of the vehicle is π2 . In caseof sea-state 3, a shorter and riskier trajectory (through a narrowpassage in the midst of obstacles) is computed by the approachdiscussed in this paper as shown in Fig. 11(a). In case of sea-state4 a longer but safer path is computed as shown in Fig. 11(b). Itshould be noted that because of the dynamics constraints built intothe search graph through the motion primitives discussed in thispaper, the generated trajectories are always dynamically feasible.This is the reason that the trajectory in case of Fig. 11(b) bends nearthe target, as the vehicle needs to go down in order to turn to thedesired orientation of π2 . The disturbances due to the ocean wavesare computed during the simulation, which causes unpredictabledeviations in the trajectory. The computed policymakes it possiblefor the USV to take best action to safely reach the target, even


(a) Trajectories for each action computed at sea-state 3. (b) Trajectories for each action computed at sea-state 4.

(c) Final orientations for each action computed at sea-state 3. (d) Final orientations for each action computed at sea-state 4.

Fig. 9. Control set computed for different sea-states.

(a) Variation in sample median and quartile distance fromnominal position.

(b) Variation in sample median and quartile orientation fromnominal orientation.

Fig. 10. Variation in sample distance and orientation from nominal pose vs sample size.

Table 2Summary of computational performance obtained using GPU and temporal coherence based simplification over CPU computations. Number of actions in action set P = 7and number of simulation runsM = 256.

CPU Baseline GPU Baseline CPU with clustering (C = 50) andtemporal coherence (τ = 0.070)

GPU with temporal coherence (τ = 0.075)

Computation time (min) 395.0 28.2 80.2 9.1Speed-up factor over CPU baseline 1.0 14.1 4.9 43.4Error introduced (%) 0.0 0.0 1.25 1.04


(a) Trajectory computed between origin and target states atsea-state 3.

(b) Trajectory computed between origin and target states atsea-state 4.

Fig. 11. Executed trajectories obtained by using a feedback plan for sea-states 3 and 4. The feedback plans are computed using the algorithms developed in this paper.

after it gets deviated due to the unpredictable ocean waves. Thedata structure and the algorithms presented in this paper, enablesincorporating the effects of dynamics and uncertainty in oceanwaves, in the computation of a state transition map, helping inperforming physics-aware trajectory planning.

The comparison of computational performance of GPU andCPU based model simplification techniques are shown in Table 2.The time taken to perform Monte Carlo simulations of the USVdynamics to compute a state transition map using the algorithmdeveloped in this paper was 9.1 min. It should be noted that theCPU baseline was computed by implementing the parallel version(using OpenMP) of the simulator code reported in Ref. [9]. Theoverall speed-up using a GPU in conjunction with a temporalcoherence basedmodel simplification technique was by a factor of43.4 by introducing an error of 1.04%. The number of states werechosen as 4800 (20 × 20 × 12) and the number of actions as 7.The value iteration took 28 s to converge when computed on thecomputed state transition map.

6. Conclusions

In this paper, we presented GPU based algorithms to computestate-transition probabilities for USVs using 6 DOF dynamicssimulations of interactions between ocean-wave and vehicle.We used the obtained state transition model in standard MDPbased planning framework to generate physics-aware trajectoryplans. We extended the dynamics model for USV simulation,reported in Ref. [9], to incorporate multiple wave componentsand random phase lags in ocean waves. The approach describedin this paper is flexible and is capable of handling any USVgeometry, dynamics parameters, and sea-state. We developedGPU based algorithm to perform fast Monte Carlo simulations toestimate state transition probabilities of USVs operating in highsea-states. We further improved the computational performanceby developing model simplification algorithm based on temporalcoherence. The overall computational speed-up obtained is by afactor of 43.4 by introducing a simulation error of 1.04% overthe CPU baseline. Transition probabilities can be computed within9.1 min by using the algorithms described in this paper. Thepaper also presents a case study to demonstrate the applicationof the developed state transition map to generate physics-awaretrajectory plans for sea-states 3 and 4 in MDP based planningframework.

A limitation of the approach is that we use a potentialflow theory based model, which is not suitable for simulatingplaning phenomenon occurring in fastmoving boats. The approachpresented in this paper, however, is flexible to incorporate anyother fluid flow model if available, to compute state transitionmodel that can subsequently be used in trajectory planning.Another limitation is that the current framework is only capableof handling dynamic environmental disturbances with staticobstacles, such as islands and shorelines. The trajectory planner,however, cannot handle dynamic obstacles with a sufficientefficiency, and a faster replanning approach can be used to tacklethis issue [70].

Acknowledgments

This research has been supported by the Office of NavalResearch grant N00014-10-1-0585 and NSF grant CMMI-0727380.Opinions expressed in this paper are those of the authors and donot necessarily reflect opinions of the sponsors.

References

[1] S.M. LaValle, PlanningAlgorithms, CambridgeUniversity Press, Cambridge, UK,2006, Available at http://planning.cs.uiuc.edu/.

[2] M.L. Puterman, Markov Decision Processes: Discrete Stochastic DynamicProgramming, first ed., John Wiley & Sons, Inc., New York, NY, USA, 1994.

[3] S. Thrun, W. Burgard, D. Fox, Probabilistic Robotics, MIT Press, 2005.[4] M. Pivtoraiko, R.A. Knepper, A. Kelly, Differentially constrained mobile robot

motion planning in state lattices, Journal of Field Robotics 26 (3) (2009)308–333.

[5] T. Schouwenaars, B.Mettler, E. Feron, J.P. How, Robustmotion planning using amaneuver automation with built-in uncertainties, in: Proceedings of the 2003American Control Conference, June 2003, pp. 2211–2216.

[6] O.M. Faltinsen, Sea Loads on Ships and Offshore Structures, CambridgeUniversity Press, Cambridge, New York, 1990.

[7] T.I. Fossen, Guidance and Control of Ocean Vehicles,Wiley, Chicester, England,1994.

[8] P. Krishnamurthy, F. Khorrami, S. Fujikawa, A modeling framework for sixdegree-of-freedom control of unmanned sea surface vehicles, in: Proceedingsof 44th IEEE European Control Conference Decision and Control CDC-ECC’05,December 12–15, 2005, pp. 2676–2681.

[9] A. Thakur, S.K. Gupta, Real-time dynamics simulation of unmanned sea surfacevehicle for virtual environments, Journal of Computing and InformationScience in Engineering 11 (3) (2011) 031005.

[10] S.P. Singh, D. Sen, A comparative linear and nonlinear ship motion study using3-D time domain methods, Ocean Engineering 34 (13) (2007) 1863–1881.

[11] J. Betz, Solving rigid multibody physics dynamics using proximal pointfunctions on the GPU. Master’s Thesis, Rensselaer Polytechnic Institute, Troy,New York, 2011.

http://planning.cs.uiuc.edu/


[12] J. Pan, D.Manocha, GPU-based parallel collision detection for real-timemotionplanning, in: David Hsu, Volkan Isler, Jean-Claude Latombe, Ming Lin (Eds.),Algorithmic Foundations of Robotics IX, in: Springer Tracts in AdvancedRobotics, vol. 68, Springer, Berlin, Heidelberg, 2011, pp. 211–228.

[13] W.L.D. Lui, R. Jarvis, Eye-full tower: a GPU-based variable multibaselineomnidirectional stereovision system with automatic baseline selection foroutdoor mobile robot navigation, Robotics and Autonomous Systems 58 (6)(2010) 747–761.

[14] J.T. Kider, M. Henderson, M. Likhachev, A. Safonova, High-dimensionalplanning on the GPU, in: 2010 IEEE International Conference on Robotics andAutomation, ICRA, May 2010, pp. 2515–2522.

[15] C. Batty, F. Bertails, R. Bridson, A fast variational framework for accurate solid-fluid coupling, ACM Transactions on Graphics 26 (3) (2007) 100.

[16] M. Carlson, P.J. Mucha, G. Turk, Rigid fluid: animating the interplay betweenrigid bodies and fluid, ACM Transactions on Graphics 23 (3) (2004) 377–384.

[17] M. Becker, H. Tessendorf, M. Teschner, Direct forcing for Lagrangian rigid-fluidcoupling, IEEE Transactions on Visualization and Computer Graphics 15 (3)(2009) 493–503.

[18] M. Garcia, J. Gutierrez, N. Rueda, Fluid-structure coupling using lattice-Boltzmann and fixed-grid FEM, in: Computational Mechanics and Design,Finite Elements in Analysis and Design 47 (8) (2011) 906–912.

[19] R. Geist, C. Corsi, J. Tessendorf, J. Westall, Lattice–Boltzmann water waves,in: George Bebis, Richard Boyle, Bahram Parvin, Darko Koracin, Ronald Chung,Riad Hammoud, Muhammad Hussain, Tan Kar-Han, Roger Crawfis,Daniel Thalmann, David Kao, Lisa Avila (Eds.), Advances in Visual Com-puting, in: Lecture Notes in Computer Science, vol. 6453, Springer, Berlin,Heidelberg, 2010, pp. 74–85.

[20] M. Geveler, D. Ribbrock, D. Goddeke, S. Turek, Lattice–Boltzmann simulationof the shallow-water equations with fluid-structure interaction on multi-and many-core processors, in: Rainer Keller, David Kramer, Jan-Philipp Weiss(Eds.), Facing the Multicore-Challenge, in: Lecture Notes in Computer Science,vol. 6310, Springer, Berlin, Heidelberg, 2011, pp. 92–104.

[21] R. Beck, A. Reed, Modern seakeeping computations for ships, in: Twenty-ThirdSymposium on Naval Hydrodynamics. Naval Studies Board (NSB), 2001.

[22] J. Craighead, R. Murphy, J. Burke, B. Goldiez, A survey of commercial opensource unmanned vehicle simulators, in: 2007 IEEE International Conferenceon Robotics and Automation, 2007, pp. 852–857.

[23] J.J. Gorski, Present state of numerical ship hydrodynamics and validationexperiments, Journal of Offshore Mechanics and Arctic Engineering 124 (2)(2002) 74–80.

[24] K.H. Kim, Simulation of surface ship dynamics using unsteady RANS codes, in:Reduction of Military Vehicle Acquisition Time and Cost through AdvancedModelling and Virtual Simulation, Paris, France, 2002.

[25] K.H. Kim, J. Gorski, R. Miller, R. Wilson, F. Stern, M. Hyman, C. Burg, Simulationof surface ship dynamics, in: Proceedings of UserGroupConference, June 2003,pp. 188–199.

[26] G.D. Weymouth, R.V. Wilson, F. Stern, RANS computational fluid dynamicspredictions of pitch and heave ship motions in head seas, Journal of ShipResearch 49 (18) (2005) 80–97.

[27] P.M. Carrica, R.V. Wilson, R.W. Noack, F. Stern, Ship motions using single-phase level set with dynamic overset grids, Computers & Fluids 36 (9) (2007)1415–1433.

[28] H. Ashrafiuon, K.R. Muske, L.C. McNinch, Review of nonlinear trackingand setpoint control approaches for autonomous underactuated marinevehicles, in: American Control Conference ACC, 2010, 302010-July 2, 2010,pp. 5203–5211.

[29] M.R. Katebi, M.J. Grimble, Y. Zhang, H ∞ robust control design for dynamicship positioning, Proceedings of IEEE Control Theory and Applications 144 (2)(1997) 110–120.

[30] A. Loria, T.I. Fossen, E. Panteley, A separation principle for dynamic positioningof ships: theoretical and experimental results, IEEE Transactions on ControlSystems Technology 8 (2000) 332–343.

[31] F. Mazenc, K. Pettersen, H. Nijmeijer, Global uniform asymptotic stabilizationof an underactuated surface vessel, IEEE Transactions on Automatic Control 47(10) (2002) 1759–1762.

[32] K.D. Do, Z.P. Jiang, J. Pan, Underactuated ship global tracking under relaxedconditions, IEEE Transactions on Automatic Control 47 (9) (2002) 1529–1536.

[33] E. Lefeber, K.Y. Pettersen, H. Nijmeijer, Tracking control of an underactuatedship, IEEE Transactions on Control Systems Technology 11 (1) (2003) 52–61.

[34] T.I. Fossen, Ø.N. Smogeli, Nonlinear time-domain strip theory formulation forlow-speed manoeuvering and station-keeping, Modeling, Identification andControl 25 (4) (2004) 201–221.

[35] P.J. Bandyk, R.F. Beck, Nonlinear ship motions in the time-domain using abody-exact strip theory, in: ASME Conference Proceedings, 2008, pp. 51–60.

[36] D.S. Holloway, M.R. Davis, Ship motion computations using a high Froudenumber time domain strip theory, Journal of Ship Research 50 (1) (2006)15–30.

[37] J.N. Newman, Marine Hydrodynamics, MIT Press, Cambridge, MA, 1977.[38] C. Goerzen, Z. Kong, B. Mettler, A survey of motion planning algorithms from

the perspective of autonomous UAV guidance, Journal of Intelligent & RoboticSystems 57 (2010) 65–100. http://dx.doi.org/10.1007/s10846-009-9383-1.

[39] B. Donald, P. Xavier, J. Canny, J. Reif, Kinodynamic motion planning, Journal ofACM 40 (1993) 1048–1066.

[40] S.M. Lavalle, P. Konkimalla, Algorithms for computing numerical optimalfeedback motion strategies, The International Journal of Robotics Research 20(9) (2001) 729–752.

[41] E. Frazzoli, M.A. Dahleh, E. Feron, Real-time motion planning for agileautonomous vehicles, in: Proceedings of 2001 American Control Conference,vol. 1, 2001, pp. 43–49.

[42] S.M. LaValle, J.J. Kuffner, Randomized kinodynamic planning, The InternationalJournal of Robotics Research 20 (5) (2001) 378–400.

[43] D. Hsu, R. Kindel, J.C. Latombe, S. Rock, Randomized kinodynamic motionplanning with moving obstacles, The International Journal of RoboticsResearch 21 (3) (2002) 233–255.

[44] R. Stelzer, T. Proell, Autonomous sailboat navigation for short course racing,Robotics and autonomous systems 56 (7) (2008) 604–614.

[45] S. Suzuki, Online four-dimensional flight trajectory search and its flight testing,in: AIAA GNC Conference and Exhibit, 2005, 2005.

[46] M. Bennewitz, W. Burgard, S. Thrun, Finding and optimizing solvable priorityschemes for decoupled path planning techniques for teams of mobile robots,Robotics and autonomous systems 41 (2–3) (2002) 89–99.

[47] S. Scherer, S. Singh, L. Chamberlain, S. Saripalli, Flying fast and low amongobstacles, in: Proceedings of 2007 IEEE International Conference on Roboticsand Automation, April 2007, pp. 2023–2029.

[48] E. Frazzoli, M.A. Dahleh, E. Feron, A hybrid control architecture for aggressivemaneuvering of autonomous helicopters, in: Proceedings of 38th IEEEConference on Decision and Control, vol. 3, 1999, pp. 2471–2476.

[49] A. Kelly, A. Stentz, O. Amidi, M. Bode, D. Bradley, A. Diaz-Calderon, M.Happold, H. Herman, R. Mandelbaum, T. Pilarski, P. Rander, S. Thayer, N.Vallidis, Ra. Warner, Toward reliable off road autonomous vehicles operatingin challenging environments, The International Journal of Robotics Research25 (5-6) (2006) 449–483.

[50] M.S. Branicky, S.M. LaValle, K. Olson, Y. Yang, Quasi-randomizedpath planning,in: Proceedings of IEEE International Conference on Robotics and Automation,vol. 2, 2001, pp. 1481–1487.

[51] A. Elnagar, A. Hussein, On optimal constrained trajectory planning in 3denvironments, Robotics and Autonomous Systems 33 (4) (2000) 195–206.

[52] A. Richards, J.P. How, Aircraft trajectory planning with collision avoidanceusing mixed integer linear programming, in: Proceedings of the 2002American Control Conference, vol. 3, 2002, pp. 1936–1941.

[53] F. Borrelli, D. Subramanian, A.U. Raghunathan, L.T. Biegler, MILP and NLPtechniques for centralized trajectory planning of multiple unmanned airvehicles, in: Proceedings of 2006 American Control Conference, 2006, June2006, p. 6.

[54] M.G. Earl, R. D’Andrea, Iterative MILP methods for vehicle-control problems,IEEE Transactions on Robotics 21 (6) (2005) 1158–1167.

[55] C. Colombo, M. Vasile, G. Radice, Optimal low-thrust trajectories to as-teroids through an algorithm based on differential dynamic program-ming, Celestial Mechanics and Dynamical Astronomy 105 (2009) 75–112.http://dx.doi.org/10.1007/s10569-009-9224-3.

[56] T. Howard, A. Kelly, Optimal rough terrain trajectory generation for wheeledmobile robots, International Journal of Robotics Research 26 (1) (2007)141–166.

[57] O. Purwin, R. Andrea, Trajectory generation and control for four wheeledomnidirectional vehicles, Robotics and Autonomous Systems 54 (1) (2006)13–22.

[58] S. Russell, P. Norvig, Artificial Intelligence: A Modern Approach, Prentice Hall,2009.

[59] M. Likhachev, G. Gordon, S. Thrun, Planning for Markov decision processeswith sparse stochasticity, in: Advances in Neural Information ProcessingSystems (NIPS), vol. 17, 2004.

[60] D. Ferguson, A. Stentz, Focussed processing of MDPs for path planning, in:Proceedings of the 16th IEEE International Conference on Tools with ArtificialIntelligence, ICTAI’04, 2004, pp. 310–317.

[61] S. Sanner, R. Goetschalckx, K. Driessens, G. Shani, Bayesian real-time dynamicprogramming, in: Proceedings of the 21st International Joint Conference onArtificial Intelligence,Morgan Kaufmann Publishers Inc., 2009, pp. 1784–1789.

[62] G. Casalino, A. Turetta, E. Simetti, A three-layered architecture for real timepath planning and obstacle avoidance for surveillance USVs operating inharbour fields, in: Oceans 2009 — Europe, 11–14 May 2009, pp.1–8.

[63] M.R. Benjamin, J.A. Curcio, P.M. Newman, Navigation of unmanned marinevehicles in accordance with the rules of the road, in: Proceedings of 2006IEEE International Conference on Robotics and Automation, May 2006,pp. 3581–3587.

[64] R.A. Soltan, H. Ashrafiuon, K.R. Muske, State-dependent trajectory planningand tracking control of unmanned surface vessels, in: Proceedings of 2009American Control Conference, 2009, pp. 3597–3602.

[65] B. Xu, A. Kurdila, D.J. Stilwell, A hybrid receding horizon control methodfor path planning in uncertain environments, in: Proceedings of 2009IEEE/RSJ International Conference on Intelligent Robots and Systems, 2009,pp. 4887–4892.

[66] M. Sandler, A. Wahl, R. Zimmermann, M. Faul, U. Kabatek, E.D. Gilles,Autonomous guidance of ships on waterways, Robotics and AutonomousSystems 18 (3) (1996) 327–335.

[67] M.T. Wolf, L. Blackmore, Y. Kuwata, N. Fathpour, A. Elfes, C. Newman,Probabilistic motion planning of balloons in strong, uncertain wind fields, in:International Conference on Robotics and Automation 2010, May 2010, pp.1123–1129.

http://dx.doi.org/doi:10.1007/s10846-009-9383-1

http://dx.doi.org/doi:10.1007/s10569-009-9224-3


[68] I. Zohar, A. Ailon, R. Rabinovici, Mobile robot characterized by dynamicand kinematic equations and actuator dynamics: Trajectory tracking andrelated application, Robotics and autonomous systems 59 (6) (2011)343–353.

[69] W. Powell, Approximate Dynamic Programming: Solving the Curses ofDimensionality, Wiley, 2007.

[70] P. Svec, M. Schwartz, A. Thakur, S.K. Gupta, Trajectory planning withlook-ahead for unmanned sea surface vehicles to handle environmentaldisturbances, in: 2011 IEEE/RSJ International Conference on Intelligent Robotsand Systems (IROS), September 2011.

Atul Thakur received his Ph.D. degree in mechanicalengineering from the University of Maryland (UMD),College Park, in 2011; his Master of Technology (M.Tech.)degree in manufacturing engineering from the IndianInstitute of Technology,Mumbai, in 2006; and his Bachelorof Engineering (B.E.) degree in production engineeringfrom the University of Mumbai in 2003. He is currentlya Post-Doctoral Research Associate in the Department ofMechanical Engineering at UMD. Dr. Thakur is a Memberof ASME and IEEE. His research interests are in the area ofphysics-aware model simplification for high-fidelity and

high performance robot dynamics simulations and planning.

Dr. Petr Svec is an Assistant Research Scientist atthe University of Maryland, College Park. His researchinterests are in the area of Robotics, Machine Learning,Motion Planning, Computational Geometry, and GraphTheory. Dr. Svec received a Master’s degree in ComputerScience from the Faculty of Mechanical Engineering atBrno University of Technology in 2003, followed by Ph.D.with a focus on robot motion planning from the sameinstitute. He also received a second Master’s degree inComputer Science in 2006 from Masaryk University. Inaddition, he spent one year as a Graduate Research

Assistant at the Department of Computer Science at the University of Bristol inthe UK.

Satyandra K. Gupta received his Master of Technology(M.Tech.) degree in production engineering from theIndian Institute of Technology, Delhi, in 1989; his Bachelorof Engineering (B.E.) degree from theUniversity of Roorkee(currently known as the Indian Institute of Technology,Roorkee) in 1988; and his Ph.D. degree in mechanicalengineering from the University of Maryland (UMD),College Park, in 1994. He is currently a Professor in theDepartment of Mechanical Engineering and the Institutefor Systems Research at UMD. Dr. Gupta is a Fellow ofASME.

GPU based generation of state transition models using simulations for unmanned surface vehicle...

Documents

Transcript of GPU based generation of state transition models using simulations for unmanned surface vehicle...