A new parallel finite element algorithm for the stationary Navier–Stokes equations

Yueqiang Shang a,b,*, Yinnian He c, Do Wan Kim a, Xiaojun Zhou b

a Department of Mathematics, Inha University, Incheon, 402-751, Republic of Korea
b School of Mathematics and Computer Science, Guizhou Normal University, Guiyang, 550001, PR China
c Faculty of Science, Xi'an Jiaotong University, Xi'an, 710049, PR China

Article info

Article history:
Received 20 September 2010
Received in revised form 20 May 2011
Accepted 1 June 2011
Available online 2 July 2011

Keywords:
Navier–Stokes equations
Finite element
Parallel computing
Parallel algorithm
Two-grid method
Domain decomposition

Abstract

Based on two-grid discretization, a new parallel finite element algorithm for the stationary Navier–Stokes equations is proposed and analyzed. The algorithm first solves the Navier–Stokes equations on a coarse grid, and then corrects the resulting residual on a fine grid by solving local Navier–Stokes equations in parallel with homogeneous boundary conditions. Because an existing sequential Navier–Stokes solver can be applied to each sub-domain problem, the proposed parallel algorithm can be implemented on top of existing sequential software. Error bounds for the approximate solution are derived. The efficiency of the algorithm is demonstrated by numerical simulations of the lid-driven cavity flow, the backward-facing step flow, and the flow past a circular cylinder.

© 2011 Elsevier B.V. All rights reserved.

1. Introduction

Computational fluid dynamics models are in general based on the solution of the Navier–Stokes equations and a discretization scheme for them, for instance, finite element or finite volume methods. To accurately capture the physical properties of the fluid flow being simulated, we usually need highly refined meshes on the entire flow domain, which can lead to a large-scale computation possibly beyond the capability of a single computer. Therefore, to utilize the computational power of modern high-performance parallel computers, much effort has been devoted to the development of efficient parallel computing methods for the Navier–Stokes equations and related flow problems (see, e.g., [1–7]).

Recently, local and parallel algorithms for the stationary Navier–Stokes equations were proposed and analyzed in [8–10], based on a new approach to local and parallel finite element computations [11,12] together with the fact that the global behavior of a solution to the Navier–Stokes equations is mostly dominated by the low frequency components while the local behavior is mainly affected by the high frequency components. Such algorithms were numerically compared in [13]. The key idea of these algorithms is to use the classical finite element discretization on a coarse grid to approximate the low frequencies, and then employ linearizations on local fine grids to correct the resulting residual of high frequencies. Theoretical analysis shows that these algorithms can yield the same order of convergence as the classical Galerkin finite element method if an appropriate ratio between the coarse and fine mesh sizes is taken. However, even when the coarse grid size is suitably chosen, numerical computations for some incompressible flows showed that the finite element solutions obtained from these local and parallel algorithms are inaccurate, particularly for the pressure, when the overlapping size of the sub-domains is small.

The objective of this paper is to employ the local and parallel finite element computation approach of Xu and Zhou [11,12] to develop an efficient parallel finite element algorithm for the d-dimensional stationary Navier–Stokes flows (d = 2, 3). This novel algorithm is based on a coarse grid finite element solution to the global Navier–Stokes equations and fine grid solutions to local Navier–Stokes equations defined on overlapped sub-domains. Here, the nonlinear problems are solved by means of linearization methods such as Newton and Picard iterations. Since existing sequential solvers are available for the problems on sub-domains, our method can be easily implemented on top of existing sequential software.

It is worth mentioning that similar two-level or multi-level methods for the Navier–Stokes equations have been proposed in [14–19] since the pioneering work of Xu [20]. The major difference between those methods and ours is that they use the coarse grid solution to linearize the nonlinear convection term on the finer grid(s), whereas our method employs a prediction–correction-type approach: the solution is first predicted on a coarse grid and then corrected by solving the residual equations on the fine grid in a parallel manner.

Finite Elements in Analysis and Design 47 (2011) 1262–1279
doi:10.1016/j.finel.2011.06.001
Corresponding author. Tel.: +82 32 860 8819.

Our method is also reminiscent of the nonlinear Galerkin methods (cf. [21–24]). However, there are several essential differences between the nonlinear Galerkin methods and our method. First, in the nonlinear Galerkin method the velocity is separated into two parts, the small and large eddy components, while in our method both the velocity and pressure are decomposed into low and high frequency components. Second, the coarse grid solution and the fine grid correction in our method are uncoupled in the computational process and are calculated sequentially, while in the nonlinear Galerkin methods such calculations are coupled together. Third, our method is a parallel computing version. One can expect that a global solver may yield a more accurate solution than our parallel solver. As mentioned before, however, the amount of storage required by the global solver often exceeds the capacity of modern computers.

The method proposed in this paper also differs from the classical two-level Schwarz methods (cf. [25,26,1]) in that the global coarse grid problem and the fine grid local problems need to be solved only once; moreover, no communication between processors is needed when solving the local problems. Another distinct feature is that our method is designed as a discretization scheme, in contrast to the methods in [29,27,28], where two-level nonlinear methods were used as preconditioners. Moreover, in the present method, the coarse grid problem does not have to be coupled with the local fine grid problems.

The rest of the paper is organized as follows. In the next section, the Navier–Stokes equations and their mixed finite element approximations are introduced. In Section 3, based on two-grid finite element discretization and domain decomposition, a new parallel algorithm is designed and analyzed. Numerical results on some benchmark problems such as the lid-driven cavity flow, the backward-facing step flow and the flow past a circular cylinder are given in Section 4. Finally, conclusions are drawn in Section 5.

2. Preliminaries

Let $\Omega$ be a bounded domain with Lipschitz-continuous boundary $\partial\Omega$ in $\mathbb{R}^d$ ($d = 2, 3$) that satisfies an additional condition stated in Assumption (H0) below. As usual, for a nonnegative integer $k$, we denote by $H^k(\Omega)$ the Sobolev space of functions with square integrable distribution derivatives up to order $k$ in $\Omega$, equipped with the standard norm $\|\cdot\|_{k,\Omega}$, and by $H_0^1(\Omega)$ the closed subspace of $H^1(\Omega)$ consisting of functions with zero trace on $\partial\Omega$; see, e.g., [30,31]. Throughout this paper, we use the letter $c$ (with or without subscripts) to denote a generic positive constant which is independent of the mesh parameter and may take different values at different occurrences.

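2.1. The Navier–Stokes equations

We consider the following incompressible Navier–Stokes equations:

\[ -\nu\Delta u + (u\cdot\nabla)u + \nabla p = f \quad \text{in } \Omega, \tag{2.1a} \]
\[ \operatorname{div} u = 0 \quad \text{in } \Omega, \tag{2.1b} \]
\[ u = 0 \quad \text{on } \partial\Omega, \tag{2.1c} \]

where $u = (u_1, \ldots, u_d)^T$ is the velocity, $p$ the pressure, $f = (f_1, \ldots, f_d)^T$ the prescribed body force and $\nu$ the kinematic viscosity. Given a characteristic length $L$ and a characteristic velocity $U$, the Reynolds number is defined as $Re = UL/\nu$.

To introduce the variational formulation of (2.1), we set

\[ X = H_0^1(\Omega)^d, \quad Y = L^2(\Omega)^d, \quad M = L_0^2(\Omega) = \left\{ q \in L^2(\Omega) : \int_\Omega q\,dx = 0 \right\}, \]

and define $a(\cdot,\cdot)$, $b(\cdot,\cdot,\cdot)$ and $d(\cdot,\cdot)$ as

\[ a(u,v) = \nu(\nabla u, \nabla v), \quad b(u,v,w) = \tfrac{1}{2}((u\cdot\nabla)v, w) - \tfrac{1}{2}((u\cdot\nabla)w, v), \]
\[ d(v,q) = (\operatorname{div} v, q), \quad \forall u, v, w \in X,\ q \in M, \]

where $(\cdot,\cdot)$ is the standard inner product of $L^2(\Omega)^l$ ($l = 1, 2, 3$).

As mentioned above, a further assumption on $\Omega$ is needed:

H0. Assume that $\Omega$ is regular in the sense that the unique solution $(u,q) \in X \times M$ of the steady Stokes problem

\[ -\nu\Delta u + \nabla q = g, \quad \operatorname{div} u = 0 \ \text{in } \Omega, \quad u|_{\partial\Omega} = 0, \]

for prescribed $g \in Y$ exists and satisfies

\[ \|u\|_{2,\Omega} + \|q\|_{1,\Omega} \le c\|g\|_{0,\Omega}. \]

Assumption (H0) is known to hold if $\partial\Omega$ is $C^2$, or if $\Omega$ is a two-dimensional convex polygon; see [32,33].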

With the above notations, the variational formulation of (2.1) reads: find a pair $(u,p) \in X \times M$ such that

\[ a(u,v) + b(u,u,v) - d(v,p) = (f,v), \quad \forall v \in X, \tag{2.2a} \]
\[ d(u,q) = 0, \quad \forall q \in M. \tag{2.2b} \]

Fig. 1. Triangulation (left) and decomposition (right) of the solution domain. [Figure: the right panel shows the four sub-domains D1, D2, D3 and D4.]

Defining

\[ N := \sup_{u,v,w \in X,\ u,v,w \neq 0} \frac{|b(u,v,w)|}{\|\nabla u\|_{0,\Omega}\|\nabla v\|_{0,\Omega}\|\nabla w\|_{0,\Omega}}, \]

we have the following existence and uniqueness results (cf. [34,35]).

Lemma 2.1. Given $f \in X'$ (the dual space of $X$), there exists at least one solution pair $(u,p) \in X \times M$ satisfying (2.2) and

\[ \|\nabla u\|_{0,\Omega} \le \nu^{-1}\|f\|_{-1,\Omega}, \quad \|f\|_{-1,\Omega} = \sup_{v \in X,\ v \neq 0} \frac{|(f,v)|}{\|\nabla v\|_{0,\Omega}}. \tag{2.3} \]

Moreover, if $\nu$ and $f$ satisfy the uniqueness condition

\[ 1 - \frac{N\|f\|_{-1,\Omega}}{\nu^2} > 0, \tag{2.4} \]

then the solution pair $(u,p)$ of problem (2.2) is unique.

2.2. Mixed finite element spaces

To describe the mixed finite element approximations of problem (2.2), let us assume $T^h(\Omega) = \{K\}$ to be a shape-regular triangulation (see, e.g., [35,31]) of $\Omega$ into triangles or quadrilaterals (if $d = 2$), or tetrahedrons or hexahedrons (if $d = 3$), with mesh-size function $h(x)$ whose value is the diameter $h_K$ of the element $K$ containing $x$, satisfying the following assumption:

A0. Triangulation. There exists $\gamma \ge 1$ such that

\[ h_\Omega^\gamma \le c\,h(x), \quad \forall x \in \Omega, \tag{2.5} \]

where $h_\Omega = \max_{x \in \Omega} h(x)$ is the largest mesh size of $T^h(\Omega)$. Sometimes, we shall use $h$ instead of $h_\Omega$ for the mesh size on a domain that is clear from the context.

Let $X_h(\Omega) \subset H^1(\Omega)^d$ and $M_h(\Omega) \subset L^2(\Omega)$ be two finite element subspaces associated with a mesh $T^h(\Omega)$, and

\[ X_h^0(\Omega) = X_h(\Omega) \cap H_0^1(\Omega)^d, \quad M_h^0(\Omega) = M_h(\Omega) \cap L_0^2(\Omega). \]

Given a sub-domain $G \subset \Omega$, we define $X_h(G)$, $M_h(G)$ and $T^h(G)$ to be the restrictions of $X_h(\Omega)$, $M_h(\Omega)$ and $T^h(\Omega)$ to $G$, respectively, and set

\[ X_0^h(G) = \{ v \in X_h(\Omega) : \operatorname{supp} v \subset\subset G \}, \quad M_0^h(G) = \{ q \in M_h(\Omega) : \operatorname{supp} q \subset\subset G \}. \]

We shall not restrict our attention to any specific mixed finite element space; rather we shall study a class of mixed finite element spaces satisfying the following assumptions (cf. [11,36–38]).

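A1. Approximation. For each $(u,p) \in H^{t+1}(G)^d \times H^t(G)$ ($t \ge 1$), there exists an approximation $(\pi_h u, \rho_h p) \in X_h(G) \times M_h(G)$ such that

\[ \|h^{-1}(u - \pi_h u)\|_{0,G} + \|u - \pi_h u\|_{1,G} \le c\,h_G^s\|u\|_{1+s,G}, \quad 0 \le s \le t, \tag{2.6} \]
\[ \|h^{-1}(p - \rho_h p)\|_{-1,G} + \|p - \rho_h p\|_{0,G} \le c\,h_G^s\|p\|_{s,G}, \quad 0 \le s \le t. \tag{2.7} \]

A2. Inverse estimate. For any $(v,q) \in X_h(G) \times M_h(G)$, there hold

\[ \|v\|_{1,G} \le c\|h^{-1}v\|_{0,G}, \quad \|q\|_{0,G} \le c\|h^{-1}q\|_{-1,G}. \tag{2.8} \]

A3. Superapproximation. For $G \subset \Omega$, let $\omega \in C_0^\infty(\Omega)$ with $\operatorname{supp}\omega \subset\subset G$. Then for any $(u,p) \in X_h(G) \times M_h(G)$, there is $(v,q) \in X_0^h(G) \times M_0^h(G)$ such that

\[ \|h^{-1}(\omega u - v)\|_{1,G} \le c\|u\|_{1,G}, \quad \|h^{-1}(\omega p - q)\|_{0,G} \le c\|p\|_{0,G}. \tag{2.9} \]

A4. Inf–sup condition. There exists a constant $\beta > 0$ such that

\[ \beta\|q\|_{0,G} \le \sup_{v \in X_h^0(G),\ v \neq 0} \frac{(\operatorname{div} v, q)}{\|\nabla v\|_{0,G}}, \quad \forall q \in M_h^0(G). \tag{2.10} \]

We refer to [39] for some examples satisfying Assumptions A1–A4. For instance, the MINI finite elements [40] and the P2–P0 finite elements [41] satisfy Assumptions A1–A4 when $t = 1$, while the Taylor–Hood elements [42] and the augmented P2–P1 elements [43,44] satisfy Assumptions A1–A4 when $t = 2$.

The mixed finite element approximation of problem (2.2) reads: find a pair $(u_h, p_h) \in X_h^0(\Omega) \times M_h^0(\Omega)$ such that

\[ a(u_h,v) + b(u_h,u_h,v) - d(v,p_h) = (f,v), \quad \forall v \in X_h^0(\Omega), \tag{2.11a} \]
\[ d(u_h,q) = 0, \quad \forall q \in M_h^0(\Omega). \tag{2.11b} \]

The following results on $(u_h, p_h)$ are classical (cf. [34,35]).

Lemma 2.2. Under Assumptions A0, A1 and A4, there exists a small $h_0 > 0$ such that for all $h \in (0, h_0]$, problem (2.11) admits a unique solution $(u_h, p_h)$. Moreover, if $(u,p) \in (H^{t+1}(\Omega) \cap H_0^1(\Omega))^d \times (H^t(\Omega) \cap L_0^2(\Omega))$, then the following error estimate holds:

\[ \|u - u_h\|_{1,\Omega} + \|p - p_h\|_{0,\Omega} \le c\,h^s\left(\|u\|_{s+1,\Omega} + \|p\|_{s,\Omega}\right), \quad 1 \le s \le t. \tag{2.12} \]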

3. Parallel finite element algorithms

In this section, we first recall a parallel algorithm based on local finite element computations proposed in [9] for the steady Navier–Stokes equations, then analyze how it can be improved, and finally introduce our new parallel finite element algorithm based on two-grid discretization.

Let us first divide $\Omega$ into a number of disjoint sub-domains $D_1, \ldots, D_m$, and then enlarge each $D_j$ to obtain $\Omega_j$ such that $D_j \subset\subset \Omega_j \subset \Omega$ ($j = 1, 2, \ldots, m$); here $D_j \subset\subset \Omega_j \subset \Omega$ means that $\operatorname{dist}(\partial D_j \setminus \partial\Omega, \partial\Omega_j \setminus \partial\Omega) > 0$. These $\Omega_j$ form an overlapping decomposition of $\Omega$. Assume $T^H(\Omega)$ to be a shape-regular coarse grid with size $H \gg h$, $T^h(\Omega_j)$ a local shape-regular fine grid of sub-domain $\Omega_j$, and $T^h(\Omega)$ a global fine grid which coincides with the local fine grid in sub-domain $\Omega_j$. We are interested in obtaining an approximate solution in the sub-domains $D_j$ ($j = 1, 2, \ldots, m$) with an accuracy comparable to that of the classical finite element solution $(u_h, p_h)$ from $T^h(\Omega)$.

3.1. A parallel linearized algorithm

The parallel finite element algorithm based on two-grid discretization proposed in [9] for the stationary Navier–Stokes equations reads:

Algorithm 1. Parallel linearized finite element algorithm.

1. Find a global coarse grid solution $(u_H, p_H) \in X_H^0(\Omega) \times M_H^0(\Omega)$ such that

\[ a(u_H,v) + b(u_H,u_H,v) - d(v,p_H) = (f,v), \quad \forall v \in X_H^0(\Omega), \]
\[ d(u_H,q) = 0, \quad \forall q \in M_H^0(\Omega). \]

2. Find local fine grid corrections $(e_{h,j}, \eta_{h,j}) \in X_h^0(\Omega_j) \times M_h^0(\Omega_j)$ ($j = 1, 2, \ldots, m$) in parallel:

\[ a(e_{h,j},v) + b(e_{h,j},u_H,v) + b(u_H,e_{h,j},v) - d(v,\eta_{h,j}) = (R_j,v), \quad \forall v \in X_h^0(\Omega_j), \]
\[ d(e_{h,j},q) = -d(u_H,q), \quad \forall q \in M_h^0(\Omega_j). \]

3. Set $(u^h, p^h) = (u_H, p_H) + (e_{h,j}, \eta_{h,j})$ in $D_j$ ($j = 1, 2, \ldots, m$).

Here and hereafter,

\[ (R_j,v) = (f,v) - a(u_H,v) - b(u_H,u_H,v) + d(v,p_H), \quad \forall v \in X_h^0(\Omega_j),\ j = 1, 2, \ldots, m. \tag{3.1} \]

Table 1
Errors of the solutions obtained from Algorithm 1. Velocity error: $|||\nabla(u-u^h)|||_{0,\Omega}/\|\nabla u\|_{0,\Omega}$; pressure error: $|||p-p^h|||_{0,\Omega}/\|p\|_{0,\Omega}$.

h      H     CPU (s)  itC  Velocity error  Pressure error  I_{p^h}    Rate
1/27   1/18  2.393    3    0.00381917      14.1379         0.163686
1/64   1/32  8.476    3    0.000725397     2.25073         0.331027   2.12883
1/125  1/50  25.236   3    0.000190847     0.339697        0.0482335  2.82248

Remark 3.1. Similar parallel linearized algorithms were also proposed for the stationary Navier–Stokes equations in [10,13]. They differ from Algorithm 1 in that they solve a different linearized problem on the fine grid; see [10,13] for details.

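Defining piecewise norms

\[ |||\nabla(u-u^h)|||_{0,\Omega} = \left( \sum_{j=1}^m \|\nabla(u-u^h)\|_{0,D_j}^2 \right)^{1/2}, \quad |||p-p^h|||_{0,\Omega} = \left( \sum_{j=1}^m \|p-p^h\|_{0,D_j}^2 \right)^{1/2}, \]

we have the following error estimates (see [9]).

Theorem 3.1. Assume that $D_j \subset\subset \Omega_j \subset \Omega$ ($j = 1, 2, \ldots, m$), Assumptions A0–A4 and Lemmas 2.1 and 2.2 hold, and $(u^h, p^h)$ is obtained from Algorithm 1. Then

\[ |||\nabla(u_h-u^h)|||_{0,\Omega} + |||p_h-p^h|||_{0,\Omega} \le c\,H^{s+1}\left(\|u\|_{s+1,\Omega} + \|p\|_{s,\Omega}\right), \quad 1 \le s \le t. \]

Consequently,

\[ |||\nabla(u-u^h)|||_{0,\Omega} + |||p-p^h|||_{0,\Omega} \le c\left(h^s + H^{s+1}\right)\left(\|u\|_{s+1,\Omega} + \|p\|_{s,\Omega}\right), \quad 1 \le s \le t. \]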

Theorem 3.1 shows that if the ratio of coarse mesh size H to fine mesh size h is suitably chosen, Algorithm 1 can yield the same order of convergence rate as the classical Galerkin finite element method and may provide asymptotically optimal errors for the approximate solution.
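One simple way to read "suitably chosen" is to balance the two terms of the bound in Theorem 3.1, that is, to pick H so that H^(s+1) equals h^s. The sketch below does only that arithmetic; the function names are illustrative, and the balancing rule is an assumption made here for illustration, not necessarily the exact rule used in the experiments.

```python
import math


def balanced_coarse_size(h: float, s: int) -> float:
    """Return H such that H**(s + 1) == h**s (up to rounding).

    h is the fine mesh size (0 < h < 1) and s the order in the bound
    c * (h**s + H**(s + 1)). The balancing rule is an illustrative assumption.
    """
    if not 0.0 < h < 1.0:
        raise ValueError("h must lie in (0, 1)")
    if s < 1:
        raise ValueError("s must be at least 1")
    return h ** (s / (s + 1))


def bound_terms(h: float, H: float, s: int) -> tuple:
    """Return the pair (h**s, H**(s + 1)) appearing in the error bound."""
    return h ** s, H ** (s + 1)


if __name__ == "__main__":
    for h in (1 / 27, 1 / 64, 1 / 125):
        H = balanced_coarse_size(h, 2)
        fine_term, coarse_term = bound_terms(h, H, 2)
        print(f"h = {h:.5f}  H = {H:.5f}  h^2 = {fine_term:.3e}  H^3 = {coarse_term:.3e}")
```

With s = 2 this gives H = h^(2/3), so the coarse mesh can be much coarser than the fine one without degrading the order of the bound.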

However, detailed analysis and numerical tests showed that there is still room to improve the above algorithm. To begin with, let us consider the approximate pressure obtained from Algorithm 1 and set

\[ I_{p^h} := \left| \sum_{j=1}^m \int_{D_j} p^h\,dx \right|. \tag{3.2} \]

From problem (2.1), it is clear that the pressure is a function of $L^2(\Omega)$ which is defined up to an additive constant. This issue can be circumvented in one of two ways: the first is to look for a pressure with a vanishing average in $\Omega$, i.e., belonging to the space $L_0^2(\Omega)$; the second is to seek a pressure belonging to $L^2(\Omega)/\mathbb{R}$. Obviously, Algorithm 1 adopts the first approach to determine the pressure uniquely. From Algorithm 1, we can see that both the coarse grid approximation $p_H$ and the fine grid corrections $\eta_{h,j}$ ($j = 1, 2, \ldots, m$) have a vanishing average on their respective solution domains, i.e., both $\int_\Omega p_H\,dx = 0$ and $\int_{\Omega_j} \eta_{h,j}\,dx = 0$ ($j = 1, 2, \ldots, m$) are enforced. However, due to the overlapping of the sub-domains $\Omega_j$ ($j = 1, 2, \ldots, m$), Algorithm 1 cannot guarantee that the final result $p^h$ is really in $L_0^2(\Omega)$ or that $I_{p^h}$ is small enough to ensure that $p^h$ is an acceptable approximation of the exact solution. In other words, Algorithm 1 cannot guarantee that, for $j = 1, 2, \ldots, m$, $\eta_{h,j} \in M_h^0(\Omega_j)$ at Step 2 is exactly the local correction of $p_H$ obtained at Step 1 in the subregion $D_j$; it may be the correction of another coarse grid approximation of the pressure. If this is the case, the approximate solution $p^h$ obtained from Algorithm 1 may be far away from the exact solution. Consequently, the accuracy of the approximate pressure obtained from Algorithm 1 depends not only on the coarse grid size $H$ (or, equivalently, the coarse grid solution $p_H$), but also on whether the fine grid corrections $\eta_{h,j}$ ($j = 1, 2, \ldots, m$) at Step 2 are exactly the corrections of the coarse grid solution $p_H$ in the disjoint sub-domains.

Table 2
Errors of the solutions obtained from Algorithm 2. Velocity error: $|||\nabla(u-u^h)|||_{0,\Omega}/\|\nabla u\|_{0,\Omega}$; pressure error: $|||p-p^h|||_{0,\Omega}/\|p\|_{0,\Omega}$.

h      H     CPU (s)  itC  itF  Velocity error  Pressure error  I_{p^h}       Rate
1/27   1/18  2.697    3    3    0.00381339      0.000680126     2.47622e-005
1/64   1/32  10.684   3    3    0.000720109     9.14859e-005    1.04331e-006  1.94062
1/125  1/50  29.382   3    2    0.000187746     3.08976e-005    2.03235e-007  1.99942

Fig. 2. $H^1$-error for the velocity (left) and $L^2$-error for the pressure (right). [Figure: log(error) versus log(h) for Algorithm 1, Algorithm 2 and the classical FEM, with an $h^2$ reference line.]

Table 3
Comparison of the two strategies. Pressure error: $|||p-p^h|||_{0,\Omega}/\|p\|_{0,\Omega}$.

              Zero-restriction of pressure       Nonlinear corrections
              on artificial boundaries
h      H      Pressure error  I_{p^h}            Pressure error  I_{p^h}
1/27   1/18   0.000693688     1.48964e-005       14.1271         0.093903
1/64   1/32   9.11414e-005    7.93512e-007       2.22346         0.220091
1/125  1/50   3.65099e-005    1.10631e-006       0.336198        0.0338185

Table 4
Errors of the classical finite element solutions. Velocity error: $\|\nabla(u-u_h)\|_{0,\Omega}/\|\nabla u\|_{0,\Omega}$; pressure error: $\|p-p_h\|_{0,\Omega}/\|p\|_{0,\Omega}$.

h      CPU (s)  itF  Velocity error  Pressure error  \|p_h\|_{L^1(\Omega)}  Rate
1/27   3.878    3    0.00402224      0.000505812     2.96937e-011
1/64   24.354   3    0.000717938     8.98303e-005    2.78859e-014           1.99677
1/125  118.539  3    0.000188292     2.35458e-005    3.49921e-010           1.99931

3.2. New parallel finite element algorithm

Our new parallel finite element algorithm is motivated by the above analysis and observation. We modify only Step 2 of Algorithm 1 to calculate the corrections $(e_{h,j}, \eta_{h,j})$ on the overlapped sub-domains $\Omega_j$ ($j = 1, 2, \ldots, m$) more precisely. On the one hand, unlike Algorithm 1, we confine the pressure correction $\eta_{h,j}$ to the space $L^2(\Omega_j)/\mathbb{R}$ by adding a homogeneous boundary condition on the artificial boundary $\partial\Omega_j \setminus \partial\Omega$ of the sub-domains $\Omega_j$ ($j = 1, 2, \ldots, m$) in the fine grid local correction problems. On the other hand, we solve a fully nonlinear correction problem by an iterative method such as Newton or Picard iteration (see, e.g., [45,46]) independently on the sub-domains $\Omega_j$ ($j = 1, 2, \ldots, m$). Specifically, we first approximate the low frequency components of the solution to the Navier–Stokes equations using a coarse grid on the entire domain, as done in Algorithm 1, and then use a fine grid to correct the resulting residual in parallel on a collection of overlapped sub-domains, where the local problems for these fine grid corrections are fully nonlinear, with homogeneous boundary conditions for the velocity on all boundaries of the overlapped sub-domains and homogeneous conditions for the pressure only on the artificial boundaries. All of these nonlinear correction problems are solved in parallel by an iterative method such as Newton or Picard iteration.

Fig. 3. Schematic diagram of the lid-driven cavity flow. [Figure: square cavity with side length L = 1; u1 = 1, u2 = 0 on one side and u1 = 0, u2 = 0 on the other three sides.]

Fig. 4. Comparison of u1-velocity profiles along the vertical centerline (top) and u2-velocity profiles along the horizontal centerline (bottom) for lid-driven cavity flow at Re = 100: (a) 2 × 2 sub-domains; (b) 4 × 4 sub-domains. [Figure legend: Present (H = 1/32, h = 1/64); Present (H = 1/64, h = 1/128); Classical FEM (h = 1/128); Ghia et al. (h = 1/128).]

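Setting

\[ M_h^{\Gamma_j}(\Omega_j) = \{ q \in M_h(\Omega_j) : q|_{\Gamma_j} = 0 \}, \quad \Gamma_j = \partial\Omega_j \setminus \partial\Omega, \tag{3.3} \]

our new algorithm with Newton iteration for the nonlinear correction problems reads:

Algorithm 2. New parallel finite element algorithm.

1. Find a global coarse grid solution $(u_H, p_H) \in X_H^0(\Omega) \times M_H^0(\Omega)$ such that

\[ a(u_H,v) + b(u_H,u_H,v) - d(v,p_H) = (f,v), \quad \forall v \in X_H^0(\Omega), \]
\[ d(u_H,q) = 0, \quad \forall q \in M_H^0(\Omega). \]

2. Find fine grid corrections $(e_{h,j}, \eta_{h,j}) \in X_h^0(\Omega_j) \times M_h^{\Gamma_j}(\Omega_j)$ ($j = 1, 2, \ldots, m$) in parallel by the following iterative procedure:

\[ a(e_{h,j}^n,v) + b(e_{h,j}^n,e_{h,j}^{n-1},v) + b(e_{h,j}^{n-1},e_{h,j}^n,v) - d(v,\eta_{h,j}^n) = b(e_{h,j}^{n-1},e_{h,j}^{n-1},v) + (R_j,v), \quad \forall v \in X_h^0(\Omega_j), \]
\[ d(e_{h,j}^n,q) = -d(u_H,q), \quad \forall q \in M_h^{\Gamma_j}(\Omega_j), \tag{3.4} \]

for $n = 1, 2, \ldots$, where the initial guess is $e_{h,j}^0 = 0$ for $j = 1, 2, \ldots, m$.

3. Set $(u^h, p^h) = (u_H, p_H) + (e_{h,j}, \eta_{h,j})$ in $D_j$ ($j = 1, 2, \ldots, m$).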

Remark 3.2. In our new algorithm, we impose a zero restriction on the pressure on the artificial boundaries of the sub-domains in the local correction problems. Similar boundary conditions were used in [47–49] for the incompressible Stokes and Navier–Stokes equations, respectively. Such a restriction does not lead to singular problems because the zero Dirichlet boundary condition for the pressure enforces a unique pressure solution.

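Remark 3.3. Step 2 of the above new algorithm is the Newton iterative method applied to the following local residual problem:

\[ a(e_{h,j},v) + b(e_{h,j},e_{h,j},v) - d(v,\eta_{h,j}) = (R_j,v), \quad \forall v \in X_h^0(\Omega_j), \]
\[ d(e_{h,j},q) = -d(u_H,q), \quad \forall q \in M_h^{\Gamma_j}(\Omega_j). \tag{3.5} \]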

[Figure: u1-velocity profiles (versus y) and u2-velocity profiles (versus x) at Re = 1000: (a) 2 × 2 sub-domains; (b) 4 × 4 sub-domains. Legend: Present (H = 1/32, h = 1/64); Present (H = 1/64, h = 1/128).]

We can also employ other linearization methods to solve the nonlinear correction problem (3.5). For example, the Picard iterative method (see, e.g., [45,46]) applied to problem (3.5) reads:

\[ a(e_{h,j}^n,v) + b(e_{h,j}^{n-1},e_{h,j}^n,v) - d(v,\eta_{h,j}^n) = (R_j,v), \quad \forall v \in X_h^0(\Omega_j), \]
\[ d(e_{h,j}^n,q) = -d(u_H,q), \quad \forall q \in M_h^{\Gamma_j}(\Omega_j), \tag{3.6} \]

for $n = 1, 2, \ldots$.
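The structural difference between the Newton and Picard linearizations can be seen on a scalar analogue. The sketch below is a toy, not the finite element problem: it replaces the correction problem by the single equation nu*e + e*e = r, so that the Newton step reads nu*e_n + 2*e_{n-1}*e_n = e_{n-1}^2 + r and the Picard step reads nu*e_n + e_{n-1}*e_n = r. Both start from a zero initial guess and stop on the relative change of successive iterates. All names and parameter values are illustrative assumptions.

```python
import math


def newton_scalar(nu: float, r: float, tol: float = 1e-6, max_iter: int = 100):
    """Newton iteration for nu*e + e*e = r, starting from e = 0.

    Returns (approximate root, number of iterations).
    """
    e_old = 0.0
    for n in range(1, max_iter + 1):
        # Linearized step: nu*e_new + 2*e_old*e_new = e_old**2 + r
        e_new = (e_old * e_old + r) / (nu + 2.0 * e_old)
        if abs(e_new - e_old) < tol * abs(e_new):
            return e_new, n
        e_old = e_new
    raise RuntimeError("Newton iteration did not converge")


def picard_scalar(nu: float, r: float, tol: float = 1e-6, max_iter: int = 100):
    """Picard iteration for nu*e + e*e = r, starting from e = 0.

    Returns (approximate root, number of iterations).
    """
    e_old = 0.0
    for n in range(1, max_iter + 1):
        # Lagged step: nu*e_new + e_old*e_new = r
        e_new = r / (nu + e_old)
        if abs(e_new - e_old) < tol * abs(e_new):
            return e_new, n
        e_old = e_new
    raise RuntimeError("Picard iteration did not converge")


if __name__ == "__main__":
    nu, r = 1.0, 0.1  # toy values with nu > 0 and r > 0
    exact = (-nu + math.sqrt(nu * nu + 4.0 * r)) / 2.0
    print("exact ", exact)
    print("newton", newton_scalar(nu, r))
    print("picard", picard_scalar(nu, r))
```

In this toy setting Newton needs fewer iterations than Picard, while each Picard step keeps a simpler linear operator; the same trade-off is commonly cited for the full problem, although the toy says nothing about the cost of assembling and solving the actual linear systems.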

Remark 3.4. As one of the referees pointed out, the corrections in the velocity and pressure fields can be viewed as approximations of the discretization errors between the solutions computed on the two different meshes (see (3.5) and (2.11), respectively). This is, in a way, related to the residual-type methods for a posteriori error estimation in finite element analysis (cf. [50–52]). We refer, for example, to [53–55] for such residual-type a posteriori error estimations for the steady Navier–Stokes equations, and to [56–59] for the unsteady Navier–Stokes equations. However, the main philosophy behind the present paper is that phenomena of different scales should be treated by different tools [11], which is different from that of a posteriori error estimation.

Remark 3.5. The approximation $(u^h, p^h)$ obtained from our algorithm is piecewise defined and is in general discontinuous. In the case $\bar{D}_i \cap \bar{D}_j \neq \emptyset$ ($i \neq j$), on the interface we can simply take the average of the two sub-domain solutions as the solution (this strategy was used in our numerical experiments). To obtain a globally continuous approximation, one can use an additional local fine grid problem to smooth the solution $(u^h, p^h)$, as done in [11].

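For $j = 1, 2, \ldots, m$, defining

\[ \|R_j\|_{-1,\Omega_j} = \sup_{v \in H_0^1(\Omega_j)^d,\ v \neq 0} \frac{|(R_j,v)_{\Omega_j}|}{\|\nabla v\|_{0,\Omega_j}}, \tag{3.7} \]

\[ N_j = \sup_{u,v,w \in H_0^1(\Omega_j)^d,\ u,v,w \neq 0} \frac{|b(u,v,w)|}{\|\nabla u\|_{0,\Omega_j}\|\nabla v\|_{0,\Omega_j}\|\nabla w\|_{0,\Omega_j}}, \tag{3.8} \]

we have the following error estimate for our new parallel algorithm.

Theorem 3.2. Suppose that the conditions of Theorem 3.1 are valid and the following stability conditions hold:

\[ \frac{25 N_j}{3\nu^2}\|R_j\|_{-1,\Omega_j} < 1, \quad j = 1, 2, \ldots, m. \tag{3.9} \]

Then the approximate solution $(u^h, p^h)$ obtained from Algorithm 2 has the following error estimate:

\[ |||\nabla(u-u^h)|||_{0,\Omega} + |||p-p^h|||_{0,\Omega} \le c\left(h^s + H^{s+1}\right)\left(\|u\|_{s+1,\Omega} + \|p\|_{s,\Omega}\right), \quad 1 \le s \le t. \]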

Proof. From Lemmas 4.2 and 5.2 in [46], we obtain that, under the stability condition (3.9), the iterative procedure (3.4) is stable and convergent for all $j = 1, 2, \ldots, m$. The proof is then completed by an argument similar to that used in the proofs of Theorem 4.2 in [9] and Theorem 3.2 in [37]. □

[Figure: u1-velocity profiles (versus y) and u2-velocity profiles (versus x) at Re = 5000: (a) 2 × 2 sub-domains; (b) 4 × 4 sub-domains. Legend: Present (H = 1/32, h = 1/64); Present (H = 1/64, h = 1/128).]

Remark 3.6. A fully nonlinear problem on the coarse grid needs to be solved in both Algorithms 1 and 2. We usually solve this nonlinear Navier–Stokes problem using either the Newton method or the Picard method (see, e.g., [45,46]). From the definitions of $\|f\|_{-1,\Omega}$, $\|R_j\|_{-1,\Omega_j}$, $N$ and $N_j$ ($j = 1, 2, \ldots, m$) (see (2.3), (3.7), (2.4) and (3.8), respectively), we see that when the Newton iterative method (which needs the stability condition $25N\|f\|_{-1,\Omega}/(3\nu^2) < 1$; see [46]) is employed to solve the coarse grid problem, the stability conditions (3.9) are apparently valid. Therefore, no stricter conditions than those of Algorithm 1 are required for our new Algorithm 2. Throughout this paper, we assume that the nonlinear problems are uniquely solvable by the above-mentioned iterative methods and that the corresponding conditions for these methods hold.

Comparing Algorithm 2 with Algorithm 1, we can see that the difference between the two algorithms lies in Step 2. First, unlike Algorithm 1, where the correction problems are linear, the local correction problems in our new algorithm are nonlinear. Second, our new algorithm applies a homogeneous boundary condition for the pressure on the artificial boundary $\partial\Omega_j \setminus \partial\Omega$ of the sub-domains $\Omega_j$ ($j = 1, 2, \ldots, m$) in the nonlinear correction problems. This boundary condition ensures that, in $D_j$ ($j = 1, 2, \ldots, m$), the computed results $\eta_{h,j}$ are exactly the corrections of $p_H$, and hence that the final result $p^h$ is in $L_0^2(\Omega)$ or has a small value of $I_{p^h}$.

From Algorithm 2 we can see that our new parallel algorithm is based on a global coarse grid nonlinear problem and local fine grid nonlinear problems. There is no communication between processors while the local correction problems are being solved. If we allow all processors to compute the coarse grid solution simultaneously, our algorithm only requires an existing sequential solver as the sub-problem solver, and hence allows existing sequential PDE codes to run in a parallel environment with little investment in recoding: given an existing or black-box sequential Navier–Stokes solver, our algorithm only requires applying the solver on the overlapped sub-domains and on a global coarse mesh. This is a very attractive feature of our algorithm.
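The control flow described above can be sketched as a small driver around black-box solvers. Everything in the sketch is hypothetical: `coarse_solve`, `local_correct` and `combine` stand in for calls to an existing sequential solver and are not part of any real library. For brevity the coarse problem is solved once and shared, whereas the text lets every processor compute it; threads are used only to show the structure, and a production run would more likely use separate processes or MPI ranks.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, TypeVar

Coarse = TypeVar("Coarse")
Local = TypeVar("Local")
Result = TypeVar("Result")


def run_two_grid(
    coarse_solve: Callable[[], Coarse],
    local_correct: Callable[[int, Coarse], Local],
    combine: Callable[[Coarse, Local], Result],
    subdomain_ids: Iterable[int],
) -> Dict[int, Result]:
    """Run the three steps: coarse solve, independent corrections, combination.

    coarse_solve()          -> coarse grid solution (Step 1)
    local_correct(j, c)     -> correction on overlapped sub-domain j (Step 2)
    combine(c, e)           -> coarse solution plus correction (Step 3)
    All three callables are hypothetical placeholders for a sequential solver.
    """
    coarse = coarse_solve()
    ids = list(subdomain_ids)
    # Each local correction depends only on the coarse solution, so the
    # tasks can run concurrently without exchanging data with each other.
    with ThreadPoolExecutor(max_workers=len(ids) or 1) as pool:
        corrections = list(pool.map(lambda j: local_correct(j, coarse), ids))
    return {j: combine(coarse, e) for j, e in zip(ids, corrections)}


if __name__ == "__main__":
    # Toy stand-ins: numbers instead of finite element functions.
    result = run_two_grid(
        coarse_solve=lambda: 10,
        local_correct=lambda j, coarse: j,
        combine=lambda coarse, correction: coarse + correction,
        subdomain_ids=[1, 2, 3, 4],
    )
    print(result)
```

The only data a local task receives is the coarse solution, which mirrors the absence of inter-processor communication during the correction step.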

4. Numerical results

In this section, we report some numerical results to demonstrate the efficiency of our new parallel algorithm. The test cases include a simple problem with known analytical solution, the lid-driven cavity flow, the backward-facing step flow, and the flow past a circular cylinder. The routine UMFPACK [60] is used to solve the linear systems arising from each nonlinear iteration. In all the numerical experiments, the second-order Taylor–Hood elements are used for the finite element discretization.

[Figure: u1-velocity profiles (versus y) and u2-velocity profiles (versus x) at Re = 7500: (a) 2 × 2 sub-domains; (b) 4 × 4 sub-domains. Legend: Present (H = 1/32, h = 1/64); Present (H = 1/64, h = 1/128).]

4.1. Analytical solution

In this test case, $\Omega$ is the unit square $[0,1] \times [0,1]$ in $\mathbb{R}^2$. We set $f$ and the boundary conditions such that the exact solution of the stationary Navier–Stokes equations is given by

\[ u_1 = \sin^2(\pi x)\sin(2\pi y), \quad u_2 = -\sin(2\pi x)\sin^2(\pi y), \quad p = \cos(\pi x). \]
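A quick numerical sanity check of this exact solution is sketched below. It verifies three properties that the formulation (2.1) requires: the velocity is divergence-free, it vanishes on the boundary of the unit square, and the pressure has zero mean. The finite-difference step, grid sizes and function names are illustrative choices, not taken from the paper.

```python
import math


def u1(x: float, y: float) -> float:
    return math.sin(math.pi * x) ** 2 * math.sin(2.0 * math.pi * y)


def u2(x: float, y: float) -> float:
    return -math.sin(2.0 * math.pi * x) * math.sin(math.pi * y) ** 2


def p(x: float, y: float) -> float:
    return math.cos(math.pi * x)


def divergence(x: float, y: float, d: float = 1e-5) -> float:
    """Central-difference approximation of du1/dx + du2/dy."""
    du1_dx = (u1(x + d, y) - u1(x - d, y)) / (2.0 * d)
    du2_dy = (u2(x, y + d) - u2(x, y - d)) / (2.0 * d)
    return du1_dx + du2_dy


def max_abs_divergence(n: int = 20) -> float:
    """Largest |div u| over the interior points of an n-by-n grid."""
    pts = [i / n for i in range(1, n)]
    return max(abs(divergence(x, y)) for x in pts for y in pts)


def max_boundary_velocity(n: int = 20) -> float:
    """Largest velocity component magnitude on the boundary of the square."""
    pts = [i / n for i in range(n + 1)]
    boundary = [(0.0, t) for t in pts] + [(1.0, t) for t in pts]
    boundary += [(t, 0.0) for t in pts] + [(t, 1.0) for t in pts]
    return max(max(abs(u1(x, y)), abs(u2(x, y))) for x, y in boundary)


def mean_pressure(n: int = 200) -> float:
    """Midpoint-rule mean of p over the unit square (p depends on x only)."""
    return sum(p((i + 0.5) / n, 0.0) for i in range(n)) / n


if __name__ == "__main__":
    print("max |div u|     :", max_abs_divergence())
    print("max |u| on bdry :", max_boundary_velocity())
    print("mean pressure   :", mean_pressure())
```

The minus sign in u2 is what makes the two partial derivatives cancel; without it the field would not satisfy the incompressibility constraint.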

The mesh consists of triangular elements which are obtained by dividing $\Omega$ (or $\Omega_j$, $j = 1, 2, \ldots, m$) into sub-squares of equal size and then drawing the diagonal in each sub-square; see Fig. 1 (left). We divide $\Omega = [0,1] \times [0,1]$ into four disjoint sub-domains

\[ D_1 = (0,\tfrac12) \times (0,\tfrac12), \quad D_2 = (0,\tfrac12) \times (\tfrac12,1), \quad D_3 = (\tfrac12,1) \times (0,\tfrac12), \quad D_4 = (\tfrac12,1) \times (\tfrac12,1), \]

and then extend each sub-domain $D_j$ ($j = 1, 2, 3, 4$) outward by an extra layer of size $h$ to obtain $\Omega_j$ ($j = 1, 2, 3, 4$); see Fig. 1 (right). These $\Omega_j$ form an overlapping decomposition of $\Omega$.
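The construction of the overlapped sub-domains can be written down directly. The following sketch builds the four boxes D1 to D4 and extends each by a layer of width h, clipping at the boundary of the unit square so that the extension only happens across artificial boundaries. The `Box` class and function names are hypothetical helpers introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle (x0, x1) x (y0, y1)."""
    x0: float
    x1: float
    y0: float
    y1: float

    def extend(self, layer: float, domain: "Box") -> "Box":
        """Grow the box by `layer` on every side, clipped to `domain`."""
        return Box(
            max(domain.x0, self.x0 - layer),
            min(domain.x1, self.x1 + layer),
            max(domain.y0, self.y0 - layer),
            min(domain.y1, self.y1 + layer),
        )

    def contains(self, other: "Box") -> bool:
        return (self.x0 <= other.x0 and other.x1 <= self.x1
                and self.y0 <= other.y0 and other.y1 <= self.y1)


def decompose_unit_square(h: float) -> Tuple[List[Box], List[Box]]:
    """Return the disjoint boxes D1..D4 and their overlapped extensions."""
    domain = Box(0.0, 1.0, 0.0, 1.0)
    cores = [
        Box(0.0, 0.5, 0.0, 0.5),  # D1
        Box(0.0, 0.5, 0.5, 1.0),  # D2
        Box(0.5, 1.0, 0.0, 0.5),  # D3
        Box(0.5, 1.0, 0.5, 1.0),  # D4
    ]
    overlapped = [core.extend(h, domain) for core in cores]
    return cores, overlapped


if __name__ == "__main__":
    cores, overlapped = decompose_unit_square(1 / 64)
    for j, (core, big) in enumerate(zip(cores, overlapped), start=1):
        print(f"D{j} = {core}  ->  Omega{j} = {big}")
```

For h = 1/64 the first overlapped sub-domain becomes (0, 0.515625) × (0, 0.515625): it grows only towards the interior of the square, because the other two sides already lie on the physical boundary.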

We compute the finite element solutions on the sub-domains $\Omega_j$ ($j = 1, 2, 3, 4$) independently by using Algorithms 1 and 2, respectively. The coarse grid nonlinear problem is solved by the Newton iterative method, and convergence is declared when the relative $L^2$-error of successive velocity iterates is within a fixed tolerance of $10^{-6}$, i.e., when the following condition is satisfied:

\[ \frac{\|u_H^{n+1} - u_H^n\|_{0,\Omega}}{\|u_H^{n+1}\|_{0,\Omega}} < 10^{-6}, \tag{4.1} \]

where $u_H^{n+1}$ is the $(n+1)$-th iterative solution. In our new Algorithm 2, the stopping criterion for the local nonlinear correction problems on $\Omega_j$ ($j = 1, 2, \ldots, m$) is

\[ \frac{\|e_{h,j}^{n+1} - e_{h,j}^n\|_{0,\Omega_j}}{\|e_{h,j}^{n+1}\|_{0,\Omega_j}} < 10^{-6}. \tag{4.2} \]

[Figure: $u_1$-velocity versus $y$ and $u_2$-velocity versus $x$ at Re = 10 000, comparing Present ($H = 1/32$, $h = 1/64$), Present ($H = 1/64$, $h = 1/128$) and Ghia et al. ($h = 1/256$): (a) 2 × 2 sub-domains; (b) 4 × 4 sub-domains.]

Fig. 9. Computed streamlines (top) and isobars (bottom) for lid-driven cavity flow at Re = 1000: (a) 2 × 2 sub-domains; (b) 4 × 4 sub-domains.

Fig. 10. Computed streamlines (top) and isobars (bottom) for lid-driven cavity flow at Re = 5000: (a) 2 × 2 sub-domains; (b) 4 × 4 sub-domains.

We set $\nu = 0.1$ and compute the finite element solutions with fine meshes of size $h = n^{-3}$ ($n = 3,4,5$) and corresponding coarse meshes of size $H$ satisfying $2H^3 \approx h^2$. The numerical results are listed in Tables 1 and 2, respectively, where the CPU time is the maximum CPU time taken by the algorithms over the four overlapped sub-domains, which includes the mesh generation time, the time spent on solving problems on both the coarse and fine grids, and the error computing time. itC stands for the nonlinear iteration count satisfying the stopping criterion (4.1) for the coarse grid problem, while itF is the maximum of the iteration counts satisfying the stopping criterion (4.2) for the fine grid local nonlinear correction problems for our new algorithm. $I_{p_h}$ is defined by (3.2). The convergence rates with respect to the mesh parameter $h$ are computed by the formula $\log(E_i/E_{i+1})/\log(h_i/h_{i+1})$, where $E_i$ and $E_{i+1}$ are the relative errors $\big(|||\nabla(u-u_h)|||_{0,\Omega} + |||p-p_h|||_{0,\Omega}\big) / \big(\|\nabla u\|_{0,\Omega} + \|p\|_{0,\Omega}\big)$ corresponding to the fine meshes of sizes $h_i$ and $h_{i+1}$, respectively.
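The convergence-rate formula can be evaluated with a few lines of code. The sketch below uses synthetic errors of the form $E = c\,h^2$ purely for illustration (they are not the values in Tables 1 and 2); the function name is an assumption.

```python
import math


def convergence_rates(h, err):
    """Observed orders log(E_i / E_{i+1}) / log(h_i / h_{i+1})."""
    return [math.log(err[i] / err[i + 1]) / math.log(h[i] / h[i + 1])
            for i in range(len(h) - 1)]


if __name__ == "__main__":
    h = [n ** -3 for n in (3, 4, 5)]               # fine mesh sizes h = n^{-3}
    synthetic_err = [0.7 * hi ** 2 for hi in h]    # made-up data, E = c * h^2
    print(convergence_rates(h, synthetic_err))
```

For errors that scale exactly like $h^2$, every computed rate equals 2 up to rounding, which is the behavior the estimate below predicts.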

According to the mixed finite element spaces we choose and the relationship between the mesh sizes $H$ and $h$, i.e., $H = O(h^{2/3})$, from Theorems 3.1 and 3.2 we have

$$|||\nabla(u-u_h)|||_{0,\Omega} + |||p-p_h|||_{0,\Omega} \approx c\,h^2.$$

The results shown in Tables 1 and 2 support the above estimate both for Algorithm 1 and our new Algorithm 2; see Fig. 2.

However, from Table 1 we can see that the pressure computed by Algorithm 1 is inaccurate. Although both the coarse grid solution $p_H$ and the local fine grid corrections $\eta_{h,j}$ ($j = 1,2,3,4$) have vanishing averages on $\Omega$ and $\Omega_j$ ($j = 1,2,3,4$), respectively, the accuracy of the pressure is very poor and the values of $I_{p_h}$ are far from zero; this is predicted by our analysis in Section 3.1. From Table 2, in contrast, we can see that with a homogeneous condition on the artificial boundaries of the sub-domains for the pressure corrections, and with several nonlinear iterations for the local correction problems, our new algorithm yields a reasonable approximate solution.

To investigate the contributions of the modification strategies (i.e., the zero restriction of pressure on the artificial boundaries and the nonlinear version of the corrections) to the improvement in the pressure approximations, we computed the finite element solutions with each strategy separately. The numerical results listed in Table 3 show that the improvement in the pressure approximations mainly results from the zero restriction of pressure on the artificial boundaries, which verifies our previous analysis in Section 3.1.

Comparing Table 1 with Table 2, we can see that our new algorithm performs much better than Algorithm 1. As for the CPU time, our new algorithm takes slightly more than Algorithm 1. However, compared to the classical finite element method, our new algorithm saves a large amount of computational time while giving comparable accuracy; see Tables 2 and 4 and Fig. 2.

4.2. Lid-driven cavity flow

For this test case, we consider the 2D lid-driven cavity flow, which is a well-known benchmark problem that has been investigated numerically by many researchers (cf. [61–63]). This problem is defined in the unit square. With zero external force, velocities are zero on all boundaries except the top one (the lid), which has the driving horizontal velocity set to unity; see Fig. 3. The Reynolds number for this problem is defined as $\mathrm{Re} = UL/\nu$, where $U$ is the velocity of the top lid and $L$ is the length of the side wall.

For the 2D lid-driven cavity flow problem, it is well documented that to ensure the convergence of the iterative method used for the nonlinear Navier–Stokes system so as to generate an approximate solution, fine enough meshes are necessary as the Reynolds number increases. For example, based on the velocity–pressure formulation of the Navier–Stokes equations, Layton et al. [64] reported that at Re = 3200, the classical finite element method combined with a continuation method failed to converge on a 31 × 31 grid mesh. Using the classical finite element method, Wang [65] was only able to compute the solution at Reynolds numbers up to Re = 5000 on an 81 × 81 uniform grid mesh. Based on the stream function–vorticity formulation of the Navier–Stokes equations, using pseudo-time derivatives and a finite difference method, Erturk et al. [62] reported that they could not get a steady solution at Re = 7500 on a 129 × 129 grid mesh, while using a finer 257 × 257 grid mesh they were able to obtain a steady solution at Reynolds numbers up to 12 500.

Fig. 11. Computed streamlines (top) and isobars (bottom) for lid-driven cavity flow at Re = 7500: (a) 2 × 2 sub-domains; (b) 4 × 4 sub-domains.

Fig. 12. Computed streamlines (top) and isobars (bottom) for lid-driven cavity flow at Re = 10 000: (a) 2 × 2 sub-domains; (b) 4 × 4 sub-domains.

Fig. 13. Schematic diagram of the backward-facing step flow.

Fig. 14. Comparison of $u_1$-velocity (left), $u_2$-velocity (middle) and pressure (right) at various downstream locations for backward-facing step flow at Re = 800.

Fig. 15. Pressure (left) and shear stress (right) profiles along upper and lower channel walls for backward-facing step flow at Re = 800.

Table 5
Comparison of the normalized (by the step height) length ($L_m$) of the main recirculation region downstream of the step, the separation location ($X_s$), the reattachment location ($X_r$) and the length $L_s = X_r - X_s$ of the second recirculation region on the upper wall for the backward-facing step flow at Re = 800.

Reference                     L_m      X_s     X_r      L_s = X_r - X_s
Gartling [66]                 12.20    9.70    20.96    11.26
Erturk [67]                   11.83    9.48    20.55    11.07
Barton [68]                   12.03    9.64    20.96    11.32
Keskar and Lyn [69]           12.19    9.71    20.96    11.25
Grigoriev and Dargush [70]    12.18    9.70    20.94    11.24
Present                       12.15    9.67    20.90    11.23

Fig. 16. Computed streamlines for backward-facing step flow at various Reynolds numbers (Re = 100, 500, 800, 1000).

In our algorithm, a nonlinear Navier–Stokes problem needs to be solved on both the coarse and fine grids. In view of the above remarks, to ensure that a coarse grid solution can be obtained at high Reynolds numbers, we combine our parallel method with the defect-correction method (cf. [64,65]), which can yield an approximate solution on a relatively coarse grid compared to the classical finite element method. The defect-correction method consists of an initial defect step followed by several correction steps. In the defect step, an artificial viscosity parameter $\alpha = O(h)$ is added to the kinematic viscosity as a stability factor, and the system is then anti-diffused in the correction steps; see, for example, [64,65] for details.
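The defect step and the anti-diffusing correction steps can be illustrated on a much simpler problem. The sketch below is a toy analogue only, not the scheme of [64,65]: it assumes a 1D linear convection-diffusion equation $-\nu u'' + b u' = f$ with homogeneous Dirichlet conditions, central differences, and a tridiagonal solver; all function names are hypothetical. Each correction step re-solves the stabilized problem with the artificial diffusion of the previous iterate moved to the right-hand side.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system (no pivoting)."""
    n = len(diag)
    c, d = [0.0] * n, [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def solve_convection_diffusion(visc, b, h, rhs):
    """Central-difference solve of -visc*u'' + b*u' = rhs, u(0) = u(1) = 0."""
    n = len(rhs)
    lower = [-visc / h**2 - b / (2 * h)] * n
    diag = [2 * visc / h**2] * n
    upper = [-visc / h**2 + b / (2 * h)] * n
    return solve_tridiagonal(lower, diag, upper, rhs)


def neg_laplacian(u, h):
    """Apply the discrete operator -u'' with homogeneous Dirichlet ends."""
    n = len(u)
    return [(2 * u[i] - (u[i - 1] if i > 0 else 0.0)
             - (u[i + 1] if i < n - 1 else 0.0)) / h**2 for i in range(n)]


def defect_correction(nu, b, h, f, alpha, n_corrections):
    """Defect step with viscosity nu + alpha, then anti-diffusing corrections."""
    u = solve_convection_diffusion(nu + alpha, b, h, f)       # defect step
    for _ in range(n_corrections):                             # correction steps
        lap = neg_laplacian(u, h)
        rhs = [fi + alpha * li for fi, li in zip(f, lap)]      # remove added diffusion
        u = solve_convection_diffusion(nu + alpha, b, h, rhs)
    return u
```

In this toy setting the fixed point of the correction loop satisfies the original (unstabilized) discrete problem, so a few corrections recover most of the accuracy lost in the defect step while every solve uses the better-conditioned stabilized operator.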

We compute the solutions by our new parallel algorithm on uniform meshes of sizes ($H = 1/32$, $h = 1/64$) and ($H = 1/64$, $h = 1/128$), respectively, and compare our computed results with those obtained by the classical finite element method and those of Ghia et al. [61], whose computations were based on the vorticity–stream function formulation of the Navier–Stokes equations and used the coupled strongly implicit multigrid method. The nonlinear problems on both the coarse and fine grids are solved by the Picard iterative method combined with the defect-correction method, where the stability factor is chosen as $\alpha = 0.05h$ and three correction steps (within the defect-correction method) are used. The corresponding stopping criterion for the nonlinear iterations is that the relative $L^2$-error of two successive velocity iterates is within a fixed tolerance of $10^{-6}$.

We compute an approximate solution at Re = 100, 1000, 5000, 7500 and 10 000 for the lid-driven cavity flow with 2 × 2 and 4 × 4 sub-domains, respectively, where the overlapped sub-domains are constructed by extending each disjoint sub-domain outward by an extra layer of size $h$. Figs. 4–8 plot the computed $u_1$ component of velocity along the vertical centerline and the $u_2$ component of velocity along the horizontal centerline, compared with those of Ghia et al. [61], where much finer 129 × 129 (for Re = 100, 1000) and 257 × 257 (for Re = 5000, 7500, 10 000) grid meshes were used, and those obtained by the classical finite element method on a uniform mesh of size $h = 1/128$. It is worth mentioning that at Re = 7500 and 10 000, the classical finite element method is not able to yield an approximate solution since the iterations for the nonlinear system do not converge. From Figs. 4–8 we can see that the accuracy of the computed solutions is comparable to that of Ghia et al. [61] and of the classical finite element solutions. As expected, the computed results on grids of sizes $H = 1/64$, $h = 1/128$ are better than those on $H = 1/32$, $h = 1/64$. Figs. 9–12 depict the numerical streamlines and isobars computed by our new algorithm with $H = 1/64$, $h = 1/128$ and $\alpha = 0.08h$.

4.3. Backward-facing step flow

In this example, we consider the 2D backward-facing step flow, which is a significant test problem for validating the robustness of a Navier–Stokes solver. The literature offers many numerical and experimental studies on 2D steady incompressible flows over a backward-facing step. Flow features are known to depend on the Reynolds number, the boundary conditions and the geometrical parameters such as the step height and the channel height.

Fig. 17. Computed isobars for backward-facing step flow at various Reynolds numbers (Re = 100, 500, 800, 1000).

Fig. 18. Normalized length ($L_m$) of the main recirculation region downstream of the step (left) and the normalized length ($L_s$) of the second recirculation region on the upper wall (right) with respect to the Reynolds number for the backward-facing step flow (curves: Present, Erturk).

The problem we consider here is defined on a long channel $[0,30] \times [-0.5,0.5]$, with no-slip conditions imposed on the top and bottom walls, as well as on the lower half of the left boundary. At the inlet boundary, a fully developed parabolic velocity profile $u_1 = 24y(0.5 - y)$ for $0 \le y \le 0.5$ is specified, which leads to a maximum inflow velocity of $u_{\max} = 1.5$ and an average inflow velocity of $u_{\mathrm{ave}} = 1.0$. The outlet boundary condition is set as $-p + \nu\,\partial u_1/\partial x = 0$. See Fig. 13 for detailed geometry and boundary condition information. The Reynolds number for this problem is defined as $\mathrm{Re} = U_{\mathrm{ave}} L/\nu$, where $U_{\mathrm{ave}} = 1$ is the average velocity at the inlet boundary and $L = 1$ is the channel height. An interesting feature of this problem is that the length of the recirculation zone downstream of the step is approximately proportional to the Reynolds number.
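The stated maximum and average inflow velocities, and the viscosity implied by a given Reynolds number, can be checked with a short script. This is a minimal sketch; the helper names are illustrative, and the average is approximated with a midpoint rule.

```python
def inlet_u1(y):
    """Parabolic inlet profile u1 = 24 y (0.5 - y) on 0 <= y <= 0.5."""
    return 24.0 * y * (0.5 - y)


def profile_stats(n=1000):
    """Maximum and midpoint-rule average of the inlet profile on [0, 0.5]."""
    dy = 0.5 / n
    values = [inlet_u1((i + 0.5) * dy) for i in range(n)]
    return max(values), sum(values) * dy / 0.5


def viscosity_for(reynolds, u_ave=1.0, length=1.0):
    """Kinematic viscosity from Re = U_ave * L / nu."""
    return u_ave * length / reynolds


if __name__ == "__main__":
    u_max, u_ave = profile_stats()
    print("u_max ~", u_max, " u_ave ~", u_ave)
    print("nu at Re = 800:", viscosity_for(800))
```

The peak of the parabola sits at $y = 0.25$, where $24 \cdot 0.25 \cdot 0.25 = 1.5$, and the mean over the inlet height is 1, consistent with the values quoted in the text.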

We decompose the flow domain into 5 × 1 disjoint sub-domains of equal size, and then extend each sub-domain outward by an extra layer of size $h$. The quasi-uniform mesh sizes are set as $H = 1/32$, $h = 1/64$. First, we compute the approximate solution at Re = 800 by our new parallel algorithm. In Fig. 14, the computed velocity and pressure across the channel at $x = 7$ and $x = 15$ are compared with those of Gartling [66]. From Fig. 14 we can see that for the horizontal velocity and pressure, our numerical results agree well with those of Gartling [66], while for the vertical velocity there is a very small difference at $x = 7$. Note that because different approaches are used to determine the approximate pressure uniquely, our computed pressure is not the same as Gartling's [66]; there is a constant difference between them. For the sake of comparison, our pressure data presented in Fig. 14 were adjusted by making the computed pressure equal to that of Gartling [66] at the lower channel wall point $(x,y) = (7.0, -0.5)$. Fig. 15 shows the computed pressure and shear stress along the upper and lower channel walls, which are also in excellent agreement with those of Gartling [66].

In Table 5, we compare the normalized (by the step height) length ($L_m$) of the main recirculation region downstream of the step, the separation location ($X_s$), the reattachment location ($X_r$) and the length $L_s = X_r - X_s$ of the second recirculation region on the upper wall obtained by our new algorithm with those in the literature [66–70]. The good agreement indicates the accuracy of our new algorithm.
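To make the agreement claim concrete, the tabulated values can be cross-checked programmatically. The sketch below transcribes the rows of Table 5, confirms that each $L_s$ equals $X_r - X_s$ up to rounding, and reports the relative deviation of the present results from Gartling [66]; the helper names are illustrative.

```python
# Values transcribed from Table 5: (L_m, X_s, X_r, L_s) at Re = 800.
TABLE_5 = {
    "Gartling [66]": (12.20, 9.70, 20.96, 11.26),
    "Erturk [67]": (11.83, 9.48, 20.55, 11.07),
    "Barton [68]": (12.03, 9.64, 20.96, 11.32),
    "Keskar and Lyn [69]": (12.19, 9.71, 20.96, 11.25),
    "Grigoriev and Dargush [70]": (12.18, 9.70, 20.94, 11.24),
    "Present": (12.15, 9.67, 20.90, 11.23),
}


def check_second_eddy_length(table, tol=0.005):
    """True if the tabulated L_s equals X_r - X_s for every row, up to rounding."""
    return all(abs((xr - xs) - ls) <= tol for _, xs, xr, ls in table.values())


def relative_deviation(table, reference="Gartling [66]", row="Present"):
    """Relative deviation of each quantity in `row` from `reference`."""
    return [abs(a - b) / abs(b) for a, b in zip(table[row], table[reference])]


if __name__ == "__main__":
    print("L_s consistent:", check_second_eddy_length(TABLE_5))
    print("deviation from Gartling:", relative_deviation(TABLE_5))
```

With these numbers, every quantity of the present method deviates from Gartling's value by less than 0.5%.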

Figs. 16 and 17 depict the computed streamlines and isobars at different Reynolds numbers, respectively, where the vertical $y$-scale is expanded to make the details visible. Fig. 16 clearly shows that the length of the main recirculation region downstream of the step increases as the Reynolds number grows. At Re = 500, a second recirculation eddy forms on the upper wall, which becomes longer as the Reynolds number increases further. Fig. 18 depicts the normalized length ($L_m$) of the main recirculation region downstream of the step and the normalized length ($L_s$) of the second recirculation region on the upper wall with respect to the Reynolds number, compared with those of Erturk [67]. Considering the different grid meshes, outflow locations and boundary conditions, the results are in good agreement.

Fig. 19. Schematic diagram of the flow past a circular cylinder.

Fig. 20. Nonoverlapping (left) and overlapping (right) domain decomposition for the flow past a circular cylinder.

Fig. 21. The coarse grid for the flow past a circular cylinder: full (left) and zoom-in (right) view.

Table 6
Comparison of the separation angle $\theta$ and wake length ($L_w$) for the flow past a circular cylinder at Re = 10, 20, 40.

Re    Reference                θ       L_w
10    Dennis and Chang [71]    29.6    0.265
      Ding et al. [72]         30.0    0.252
      Kim et al. [73]          29.5    0.281
      Present                  29.8    0.257
20    Fornberg [74]            -       0.91
      Ding et al. [72]         44.1    0.93
      Kim et al. [73]          43.7    0.91
      Present                  43.7    0.937
40    Fornberg [74]            -       2.24
      Ding et al. [72]         53.5    2.20
      Kim et al. [73]          55.1    2.187
      Present                  53.4    2.258

4.4. Flow past a circular cylinder

A circular cylinder of radius $r = 0.5$ resides in a rectangular domain $[-5,20] \times [-10,10]$, where the center of the circular cylinder is located at the origin. A uniform flow with free-stream velocity $U = 1$ coming from the left far field passes around the circular cylinder; see Fig. 19. A no-slip boundary condition is specified on the surface of the cylinder, while on the inflow boundary, on the outflow boundary and on the upper and lower wall boundaries, a potential flow velocity

$$u = (u_1, u_2) = U\left(1 - \frac{r^2}{x^2+y^2} + \frac{2r^2y^2}{(x^2+y^2)^2},\; -\frac{2r^2xy}{(x^2+y^2)^2}\right)$$

is prescribed. The Reynolds number based on the free-stream velocity $U$ (here $U = 1$) and the cylinder diameter $D$ (here $D = 1$) is defined as $\mathrm{Re} = UD/\nu$. It is well known that the stationary and symmetric flow past a circular cylinder becomes unstable for values of the Reynolds number greater than 40, in which case the flow becomes periodic and asymmetric.

We decompose the domain into six disjoint sub-domains, and then enlarge each sub-domain by extending it outward by an extra layer of size 0.5; see Fig. 20. The mesh sizes are $H = 1/2$, $h = 1/4$ with a local refinement around the cylinder; see Fig. 21 for the coarse grid, which has 5762 vertices. In Table 6, we tabulate the separation angle $\theta$ and the length of the wake behind the cylinder obtained by our new algorithm together with those in the literature [71–74]; good agreement is observed. The computed streamlines and isobars around the cylinder are also plotted in Fig. 22.

Fig. 22. Computed streamlines (left) and isobars (right) for the flow past a circular cylinder at Re = 5, 10, 20, 40 (from top to bottom).
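The potential-flow velocity prescribed on the outer boundaries of the cylinder problem can be cross-checked against the classical complex potential $w(z) = U(z + r^2/z)$. The sketch below assumes the boundary velocity has the form $u = U\big(1 - r^2/(x^2+y^2) + 2r^2y^2/(x^2+y^2)^2,\ -2r^2xy/(x^2+y^2)^2\big)$; the function names are illustrative.

```python
import math


def potential_flow(x, y, U=1.0, r=0.5):
    """Potential-flow velocity (u1, u2) around a cylinder of radius r at the origin."""
    s = x * x + y * y
    u1 = U * (1.0 - r**2 / s + 2.0 * r**2 * y**2 / s**2)
    u2 = -U * 2.0 * r**2 * x * y / s**2
    return u1, u2


def complex_velocity(x, y, U=1.0, r=0.5):
    """Same field from the complex potential w(z) = U (z + r^2 / z)."""
    z = complex(x, y)
    dw = U * (1.0 - r**2 / z**2)    # dw/dz = u1 - i*u2
    return dw.real, -dw.imag


def normal_velocity_on_cylinder(theta, U=1.0, r=0.5):
    """Radial velocity component at angle theta on the cylinder surface."""
    x, y = r * math.cos(theta), r * math.sin(theta)
    u1, u2 = potential_flow(x, y, U, r)
    return u1 * math.cos(theta) + u2 * math.sin(theta)


if __name__ == "__main__":
    print(potential_flow(-5.0, 10.0))          # close to the free stream (1, 0)
    print(normal_velocity_on_cylinder(0.7))    # close to zero
```

Far from the cylinder the field approaches the uniform stream $(U, 0)$, which is why it is a reasonable boundary condition on the outer edges of the truncated domain.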

5. Conclusions

In this work we have proposed a new parallel finite element algorithm for the stationary Navier–Stokes equations. It is based on a coarse grid nonlinear problem and local fine grid nonlinear correction problems defined on overlapped sub-domains, and hence allows existing sequential PDE codes to run in a parallel environment without extensive recoding. Numerical simulations of the lid-driven cavity flow, the backward-facing step flow and the flow past a circular cylinder demonstrated the efficiency of the proposed algorithm.

Acknowledgments

The authors thank the editor and reviewers for their valuable comments and suggestions, which led to a large improvement of the paper.

This work was supported by the National Research Foundation (NRF) Grant funded by the Korean Government (MEST) (No. 2010-0017532), the Natural Science Foundation of China (No. 11001061, 10971166), the National High Technology Research and Development Program of China (863 Program: 2009AA01A135) and the Ph.D. Research-Starting Foundation of Guizhou Normal University, China ([2010] Parallel Algorithms for Computational Fluid Dynamics Problems).

References

[1] A. Toselli, O. Widlund, Domain Decomposition Methods: Algorithms and Theory, Springer, Berlin, 2005.
[2] H. Elman, V.E. Howle, J. Shadid, et al., A taxonomy and comparison of parallel block multi-level preconditioners for the incompressible Navier–Stokes equations, J. Comput. Phys. 227 (2008) 1790–1808.
[3] S. Behara, S. Mittal, Parallel finite element computation of incompressible flows, Parallel Comput. 35 (2009) 195–212.
[4] C.A. Rivera, M. Heniche, R. Glowinski, P.A. Tanguy, Parallel finite element simulations of incompressible viscous fluid flow by domain decomposition with Lagrange multipliers, J. Comput. Phys. 229 (2010) 5123–5143.
[5] Y.Q. Shang, Y.N. He, Parallel finite element algorithms based on full domain partition for stationary Stokes equations, Appl. Math. Mech.-Engl. Ed. 31 (5) (2010) 643–650.
[6] Y.Q. Shang, Y.N. He, Parallel iterative finite element algorithms based on full domain partition for the stationary Navier–Stokes equations, Appl. Numer. Math. 60 (7) (2010) 719–737.
[7] Y.Q. Shang, A parallel two-level linearization method for incompressible flow problems, Appl. Math. Lett. 24 (2011) 364–369.
[8] Y.N. He, L.Q. Mei, Y.Q. Shang, J. Cui, Newton iterative parallel finite element algorithm for the steady Navier–Stokes equations, J. Sci. Comput. 44 (1) (2010) 92–106.
[9] Y.N. He, J.C. Xu, A.H. Zhou, Local and parallel finite element algorithms for the Navier–Stokes problem, J. Comput. Math. 24 (3) (2006) 227–238.
[10] F.Y. Ma, Y.C. Ma, W.F. Wo, Local and parallel finite element algorithms based on two-grid discretization for steady Navier–Stokes equations, Appl. Math. Mech.-Engl. Ed. 28 (1) (2007) 27–35.
[11] J.C. Xu, A.H. Zhou, Local and parallel finite element algorithms based on two-grid discretizations, Math. Comput. 69 (2000) 881–909.
[12] J.C. Xu, A.H. Zhou, Local and parallel finite element algorithms based on two-grid discretizations for nonlinear problems, Adv. Comput. Math. 14 (2001) 293–327.
[13] Y.Q. Shang, Y.N. He, Z.D. Luo, A comparison of three kinds of local and parallel finite element algorithms based on two-grid discretizations for the stationary Navier–Stokes equations, Comput. Fluids 40 (2011) 249–257.
[14] W. Layton, A two level discretization method for the Navier–Stokes equations, Comput. Math. Appl. 5 (26) (1993) 33–38.
[15] W. Layton, H.W.J. Lenferink, A multilevel mesh independence principle for the Navier–Stokes equations, SIAM J. Numer. Anal. 33 (1) (1996) 17–30.
[16] W. Layton, H.K. Lee, J. Peterson, Numerical solution of the stationary Navier–Stokes equations using a multilevel finite element method, SIAM J. Sci. Comput. 20 (1) (1998) 1–12.
[17] X.X. Dai, X.L. Cheng, A two-grid method based on Newton iteration for the Navier–Stokes equations, J. Comput. Appl. Math. 220 (2008) 566–573.
[18] Y.N. He, A.W. Wang, A simplified two-level method for the steady Navier–Stokes equations, Comput. Meth. Appl. Mech. Engrg. 197 (2008) 1568–1576.
[19] H. Abboud, V. Girault, T. Sayah, A second order accuracy for a full discretized time-dependent Navier–Stokes equations by a two-grid scheme, Numer. Math. 114 (2009) 189–231.
[20] J.C. Xu, Two-grid discretization techniques for linear and nonlinear PDEs, SIAM J. Numer. Anal. 33 (5) (1996) 1759–1777.
[21] M. Marion, R. Temam, Nonlinear Galerkin methods, SIAM J. Numer. Anal. 26 (5) (1989) 1139–1157.
[22] M. Marion, R. Temam, Nonlinear Galerkin methods: the finite element case, Numer. Math. 57 (1990) 1–22.
[23] A.A.O. Ammi, M. Marion, Nonlinear Galerkin methods and mixed finite element: two-grid algorithms for the Navier–Stokes equations, Numer. Math. 68 (1994) 189–213.
[24] Z.D. Luo, J. Zhu, A nonlinear Galerkin mixed element method and a posteriori error estimator for the stationary Navier–Stokes equations, Appl. Math. Mech.-Engl. Ed. 23 (10) (2002) 1194–1206.
[25] B.F. Smith, P.E. Bjørstad, W. Gropp, Domain Decomposition: Parallel Multilevel Methods for Elliptic Partial Differential Equations, Cambridge University Press, Cambridge, 1996.
[26] A. Quarteroni, A. Valli, Domain Decomposition Methods for Partial Differential Equations, Oxford Science Publications, London, 1999.
[27] F.N. Hwang, X.C. Cai, A parallel nonlinear additive Schwarz preconditioned inexact Newton algorithm for incompressible Navier–Stokes equations, J. Comput. Phys. 204 (2005) 666–691.
[28] F.N. Hwang, X.C. Cai, A class of parallel two-level nonlinear Schwarz preconditioned inexact Newton algorithms, Comput. Meth. Appl. Mech. Engrg. 196 (2007) 1603–1611.
[29] X.C. Cai, D.E. Keyes, L. Marcinkowski, Nonlinear additive Schwarz preconditioners and applications in computational fluid dynamics, Int. J. Numer. Meth. Fluids 40 (2002) 1463–1470.
[30] R. Adams, Sobolev Spaces, Academic Press Inc, New York, 1975.
[31] P.G. Ciarlet, The Finite Element Method for Elliptic Problems, North-Holland, Amsterdam, 1978.
[32] J.G. Heywood, R. Rannacher, Finite element approximation of the nonstationary Navier–Stokes problem I: regularity of solutions and second-order error estimates for spatial discretization, SIAM J. Numer. Anal. 19 (2) (1982) 275–311.
[33] R.B. Kellogg, J.E. Osborn, A regularity result for the Stokes problem in a convex polygon, J. Funct. Anal. 21 (1976) 397–431.
[34] R. Temam, Navier–Stokes Equations: Theory and Numerical Analysis, North-Holland, Amsterdam, 1984.
[35] V. Girault, P.A. Raviart, Finite Element Methods for Navier–Stokes Equations: Theory and Algorithms, Springer-Verlag, Berlin Heidelberg, 1986.
[36] Y.N. He, J.C. Xu, A.H. Zhou, J. Li, Local and parallel finite element algorithms for the Stokes problem, Numer. Math. 109 (3) (2008) 415–434.
[37] Y.Q. Shang, Z.D. Luo, A parallel two-level finite element method for the Navier–Stokes equations, Appl. Math. Mech.-Engl. Ed. 31 (11) (2010) 1429–1438.
[38] Y.Q. Shang, K. Wang, Local and parallel finite element algorithms based on two-grid discretizations for the transient Stokes equations, Numer. Algor. 54 (2) (2010) 195–218.
[39] D.N. Arnold, X. Liu, Local error estimates for finite element discretizations of the Stokes equations, RAIRO M2AN 29 (1995) 367–389.
[40] D.N. Arnold, F. Brezzi, M. Fortin, A stable finite element for the Stokes equations, Calcolo 21 (1984) 337–344.
[41] M. Fortin, Calcul numérique des écoulements fluides de Bingham et des fluides Newtoniens incompressible par des méthodes d'éléments finis, Doctoral Thesis, Université de Paris VI, 1972.
[42] P. Hood, C. Taylor, A numerical solution of the Navier–Stokes equations using the finite element technique, Comput. Fluids 1 (1973) 73–100.
[43] M. Crouzeix, P.-A. Raviart, Conforming and nonconforming finite element methods for solving the stationary Stokes equations, RAIRO Anal. Numer. 7 (R-3) (1973) 33–76.

[44] L. Mansfield, Finite element subspaces with optimal rates of convergence for stationary Stokes problem, RAIRO Anal. Numer. 16 (1982) 49–66.
[45] H.C. Elman, D.J. Silvester, A.J. Wathen, Finite Elements and Fast Iterative Solvers: With Applications in Incompressible Fluid Dynamics, Oxford University Press, Oxford, 2005.
[46] Y.N. He, J. Li, Convergence of three iterative methods based on finite element discretization for the stationary Navier–Stokes equations, Comput. Meth. Appl. Mech. Engrg. 198 (2009) 1351–1359.
[47] A. Klawonn, L.F. Pavarino, Overlapping Schwarz methods for mixed linear elasticity and Stokes problems, Comput. Meth. Appl. Mech. Engrg. 165 (1998) 233–245.
[48] L.F. Pavarino, Indefinite overlapping Schwarz methods for time-dependent Stokes problems, Comput. Meth. Appl. Mech. Engrg. 187 (2000) 35–51.
[49] F.N. Hwang, Some parallel linear and nonlinear Schwarz methods with applications in computational fluid dynamics, Ph.D. Dissertation, University of Colorado, 2004.
[50] M. Ainsworth, J.T. Oden, A posteriori error estimation in finite element analysis, Comput. Meth. Appl. Mech. Engrg. 142 (1997) 1–88.
[51] M. Ainsworth, J.T. Oden, A Posteriori Error Estimation in Finite Element Analysis, John Wiley & Sons, 2000.
[52] T.J. Barth, H. Deconinck, Error Estimation and Adaptive Discretization Methods in Computational Fluid Dynamics, Lecture Notes in Computer Science and Engineering, vol. 25, Springer, 2003.
[53] H. Jin, S. Prudhomme, A posteriori error estimation of steady-state finite element solutions of the Navier–Stokes equations by a subdomain residual method, Comput. Meth. Appl. Mech. Engrg. 159 (1998) 19–48.
[54] L. Machiels, J. Peraire, A.T. Patera, A posteriori finite element output bounds for the incompressible Navier–Stokes equations: application to a natural convection problem, J. Comput. Phys. 172 (2001) 401–425.
[55] M. Farhloul, S. Nicaise, L. Paquet, A priori and a posteriori error estimations for the dual mixed finite element method of the Navier–Stokes problem, Numer. Meth. Part. Diff. Eq. 25 (4) (2009) 843–869.
[56] S. Prudhomme, J.T. Oden, A posteriori error estimation and error control for finite element approximations of the time-dependent Navier–Stokes equations, Finite Elem. Anal. Des. 33 (1999) 247–262.
[57] J. Cao, Application of a posteriori error estimation to finite element simulation of incompressible Navier–Stokes flow, Comput. Fluids 34 (2005) 972–990.
[58] J. Hoffman, C. Johnson, A new approach to computational turbulence modelling, Comput. Meth. Appl. Mech. Engrg. 195 (2006) 2865–2880.
[59] S. Berrone, M. Marro, Space–time adaptive simulations for unsteady Navier–Stokes problems, Comput. Fluids 38 (2009) 1132–1144.
[60] T.A. Davis, Available at: http://www.cise.ufl.edu/research/sparse/umfpack.
[61] U. Ghia, K. Ghia, C. Shin, High-Re solutions for incompressible flow using the Navier–Stokes equations and a multigrid method, J. Comput. Phys. 48 (1982) 387–411.
[62] E. Erturk, T. Corke, C. Gokcol, Numerical solutions of 2-D steady incompressible driven cavity flow at high Reynolds numbers, Int. J. Numer. Meth. Fluids 48 (2005) 747–774.
[63] E. Erturk, Discussions on driven cavity flow, Int. J. Numer. Meth. Fluids 60 (2009) 275–294.
[64] W. Layton, H. Lee, J. Peterson, A defect-correction method for the incompressible Navier–Stokes equations, Appl. Math. Comput. 129 (2002) 1–19.
[65] K. Wang, A new defect correction method for the Navier–Stokes equations at high Reynolds numbers, Appl. Math. Comput. 11 (216) (2010) 3252–3264.
[66] D.K. Gartling, A test problem for outflow boundary conditions-flow over a backward-facing step, Int. J. Numer. Meth. Fluids 11 (1990) 953–967.
[67] E. Erturk, Numerical solution of 2D steady incompressible flow over a backward-facing step, part I: high Reynolds number solutions, Comput. Fluids 37 (2008) 633–655.
[68] I.E. Barton, The entrance effect of laminar flow over a backward-facing step geometry, Int. J. Numer. Meth. Fluids 25 (1997) 633–644.
[69] J. Keskar, D.A. Lyn, Computations of a laminar backward-facing step flow at Re = 800 with a spectral domain decomposition method, Int. J. Numer. Meth. Fluids 29 (1999) 411–427.
[70] M.M. Grigoriev, G.F. Dargush, A poly-region boundary element method for incompressible viscous fluid flows, Int. J. Numer. Meth. Eng. 46 (1999) 1127–1158.
[71] S.C.R. Dennis, G.Z. Chang, Numerical solutions for steady flow past a circular cylinder at Reynolds up to 100, J. Fluid Mech. 42 (1970) 471–489.
[72] H. Ding, C. Shu, K.S. Yeo, D. Xu, Simulation of incompressible viscous flows past a circular cylinder by hybrid FD scheme and meshless least square-based finite difference method, Comput. Meth. Appl. Mech. Engrg. 193 (2004) 727–744.
[73] Y. Kim, D.W. Kim, S. Jun, J.H. Lee, Meshfree point collocation method for the stream-vorticity formulation of 2D incompressible Navier–Stokes equations, Comput. Meth. Appl. Mech. Engrg. 196 (2007) 3095–3109.
[74] B. Fornberg, A numerical study of steady viscous flow past a circular cylinder, J. Fluid Mech. 98 (1980) 819–855.