7/29/2019 Fem Navier Stokes 1
A new parallel finite element algorithm for the stationary Navier-Stokes equations
Yueqiang Shang a,b,*, Yinnian He c, Do Wan Kim a, Xiaojun Zhou b
a Department of Mathematics, Inha University, Incheon, 402-751, Republic of Korea
b School of Mathematics and Computer Science, Guizhou Normal University, Guiyang, 550001, PR China
c Faculty of Science, Xi'an Jiaotong University, Xi'an, 710049, PR China
Article info
Article history:
Received 20 September 2010
Received in revised form 20 May 2011
Accepted 1 June 2011
Available online 2 July 2011
Keywords:
Navier-Stokes equations
Finite element
Parallel computing
Parallel algorithm
Two-grid method
Domain decomposition
Abstract
Based on two-grid discretization, a new parallel finite element algorithm for the stationary Navier-Stokes equations is proposed and analyzed. The algorithm first solves the Navier-Stokes equations on a coarse grid, and then corrects the resulting residual on a fine grid by solving local Navier-Stokes equations in parallel with homogeneous boundary conditions. Because an existing sequential Navier-Stokes solver can be applied to each sub-domain problem, the proposed parallel algorithm can be implemented on top of existing sequential software. Error bounds for the approximate solution are derived. Moreover, the efficiency of the algorithm is demonstrated by numerical simulations of the lid-driven cavity flow, the backward-facing step flow, and the flow past a circular cylinder.
© 2011 Elsevier B.V. All rights reserved.
1. Introduction
Computational fluid dynamics models are in general based on the solution of the Navier-Stokes equations and a discretization scheme for them, for instance, finite element methods or finite volume methods. To accurately capture the physical properties of the fluid flow being simulated, we usually need highly refined meshes on the entire flow domain, which can lead to a large-scale computation possibly beyond the capability of a single computer. Therefore, to utilize the computational power of modern high-performance parallel computers, much effort has been devoted to the development of efficient parallel computing methods for the Navier-Stokes equations and related flow problems (see, e.g., [1-7]).
Recently, local and parallel algorithms for the stationary Navier-Stokes equations were proposed and analyzed in [8-10], based on a new approach to local and parallel finite element computations [11,12] together with the fact that the global behavior of a solution to the Navier-Stokes equations is mostly dominated by the low frequency components, whereas the local behavior is mainly affected by the high frequency components. Such algorithms were compared numerically in [13]. The key idea of these algorithms is to use the classical finite element discretization on a coarse grid to approximate the low frequencies, and then to employ linearizations on local fine grids to correct the resulting residual of high frequencies. Theoretical analysis shows that these algorithms can yield the same order of convergence as the classical Galerkin finite element method if an appropriate ratio between the coarse mesh size and the fine mesh size is taken. However, even when the coarse grid size is suitably chosen, numerical computations for some incompressible flows showed that the finite element solutions obtained from these local and parallel algorithms are inaccurate, particularly for the pressure, when the overlap of the sub-domains is small.
The objective of this paper is to employ the local and parallel finite element computation approach of Xu and Zhou [11,12] to develop an efficient parallel finite element algorithm for the d-dimensional stationary Navier-Stokes flows (d = 2, 3). This novel algorithm is based on a coarse grid finite element solution of the global Navier-Stokes equations and fine grid solutions of local Navier-Stokes equations defined on overlapping sub-domains. Here, the nonlinear problems are solved by means of linearization methods such as Newton and Picard iterations. Since existing sequential solvers are available for the problems on sub-domains, our method can be easily implemented on top of existing sequential software.
Finite Elements in Analysis and Design 47 (2011) 1262-1279
0168-874X/$ - see front matter © 2011 Elsevier B.V. All rights reserved.
doi:10.1016/j.finel.2011.06.001
* Corresponding author. Tel.: +82 32 860 8819.
E-mail addresses: [email protected], [email protected] (Y.Q. Shang), [email protected] (Y.N. He), [email protected] (D.W. Kim), [email protected] (X.J. Zhou).
It is worth mentioning that similar two-level or multi-level methods for the Navier-Stokes equations have been proposed in [14-19] since the pioneering work of Xu [20]. The major difference between those methods and ours is that they use the coarse grid solution to linearize the nonlinear convection term on the finer grid(s), whereas our method employs a prediction-correction-type approach: the solution is first predicted on a coarse grid and then corrected by solving the residual equations on the fine grid in parallel.
Our method is also reminiscent of the nonlinear Galerkin methods (cf. [21-24]). However, there are several essential differences between the nonlinear Galerkin methods and our method. First, in the nonlinear Galerkin methods the velocity is separated into two parts, the small and large eddy components, while in our method both the velocity and the pressure are decomposed into low and high frequency components. Second, the coarse grid solution and the fine grid correction in our method are uncoupled in the computational process and are calculated sequentially, while in the nonlinear Galerkin methods such calculations are coupled together. Third, our method is a parallel computing method. One can expect that a global solver may yield a more accurate solution than our parallel solver. As mentioned before, however, the amount of storage required by the global solver often exceeds the capacity of modern computers.
The method proposed in this paper also differs from the classical two-level Schwarz methods (cf. [25,26,1]) in that the global coarse grid problem and the fine grid local problems need to be solved only once; moreover, no communication between processors is required while solving the local problems. Another distinct feature is that our method is designed as a discretization scheme, in contrast to the methods in [29,27,28], where two-level nonlinear methods were used as preconditioners. Moreover, in the present method, the coarse grid problem does not have to be coupled with the local fine grid problems.
The rest of the paper is organized as follows. In the next section, the Navier-Stokes equations and their mixed finite element approximations are introduced. In Section 3, based on two-grid finite element discretization and domain decomposition, a new parallel algorithm is designed and analyzed. Numerical results on some benchmark problems such as the lid-driven cavity flow, the backward-facing step flow and the flow past a circular cylinder are given in Section 4. Finally, conclusions are drawn in Section 5.
2. Preliminaries
Let $\Omega$ be a bounded domain with Lipschitz-continuous boundary $\partial\Omega$ in $\mathbb{R}^d$ ($d = 2, 3$) that satisfies an additional condition stated in Assumption H0 below. As usual, for a nonnegative integer $k$, we denote by $H^k(\Omega)$ the Sobolev space of functions with square integrable distributional derivatives up to order $k$ in $\Omega$, equipped with the standard norm $\|\cdot\|_{k,\Omega}$, and by $H_0^1(\Omega)$ the closed subspace of $H^1(\Omega)$ consisting of functions with zero trace on $\partial\Omega$; see, e.g., [30,31]. Throughout this paper, we use the letter $c$ (with or without subscripts) to denote a generic positive constant which is independent of the mesh parameters and may take different values at different occurrences.
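2.1. The Navier-Stokes equations
We consider the following incompressible Navier-Stokes equations:
$$-\nu\Delta u + (u\cdot\nabla)u + \nabla p = f \quad \text{in } \Omega, \tag{2.1a}$$
$$\operatorname{div} u = 0 \quad \text{in } \Omega, \tag{2.1b}$$
$$u = 0 \quad \text{on } \partial\Omega, \tag{2.1c}$$
where $u = (u_1,\dots,u_d)^T$ is the velocity, $p$ the pressure, $f = (f_1,\dots,f_d)^T$ the prescribed body force and $\nu$ the kinematic viscosity. Given a characteristic length $L$ and a characteristic velocity $U$, the Reynolds number is defined as $Re = UL/\nu$.
To introduce the variational formulation of (2.1), we set
$$X = H_0^1(\Omega)^d, \quad Y = L^2(\Omega)^d, \quad M = L_0^2(\Omega) = \Big\{ q \in L^2(\Omega) : \int_\Omega q\,dx = 0 \Big\},$$
and define $a(\cdot,\cdot)$, $b(\cdot,\cdot,\cdot)$ and $d(\cdot,\cdot)$ as
$$a(u,v) = \nu(\nabla u,\nabla v), \quad b(u,v,w) = \tfrac12\big((u\cdot\nabla)v,w\big) - \tfrac12\big((u\cdot\nabla)w,v\big),$$
$$d(v,q) = (\operatorname{div} v, q), \quad \forall\, u,v,w \in X,\ q \in M,$$
where $(\cdot,\cdot)$ is the standard inner product of $L^2(\Omega)^l$ ($l = 1,2,3$).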
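A short worked step, added here as a reader's check rather than taken from the paper: with the definition of $b$ given above, the trilinear form is skew-symmetric in its last two arguments, so the convection term drops out whenever the last two arguments coincide.

```latex
\begin{align*}
b(u,v,w) + b(u,w,v)
  &= \tfrac12\big((u\cdot\nabla)v,w\big) - \tfrac12\big((u\cdot\nabla)w,v\big)
   + \tfrac12\big((u\cdot\nabla)w,v\big) - \tfrac12\big((u\cdot\nabla)v,w\big) = 0, \\
b(u,v,v) &= 0 \qquad \forall\, u,v \in X.
\end{align*}
```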
As mentioned above, a further assumption on $\Omega$ is needed:
H0. Assume that $\Omega$ is regular in the sense that the unique solution $(u,q) \in X \times M$ of the steady Stokes problem
$$-\nu\Delta u + \nabla q = g, \quad \operatorname{div} u = 0 \quad \text{in } \Omega, \qquad u|_{\partial\Omega} = 0,$$
for a prescribed $g \in Y$ exists and satisfies
$$\|u\|_{2,\Omega} + \|q\|_{1,\Omega} \le c\,\|g\|_{0,\Omega}.$$
Assumption H0 is known to hold if $\partial\Omega$ is of class $C^2$, or if $\Omega$ is a two-dimensional convex polygon; see [32,33].
With the above notations, the variational formulation of (2.1) reads: find a pair $(u,p) \in X \times M$ such that
$$a(u,v) + b(u,u,v) - d(v,p) = (f,v), \quad \forall\, v \in X, \tag{2.2a}$$
$$d(u,q) = 0, \quad \forall\, q \in M. \tag{2.2b}$$
Fig. 1. Triangulation (left) and decomposition (right) of the solution domain. [The decomposition panel labels the sub-domains D1, D2, D3 and D4.]
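Defining
$$N := \sup_{u,v,w \in X,\ u,v,w \ne 0} \frac{|b(u,v,w)|}{\|\nabla u\|_{0,\Omega}\,\|\nabla v\|_{0,\Omega}\,\|\nabla w\|_{0,\Omega}},$$
we have the following existence and uniqueness results (cf. [34,35]).
Lemma 2.1. Given $f \in X'$ (the dual space of $X$), there exists at least one solution pair $(u,p) \in X \times M$ satisfying (2.2) and
$$\|\nabla u\|_{0,\Omega} \le \nu^{-1}\|f\|_{-1,\Omega}, \qquad \|f\|_{-1,\Omega} = \sup_{v \in X,\ v \ne 0} \frac{|(f,v)|}{\|\nabla v\|_{0,\Omega}}. \tag{2.3}$$
Moreover, if $\nu$ and $f$ satisfy the uniqueness condition
$$1 - \frac{N\|f\|_{-1,\Omega}}{\nu^2} > 0, \tag{2.4}$$
then the solution pair $(u,p)$ of problem (2.2) is unique.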
2.2. Mixed finite element spaces
To describe the mixed finite element approximations of problem (2.2), let $T^h(\Omega) = \{K\}$ be a shape-regular triangulation (see, e.g., [35,31]) of $\Omega$ into triangles or quadrilaterals (if $d = 2$), or tetrahedra or hexahedra (if $d = 3$), with mesh-size function $h(x)$ whose value is the diameter $h_K$ of the element $K$ containing $x$, satisfying the following assumption:
A0. Triangulation. There exists $\gamma \ge 1$ such that
$$h_\Omega^\gamma \le c\,h(x), \quad \forall\, x \in \Omega, \tag{2.5}$$
where $h_\Omega = \max_{x\in\Omega} h(x)$ is the largest mesh size of $T^h(\Omega)$. Sometimes, we shall write $h$ instead of $h_\Omega$ for the mesh size on a domain that is clear from the context.
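As a small illustration of Assumption A0, the following Python sketch computes, for a list of element diameters and a chosen exponent, the smallest constant for which the inequality in (2.5) holds on that mesh. The function name `a0_constant` and the two sample meshes are hypothetical and serve only as an example; they are not taken from the paper.

```python
def a0_constant(diameters, gamma=1.0):
    """Smallest c with h_Omega**gamma <= c * h_K for every element diameter h_K."""
    if gamma < 1.0:
        raise ValueError("Assumption A0 requires gamma >= 1")
    h_max = max(diameters)  # h_Omega, the largest mesh size
    h_min = min(diameters)  # the smallest element is the most restrictive one
    return h_max ** gamma / h_min


# Hypothetical meshes: a uniform one and one refined towards a corner.
uniform = [0.1] * 8
graded = [0.1, 0.05, 0.025, 0.0125]

c_uniform = a0_constant(uniform)            # uniform mesh with gamma = 1
c_graded = a0_constant(graded)              # graded mesh needs a larger constant
c_graded_gamma2 = a0_constant(graded, 2.0)  # a larger gamma tolerates stronger grading
```

A bounded constant as the mesh is refined is what the assumption asks for; on strongly graded meshes this typically requires choosing the exponent larger than one.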
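Let $X_h(\Omega) \subset H^1(\Omega)^d$ and $M_h(\Omega) \subset L^2(\Omega)$ be two finite element subspaces associated with the mesh $T^h(\Omega)$, and
$$X_h^0(\Omega) = X_h(\Omega) \cap H_0^1(\Omega)^d, \quad M_h^0(\Omega) = M_h(\Omega) \cap L_0^2(\Omega).$$
Given a sub-domain $G \subset \Omega$, we define $X_h(G)$, $M_h(G)$ and $T^h(G)$ to be the restrictions of $X_h(\Omega)$, $M_h(\Omega)$ and $T^h(\Omega)$ to $G$, respectively, and set
$$X_0^h(G) = \{ v \in X_h(\Omega) : \operatorname{supp} v \subset\subset G \}, \quad M_0^h(G) = \{ q \in M_h(\Omega) : \operatorname{supp} q \subset\subset G \}.$$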
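We shall not restrict our attention to any specific mixed finite element space; rather, we shall study a class of mixed finite element spaces satisfying the following assumptions (cf. [11,36-38]).
A1. Approximation. For each $(u,p) \in H^{t+1}(G)^d \times H^t(G)$ ($t \ge 1$), there exists an approximation $(\pi_h u, \rho_h p) \in X_h(G) \times M_h(G)$ such that
$$\|h^{-1}(u - \pi_h u)\|_{0,G} + \|u - \pi_h u\|_{1,G} \le c\,h_G^s\,\|u\|_{1+s,G}, \quad 0 \le s \le t, \tag{2.6}$$
$$\|h^{-1}(p - \rho_h p)\|_{-1,G} + \|p - \rho_h p\|_{0,G} \le c\,h_G^s\,\|p\|_{s,G}, \quad 0 \le s \le t. \tag{2.7}$$
A2. Inverse estimate. For any $(v,q) \in X_h(G) \times M_h(G)$, there holds
$$\|v\|_{1,G} \le c\,\|h^{-1}v\|_{0,G}, \quad \|q\|_{0,G} \le c\,\|h^{-1}q\|_{-1,G}. \tag{2.8}$$
A3. Superapproximation. For $G \subset \Omega$, let $\omega \in C_0^\infty(\Omega)$ with $\operatorname{supp}\omega \subset\subset G$. Then for any $(u,p) \in X_h(G) \times M_h(G)$, there is $(v,q) \in X_0^h(G) \times M_0^h(G)$ such that
$$\|h^{-1}(\omega u - v)\|_{1,G} \le c\,\|u\|_{1,G}, \quad \|h^{-1}(\omega p - q)\|_{0,G} \le c\,\|p\|_{0,G}. \tag{2.9}$$
A4. Inf-sup condition. There exists a constant $\beta > 0$ such that
$$\beta\,\|q\|_{0,G} \le \sup_{v \in X_h^0(G),\ v \ne 0} \frac{(\operatorname{div} v, q)}{\|\nabla v\|_{0,G}}, \quad \forall\, q \in M_h^0(G). \tag{2.10}$$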
We refer to [39] for some examples satisfying Assumptions A1-A4. For instance, the MINI finite elements [40] and the P2-P0 finite elements [41] satisfy Assumptions A1-A4 with $t = 1$, while the Taylor-Hood elements [42] and the augmented P2-P1 elements [43,44] satisfy Assumptions A1-A4 with $t = 2$.
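The mixed finite element approximation of problem (2.2) reads: find a pair $(u_h,p_h) \in X_h^0(\Omega) \times M_h^0(\Omega)$ such that
$$a(u_h,v) + b(u_h,u_h,v) - d(v,p_h) = (f,v), \quad \forall\, v \in X_h^0(\Omega), \tag{2.11a}$$
$$d(u_h,q) = 0, \quad \forall\, q \in M_h^0(\Omega). \tag{2.11b}$$
The following results on $(u_h,p_h)$ are classical (cf. [34,35]).
Lemma 2.2. Under Assumptions A0, A1 and A4, there exists a small $h_0 > 0$ such that for all $h \in (0,h_0]$, problem (2.11) admits a unique solution $(u_h,p_h)$. Moreover, if $(u,p) \in (H^{t+1}(\Omega) \cap H_0^1(\Omega))^d \times (H^t(\Omega) \cap L_0^2(\Omega))$, then the following error estimate holds:
$$\|u - u_h\|_{1,\Omega} + \|p - p_h\|_{0,\Omega} \le c\,h^s\,(\|u\|_{s+1,\Omega} + \|p\|_{s,\Omega}), \quad 1 \le s \le t. \tag{2.12}$$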
3. Parallel finite element algorithms
In this section, we first recall a parallel algorithm based on local finite element computations proposed in [9] for the steady Navier-Stokes equations, then analyze how it can be improved, and finally introduce our new parallel finite element algorithm based on two-grid discretization.
Let us first divide $\Omega$ into a number of disjoint sub-domains $D_1,\dots,D_m$, and then enlarge each $D_j$ to obtain $\Omega_j$ such that $D_j \subset\subset \Omega_j \subset \Omega$ ($j = 1,2,\dots,m$); here $D_j \subset\subset \Omega_j \subset \Omega$ means that $\operatorname{dist}(\partial D_j \setminus \partial\Omega,\ \partial\Omega_j \setminus \partial\Omega) > 0$. These $\Omega_j$ form an overlapping decomposition of $\Omega$. Let $T^H(\Omega)$ be a shape-regular coarse grid of size $H \gg h$, $T^h(\Omega_j)$ a local shape-regular fine grid of sub-domain $\Omega_j$, and $T^h(\Omega)$ a global fine grid which coincides with the local fine grid on each sub-domain $\Omega_j$. We are interested in obtaining an approximate solution in the sub-domains $D_j$ ($j = 1,2,\dots,m$) with an accuracy comparable to that of the classical finite element solution $(u_h,p_h)$ on $T^h(\Omega)$.
3.1. A parallel linearized algorithm
The parallel finite element algorithm based on two-grid discretization proposed in [9] for the stationary Navier-Stokes equations reads:
Algorithm 1. Parallel linearized finite element algorithm.
1. Find a global coarse grid solution $(u_H,p_H) \in X_H^0(\Omega) \times M_H^0(\Omega)$ such that
$$a(u_H,v) + b(u_H,u_H,v) - d(v,p_H) = (f,v), \quad \forall\, v \in X_H^0(\Omega),$$
$$d(u_H,q) = 0, \quad \forall\, q \in M_H^0(\Omega).$$
2. Find local fine grid corrections $(e_{h,j},\eta_{h,j}) \in X_h^0(\Omega_j) \times M_h^0(\Omega_j)$ ($j = 1,2,\dots,m$) in parallel:
$$a(e_{h,j},v) + b(e_{h,j},u_H,v) + b(u_H,e_{h,j},v) - d(v,\eta_{h,j}) = (R_j,v), \quad \forall\, v \in X_h^0(\Omega_j),$$
$$d(e_{h,j},q) = -d(u_H,q), \quad \forall\, q \in M_h^0(\Omega_j).$$
3. Set $(u^h,p^h) = (u_H,p_H) + (e_{h,j},\eta_{h,j})$ in $D_j$ ($j = 1,2,\dots,m$).
Here and hereafter,
$$(R_j,v) = (f,v) - a(u_H,v) - b(u_H,u_H,v) + d(v,p_H), \quad \forall\, v \in X_h^0(\Omega_j),\ j = 1,2,\dots,m. \tag{3.1}$$
Table 1
Errors of the solutions obtained from Algorithm 1. Velocity error: $|||\nabla(u-u^h)|||_{0,\Omega}/\|\nabla u\|_{0,\Omega}$; pressure error: $|||p-p^h|||_{0,\Omega}/\|p\|_{0,\Omega}$.
h      H     CPU (s)  itC  Velocity error  Pressure error  I(p^h)     Rate
1/27   1/18  2.393    3    0.00381917      14.1379         0.163686
1/64   1/32  8.476    3    0.000725397     2.25073         0.331027   2.12883
1/125  1/50  25.236   3    0.000190847     0.339697        0.0482335  2.82248
Remark 3.1. Similar parallel linearized algorithms were also proposed for the stationary Navier-Stokes equations in [10,13]. They differ from Algorithm 1 in that they solve a different linearized problem on the fine grid; see [10,13] for details.
Defining piecewise norms
$$|||\nabla(u-u^h)|||_{0,\Omega} = \Big( \sum_{j=1}^m \|\nabla(u-u^h)\|_{0,D_j}^2 \Big)^{1/2},$$
$$|||p-p^h|||_{0,\Omega} = \Big( \sum_{j=1}^m \|p-p^h\|_{0,D_j}^2 \Big)^{1/2},$$
we have the following error estimates (see [9]).
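Theorem 3.1. Assume that $D_j \subset\subset \Omega_j \subset \Omega$ ($j = 1,2,\dots,m$), that Assumptions A0-A4 and Lemmas 2.1 and 2.2 hold, and that $(u^h,p^h)$ is obtained from Algorithm 1. Then
$$|||\nabla(u_h-u^h)|||_{0,\Omega} + |||p_h-p^h|||_{0,\Omega} \le c\,H^{s+1}(\|u\|_{s+1,\Omega} + \|p\|_{s,\Omega}), \quad 1 \le s \le t.$$
Consequently,
$$|||\nabla(u-u^h)|||_{0,\Omega} + |||p-p^h|||_{0,\Omega} \le c\,(h^s + H^{s+1})(\|u\|_{s+1,\Omega} + \|p\|_{s,\Omega}), \quad 1 \le s \le t.$$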
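The piecewise norms appearing in Theorem 3.1 are just the square root of the sum of squared local norms over the disjoint sub-domains. A minimal Python sketch of that bookkeeping, assuming the local norms on each sub-domain have already been computed by some finite element code (the function names and the sample numbers are hypothetical):

```python
import math


def piecewise_norm(local_norms):
    """Combine norms on the disjoint sub-domains D_j: sqrt(sum_j ||.||_{0,D_j}^2)."""
    return math.sqrt(sum(value * value for value in local_norms))


def relative_piecewise_error(local_errors, global_norm):
    """Piecewise error norm divided by the global norm of the exact solution."""
    return piecewise_norm(local_errors) / global_norm


# Hypothetical local error norms on two sub-domains.
combined = piecewise_norm([3.0, 4.0])
relative = relative_piecewise_error([3.0, 4.0], 10.0)
```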
Theorem 3.1 shows that if the ratio of coarse mesh size H to
fine mesh size h is suitably chosen, Algorithm 1 can yield the
same order of convergence rate as the classical Galerkin finite
element method and may provide asymptotically optimal errors
for the approximate solution.
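To see what a suitable ratio means in practice, one can balance the two terms of the bound in Theorem 3.1, that is, choose $H$ so that $H^{s+1}$ and $h^s$ are of the same order. The Python sketch below does this; the helper names and the `scale` parameter are assumptions introduced for illustration only. With $s = 2$ and `scale = 2` it reproduces the mesh pair h = 1/64, H = 1/32 that appears in the tables of this paper.

```python
def balanced_coarse_size(h, s, scale=1.0):
    """Coarse size H with (scale * H)**(s + 1) == h**s, so both error terms match in order."""
    return h ** (s / (s + 1.0)) / scale


def error_bound_terms(h, H, s):
    """The two h- and H-dependent factors of the bound c * (h**s + H**(s+1))."""
    return h ** s, H ** (s + 1)


h = 1.0 / 64.0
H = balanced_coarse_size(h, s=2, scale=2.0)   # second-order elements, (2H)^3 = h^2
fine_term, coarse_term = error_bound_terms(h, H, s=2)
```

Here the coarse term is one eighth of the fine term, so the coarse grid does not limit the asymptotic accuracy while remaining far cheaper than the fine grid.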
However, detailed analysis and numerical tests showed that there is still room to improve the above algorithm. To begin with, let us consider the approximate pressure obtained from Algorithm 1 and set
$$I(p^h) := \sum_{j=1}^m \int_{D_j} p^h\,dx. \tag{3.2}$$
From problem (2.1), it is clear that the pressure is a function of $L^2(\Omega)$ which is defined only up to an additive constant. This issue can be circumvented in one of two ways: the first is to look for a pressure with a vanishing average in $\Omega$, i.e., belonging to the space $L_0^2(\Omega)$; the second is to seek a pressure belonging to $L^2(\Omega)\backslash\mathbb{R}$. Obviously, Algorithm 1 adopts the first way to determine the pressure uniquely. From Algorithm 1, we can see that both the coarse grid approximation $p_H$ and the fine grid corrections $\eta_{h,j}$ ($j = 1,2,\dots,m$) have a vanishing average on their respective solution domains, i.e., both $\int_\Omega p_H\,dx = 0$ and $\int_{\Omega_j} \eta_{h,j}\,dx = 0$ ($j = 1,2,\dots,m$) are enforced. However, due to the overlapping of the sub-domains $\Omega_j$ ($j = 1,2,\dots,m$), Algorithm 1 cannot guarantee that the final result $p^h$ is really in $L_0^2(\Omega)$ or that $I(p^h)$ is small enough to ensure that $p^h$ is an acceptable approximation of the exact solution. In other words, Algorithm 1 cannot guarantee that, for $j = 1,2,\dots,m$, $\eta_{h,j} \in M_h^0(\Omega_j)$ at Step 2 is exactly the local correction of $p_H$ obtained at Step 1 in the subregion $D_j$; it may be the correction of another coarse grid approximation of the pressure. If this is the case, the approximate solution $p^h$ obtained from Algorithm 1 may be far from the exact solution. Consequently, the accuracy of the approximate pressure obtained from Algorithm 1 depends not only on the coarse grid size $H$ (or, equivalently, the coarse grid solution $p_H$), but also on whether the fine grid corrections $\eta_{h,j}$ ($j = 1,2,\dots,m$) at Step 2 are exactly the corrections of the coarse grid solution $p_H$ in the disjoint sub-domains.
Table 2
Errors of the solutions obtained from Algorithm 2. Velocity error: $|||\nabla(u-u^h)|||_{0,\Omega}/\|\nabla u\|_{0,\Omega}$; pressure error: $|||p-p^h|||_{0,\Omega}/\|p\|_{0,\Omega}$.
h      H     CPU (s)  itC  itF  Velocity error  Pressure error  I(p^h)        Rate
1/27   1/18  2.697    3    3    0.00381339      0.000680126     2.47622e-005
1/64   1/32  10.684   3    3    0.000720109     9.14859e-005    1.04331e-006  1.94062
1/125  1/50  29.382   3    2    0.000187746     3.08976e-005    2.03235e-007  1.99942
Fig. 2. $H^1$-error for the velocity (left) and $L^2$-error for the pressure (right). [Both panels plot log(error) against log(h) for Algorithm 1, Algorithm 2 and the classical FEM, together with an $h^2$ reference line.]
Table 3
Comparison of the two strategies. Pressure error: $|||p-p^h|||_{0,\Omega}/\|p\|_{0,\Omega}$.
             Zero-restriction of pressure    Nonlinear corrections
             on artificial boundaries
h      H     Pressure error  I(p^h)          Pressure error  I(p^h)
1/27   1/18  0.000693688     1.48964e-005    14.1271         0.093903
1/64   1/32  9.11414e-005    7.93512e-007    2.22346         0.220091
1/125  1/50  3.65099e-005    1.10631e-006    0.336198        0.0338185
Table 4
Errors of the classical finite element solutions. Velocity error: $\|\nabla(u-u_h)\|_{0,\Omega}/\|\nabla u\|_{0,\Omega}$; pressure error: $\|p-p_h\|_{0,\Omega}/\|p\|_{0,\Omega}$.
h      CPU (s)  itF  Velocity error  Pressure error  ||p_h||_{L^1(Omega)}  Rate
1/27   3.878    3    0.00402224      0.00050581      2.96937e-011
1/64   24.354   3    0.000717938     8.98303e-005    2.78859e-014          1.99677
1/125  118.539  3    0.000188292     2.35458e-005    3.49921e-010          1.99931
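The "Rate" columns of the tables can be read as observed convergence orders between two consecutive mesh sizes. The sketch below assumes the usual formula log(e1/e2)/log(h1/h2), which is an assumption about how the rates were computed; the two error values are the relative velocity errors of the classical finite element solution reported in Table 4 for h = 1/27 and h = 1/64.

```python
import math


def observed_rate(err_coarse, err_fine, h_coarse, h_fine):
    """Observed convergence order between two meshes: log(e1/e2) / log(h1/h2)."""
    return math.log(err_coarse / err_fine) / math.log(h_coarse / h_fine)


# Relative velocity errors of the classical FEM for h = 1/27 and h = 1/64 (Table 4).
rate = observed_rate(0.00402224, 0.000717938, 1.0 / 27.0, 1.0 / 64.0)
```

The value comes out close to 2, in line with the second-order convergence expected for the elements used in the experiments.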
3.2. New parallel finite element algorithm
Our new parallel finite element algorithm is motivated by the above analysis and observation. We modify only Step 2 of Algorithm 1 so as to calculate the corrections $(e_{h,j},\eta_{h,j})$ on the overlapping sub-domains $\Omega_j$ ($j = 1,2,\dots,m$) more precisely. On the one hand, unlike Algorithm 1, we confine the pressure correction $\eta_{h,j}$ to the space $L^2(\Omega_j)\backslash\mathbb{R}$ by adding a homogeneous boundary condition on the artificial boundary $\partial\Omega_j \setminus \partial\Omega$ of the sub-domains $\Omega_j$ ($j = 1,2,\dots,m$) in the fine grid local correction problems. On the other hand, we solve a fully nonlinear correction problem by an iterative method such as Newton or Picard iteration (see, e.g., [45,46]) independently on the sub-domains $\Omega_j$ ($j = 1,2,\dots,m$). Specifically, we first approximate the low frequency components of the solution to the Navier-Stokes equations using a coarse grid on the entire domain, as done in Algorithm 1, and then use a fine grid to correct the resulting residual in parallel on a collection of overlapping sub-domains, where the local problems for these fine grid corrections are fully nonlinear, with homogeneous boundary conditions for the velocity on all boundaries of the overlapping sub-domains and homogeneous conditions for the pressure only on the artificial boundaries. All of these nonlinear correction problems are solved in parallel by an iterative method such as Newton or Picard iteration.
Fig. 3. Schematic diagram of the lid-driven cavity flow. [Square cavity with side length L = 1; u1 = 1, u2 = 0 on the lid and u1 = 0, u2 = 0 on the other three walls.]
Fig. 4. Comparison of u1-velocity profiles along the vertical centerline (top) and u2-velocity profiles along the horizontal centerline (bottom) for lid-driven cavity flow at Re = 100: (a) 2 × 2 sub-domains; (b) 4 × 4 sub-domains. [Curves: Present (H=1/32, h=1/64); Present (H=1/64, h=1/128); Classical FEM (h=1/128); Ghia et al. (h=1/128).]
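Setting
$$M_h^{\Gamma_j}(\Omega_j) = \{ q \in M_h(\Omega_j) : q|_{\Gamma_j} = 0 \}, \quad \Gamma_j = \partial\Omega_j \setminus \partial\Omega, \tag{3.3}$$
our new algorithm with Newton iteration for the nonlinear correction problems reads:
Algorithm 2. New parallel finite element algorithm.
1. Find a global coarse grid solution $(u_H,p_H) \in X_H^0(\Omega) \times M_H^0(\Omega)$ such that
$$a(u_H,v) + b(u_H,u_H,v) - d(v,p_H) = (f,v), \quad \forall\, v \in X_H^0(\Omega),$$
$$d(u_H,q) = 0, \quad \forall\, q \in M_H^0(\Omega).$$
2. Find fine grid corrections $(e_{h,j},\eta_{h,j}) \in X_h^0(\Omega_j) \times M_h^{\Gamma_j}(\Omega_j)$ ($j = 1,2,\dots,m$) in parallel by the following iterative procedure:
$$a(e_{h,j}^n,v) + b(e_{h,j}^n,e_{h,j}^{n-1},v) + b(e_{h,j}^{n-1},e_{h,j}^n,v) - d(v,\eta_{h,j}^n) = b(e_{h,j}^{n-1},e_{h,j}^{n-1},v) + (R_j,v), \quad \forall\, v \in X_h^0(\Omega_j),$$
$$d(e_{h,j}^n,q) = -d(u_H,q), \quad \forall\, q \in M_h^{\Gamma_j}(\Omega_j), \tag{3.4}$$
for $n = 1,2,\dots$, where the initial guess is $e_{h,j}^0 = 0$ for $j = 1,2,\dots,m$.
3. Set $(u^h,p^h) = (u_H,p_H) + (e_{h,j},\eta_{h,j})$ in $D_j$ ($j = 1,2,\dots,m$).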
Remark 3.2. In our new algorithm, we impose a zero restriction on the pressure on the artificial boundaries of the sub-domains in the local correction problems. Similar boundary conditions were used in [47-49] for the incompressible Stokes and Navier-Stokes equations. Such a restriction does not lead to singular problems because the zero Dirichlet boundary condition for the pressure enforces a unique pressure solution.
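Remark 3.3. Step 2 of the above new algorithm is the Newton iterative method applied to the following local residual problem:
$$a(e_{h,j},v) + b(e_{h,j},e_{h,j},v) - d(v,\eta_{h,j}) = (R_j,v), \quad \forall\, v \in X_h^0(\Omega_j),$$
$$d(e_{h,j},q) = -d(u_H,q), \quad \forall\, q \in M_h^{\Gamma_j}(\Omega_j). \tag{3.5}$$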
Fig. 5. Comparison of u1-velocity profiles along the vertical centerline (top) and u2-velocity profiles along the horizontal centerline (bottom) for lid-driven cavity flow at Re = 1000: (a) 2 × 2 sub-domains; (b) 4 × 4 sub-domains. [Curves: Present (H=1/32, h=1/64); Present (H=1/64, h=1/128); Classical FEM (h=1/128); Ghia et al. (h=1/128).]
We can also employ other linearization methods to solve the nonlinear correction problem (3.5). For example, the Picard iterative method (see, e.g., [45,46]) applied to problem (3.5) reads:
$$a(e_{h,j}^n,v) + b(e_{h,j}^{n-1},e_{h,j}^n,v) - d(v,\eta_{h,j}^n) = (R_j,v), \quad \forall\, v \in X_h^0(\Omega_j),$$
$$d(e_{h,j}^n,q) = -d(u_H,q), \quad \forall\, q \in M_h^{\Gamma_j}(\Omega_j), \tag{3.6}$$
for $n = 1,2,\dots$.
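To make the difference between the Newton iteration (3.4) and the Picard iteration (3.6) concrete, the following Python sketch applies both linearizations to a scalar model equation a*e + b*e*e = R, which only mimics the structure "linear term plus quadratic term equals residual" of (3.5). The model, the function names and the coefficient values are illustrative assumptions, not the finite element systems solved in the paper; the stopping rule imitates a relative-increment criterion.

```python
def newton_step(e_prev, a, b, rhs):
    """Newton linearization: a*e + b*e*e_prev + b*e_prev*e = b*e_prev**2 + rhs."""
    return (b * e_prev ** 2 + rhs) / (a + 2.0 * b * e_prev)


def picard_step(e_prev, a, b, rhs):
    """Picard linearization: a*e + b*e_prev*e = rhs."""
    return rhs / (a + b * e_prev)


def iterate(step, a, b, rhs, tol=1e-12, max_iter=100):
    """Iterate from the zero initial guess until the relative increment drops below tol."""
    e = 0.0
    for n in range(1, max_iter + 1):
        e_new = step(e, a, b, rhs)
        if abs(e_new - e) <= tol * abs(e_new):
            return e_new, n
        e = e_new
    raise RuntimeError("iteration did not converge")


# Hypothetical coefficients chosen so that both iterations converge.
a, b, rhs = 1.0, 0.5, 0.3
e_newton, n_newton = iterate(newton_step, a, b, rhs)
e_picard, n_picard = iterate(picard_step, a, b, rhs)
```

In this toy setting both iterations reach the same root, with Newton needing fewer steps; each step, however, only requires a linear solve, which is what allows an existing linear solver to be reused.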
Remark 3.4. As one of the referees pointed out, the corrections in the velocity and pressure fields can be viewed as approximations of the discretization errors between the solutions computed on the two different meshes (see (3.5) and (2.11), respectively). This is, in a way, related to the residual-type methods for a posteriori error estimation in finite element analysis (cf. [50-52]). We refer, for example, to [53-55] for such residual-type a posteriori error estimates for the steady Navier-Stokes equations, and to [56-59] for the unsteady Navier-Stokes equations. However, the main philosophy behind the present paper is that phenomena of different scales should be treated by different tools [11], which differs from that of a posteriori error estimation.
Remark 3.5. The approximation $(u^h,p^h)$ obtained from our algorithm is defined piecewise and is in general discontinuous. In the case $\overline{D}_i \cap \overline{D}_j \ne \emptyset$ ($i \ne j$), on the interface we can simply take the average of the two sub-domain solutions as the solution (this strategy was used in our numerical experiments). To obtain a globally continuous approximation, one can use an additional local fine grid problem to smooth the solution $(u^h,p^h)$, as done in [11].
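For $j = 1,2,\dots,m$, defining
$$\|R_j\|_{-1,\Omega_j} = \sup_{v \in H_0^1(\Omega_j)^d,\ v \ne 0} \frac{|(R_j,v)_{\Omega_j}|}{\|\nabla v\|_{0,\Omega_j}}, \tag{3.7}$$
$$N_j = \sup_{u,v,w \in H_0^1(\Omega_j)^d,\ u,v,w \ne 0} \frac{|b(u,v,w)|}{\|\nabla u\|_{0,\Omega_j}\,\|\nabla v\|_{0,\Omega_j}\,\|\nabla w\|_{0,\Omega_j}}, \tag{3.8}$$
we have the following error estimate for our new parallel algorithm.
Theorem 3.2. Suppose that the conditions of Theorem 3.1 are valid and the following stability conditions hold:
$$\frac{25 N_j}{3\nu^2}\,\|R_j\|_{-1,\Omega_j} < 1, \quad j = 1,2,\dots,m. \tag{3.9}$$
Then the approximate solution $(u^h,p^h)$ obtained from Algorithm 2 satisfies the following error estimate:
$$|||\nabla(u-u^h)|||_{0,\Omega} + |||p-p^h|||_{0,\Omega} \le c\,(h^s + H^{s+1})(\|u\|_{s+1,\Omega} + \|p\|_{s,\Omega}), \quad 1 \le s \le t.$$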
Proof. From Lemmas 4.2 and 5.2 in [46], we obtain that, under the stability condition (3.9), the iterative procedure (3.4) is stable and convergent for all $j = 1,2,\dots,m$. By an argument similar to that used in the proofs of Theorem 4.2 in [9] and Theorem 3.2 in [37], we can easily finish the proof. □
Fig. 6. Comparison of u1-velocity profiles along the vertical centerline (top) and u2-velocity profiles along the horizontal centerline (bottom) for lid-driven cavity flow at Re = 5000: (a) 2 × 2 sub-domains; (b) 4 × 4 sub-domains. [Curves: Present (H=1/32, h=1/64); Present (H=1/64, h=1/128); Classical FEM (h=1/128); Ghia et al. (h=1/256).]
Remark 3.6. A fully nonlinear problem on the coarse grid needs to be solved in both Algorithms 1 and 2. We usually solve this nonlinear Navier-Stokes problem using either the Newton method or the Picard method (see, e.g., [45,46]). From the definitions of $\|f\|_{-1,\Omega}$, $\|R_j\|_{-1,\Omega_j}$, $N$ and $N_j$ ($j = 1,2,\dots,m$) (see (2.3), (3.7), (2.4) and (3.8), respectively), we see that when the Newton iterative method (which needs the stability condition $25N\|f\|_{-1,\Omega}/(3\nu^2) < 1$; see [46]) is employed to solve the coarse grid problem, the stability conditions (3.9) are apparently valid. Therefore, no stricter conditions than those of Algorithm 1 are required for our new Algorithm 2. Throughout this paper, we assume that the nonlinear problems are uniquely solvable by the above-mentioned iterative methods and that the corresponding conditions for these methods hold.
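The two smallness conditions mentioned so far, the uniqueness condition (2.4) and the Newton stability condition quoted in Remark 3.6, have the same form and differ only by the factor 25/3. The Python sketch below simply evaluates both; the function names and the sample values of N, the dual norm of f and the viscosity are hypothetical, since these constants are not reported in the paper.

```python
def uniqueness_holds(N, f_dual_norm, nu):
    """Uniqueness condition (2.4): 1 - N * ||f||_{-1} / nu**2 > 0."""
    return 1.0 - N * f_dual_norm / nu ** 2 > 0.0


def newton_stability_holds(N, f_dual_norm, nu):
    """Newton stability condition: 25 * N * ||f||_{-1} / (3 * nu**2) < 1."""
    return 25.0 * N * f_dual_norm / (3.0 * nu ** 2) < 1.0


# Hypothetical data: the same N and nu with a small and a larger forcing.
small_forcing = (uniqueness_holds(1.0, 0.001, 0.1), newton_stability_holds(1.0, 0.001, 0.1))
large_forcing = (uniqueness_holds(1.0, 0.005, 0.1), newton_stability_holds(1.0, 0.005, 0.1))
```

Because 25/3 is larger than one, the Newton condition is the more restrictive of the two: whenever it holds, the uniqueness condition holds as well, while the converse can fail as in the second example.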
Comparing Algorithm 2 with Algorithm 1, we can see that the difference between the two algorithms lies in Step 2. First, unlike Algorithm 1, where the correction problems are linear, the local correction problems in our new algorithm are nonlinear. Second, our new algorithm applies a homogeneous boundary condition for the pressure on the artificial boundary $\partial\Omega_j \setminus \partial\Omega$ of the sub-domains $\Omega_j$ ($j = 1,2,\dots,m$) in the nonlinear correction problems. This homogeneous boundary condition for the pressure on the artificial boundaries of the overlapping sub-domains ensures that, in $D_j$ ($j = 1,2,\dots,m$), the computed results $\eta_{h,j}$ ($j = 1,2,\dots,m$) are exactly the corrections of $p_H$, and hence that the final result $p^h$ is in $L_0^2(\Omega)$ or has a small value of $I(p^h)$.
From Algorithm 2 we can see that our new parallel algorithm is based on a global coarse grid nonlinear problem and local fine grid nonlinear problems. No communication between processors is needed while solving the local correction problems. If we let all processors compute the coarse grid solution simultaneously, our algorithm requires only an existing sequential solver as the sub-problem solver, and hence allows existing sequential PDE codes to run in a parallel environment with little investment in recoding: given an existing or black-box sequential Navier-Stokes solver, our algorithm only requires applying the solver on the overlapping sub-domains and on a global coarse mesh. This is a very attractive feature of our algorithm.
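The three-step structure described above (one coarse solve, independent local corrections, local assembly) can be sketched as a thin driver around a black-box solver. The Python sketch below is only a schematic under stated assumptions: `solve_coarse` and `solve_local` are hypothetical callbacks standing in for an existing sequential Navier-Stokes solver, solutions are represented by plain numbers, and a thread pool stands in for the processors of a real parallel machine. For simplicity the coarse solution is computed once and shared, rather than recomputed on every processor.

```python
from concurrent.futures import ThreadPoolExecutor


def two_grid_parallel(solve_coarse, solve_local, subdomains, max_workers=None):
    """Step 1: global coarse solve. Step 2: independent local corrections. Step 3: add them."""
    coarse = solve_coarse()

    def correct(subdomain):
        # Each local problem needs only the coarse solution, never another correction.
        return solve_local(subdomain, coarse)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        corrections = list(pool.map(correct, subdomains))
    return {sd: coarse + corr for sd, corr in zip(subdomains, corrections)}


# Toy stand-ins for the black-box solver (hypothetical, for illustration only).
def toy_coarse():
    return 10.0


def toy_local(subdomain, coarse):
    return 0.1 * subdomain


result = two_grid_parallel(toy_coarse, toy_local, [1, 2, 3, 4])
```

Since the local solves never exchange data, the same driver could be run with separate processes or MPI ranks; the thread pool here is chosen only to keep the example self-contained.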
4. Numerical results
In this section, we report some numerical results to demonstrate the efficiency of our new parallel algorithm. The test cases include a simple problem with a known analytical solution, the lid-driven cavity flow, the backward-facing step flow, and the flow past a circular cylinder. The routine UMFPACK [60] is used to solve the linear systems arising from each nonlinear iteration. In all the numerical experiments, the second-order Taylor-Hood elements are used for the finite element discretization.
Fig. 7. Comparison of u1-velocity profiles along the vertical centerline (top) and u2-velocity profiles along the horizontal centerline (bottom) for lid-driven cavity flow at Re = 7500: (a) 2 × 2 sub-domains; (b) 4 × 4 sub-domains. [Curves: Present (H=1/32, h=1/64); Present (H=1/64, h=1/128); Ghia et al. (h=1/256).]
4.1. Analytical solution
In this test case, $\Omega$ is the unit square $(0,1) \times (0,1)$ in $\mathbb{R}^2$. We set $f$ and the boundary conditions such that the exact solution of the stationary Navier-Stokes equations is given by
$$u_1 = \sin^2(\pi x)\sin(2\pi y),$$
$$u_2 = -\sin(2\pi x)\sin^2(\pi y),$$
$$p = \cos(\pi x).$$
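As a quick sanity check of this test problem, the Python sketch below verifies numerically that the velocity field u1 = sin^2(pi x) sin(2 pi y), u2 = -sin(2 pi x) sin^2(pi y) is divergence-free and vanishes on the boundary of the unit square, and that the pressure p = cos(pi x) has zero mean. The minus sign in u2 is the reading assumed here, since it is the one consistent with div u = 0; the function names, the finite-difference step and the quadrature resolution are illustrative choices.

```python
import math


def u1(x, y):
    return math.sin(math.pi * x) ** 2 * math.sin(2.0 * math.pi * y)


def u2(x, y):
    return -math.sin(2.0 * math.pi * x) * math.sin(math.pi * y) ** 2


def pressure(x, y):
    return math.cos(math.pi * x)


def divergence(x, y, step=1e-5):
    """Central-difference approximation of du1/dx + du2/dy."""
    du1_dx = (u1(x + step, y) - u1(x - step, y)) / (2.0 * step)
    du2_dy = (u2(x, y + step) - u2(x, y - step)) / (2.0 * step)
    return du1_dx + du2_dy


def mean_pressure(n=200):
    """Midpoint-rule average of the pressure over the unit square."""
    h = 1.0 / n
    total = sum(pressure((i + 0.5) * h, (j + 0.5) * h)
                for i in range(n) for j in range(n))
    return total * h * h


max_divergence = max(abs(divergence(x, y))
                     for x in (0.1, 0.3, 0.5, 0.8) for y in (0.2, 0.4, 0.7, 0.9))
```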
The mesh consists of triangular elements which are obtained by dividing $\Omega$ (or $\Omega_j$, $j = 1,2,\dots,m$) into sub-squares of equal size and then drawing the diagonal in each sub-square; see Fig. 1 (left). We divide $\Omega = (0,1) \times (0,1)$ into four disjoint sub-domains
$$D_1 = (0,\tfrac12) \times (0,\tfrac12), \quad D_2 = (0,\tfrac12) \times (\tfrac12,1),$$
$$D_3 = (\tfrac12,1) \times (0,\tfrac12), \quad D_4 = (\tfrac12,1) \times (\tfrac12,1),$$
and then extend each sub-domain $D_j$ ($j = 1,2,3,4$) outward by an extra layer of size $h$ to obtain $\Omega_j$ ($j = 1,2,3,4$); see Fig. 1 (right). These $\Omega_j$ form an overlapping decomposition of $\Omega$.
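The decomposition just described is easy to generate programmatically. The following Python sketch builds the disjoint boxes and their extensions by a layer of given width, clipped to the unit square; the function name, the tuple layout (x0, x1, y0, y1) and the generalization to n by n sub-domains are assumptions made for illustration.

```python
def overlapping_subdomains(n_per_side, overlap):
    """Split the unit square into n x n boxes D_j and extend each by `overlap`, clipped to [0,1]^2.

    Returns a list of (disjoint_box, extended_box) pairs with boxes as (x0, x1, y0, y1).
    """
    size = 1.0 / n_per_side
    boxes = []
    for i in range(n_per_side):
        for j in range(n_per_side):
            x0, x1 = i * size, (i + 1) * size
            y0, y1 = j * size, (j + 1) * size
            disjoint = (x0, x1, y0, y1)
            extended = (max(0.0, x0 - overlap), min(1.0, x1 + overlap),
                        max(0.0, y0 - overlap), min(1.0, y1 + overlap))
            boxes.append((disjoint, extended))
    return boxes


# Four sub-domains with an overlap layer equal to the fine mesh size h = 1/64.
boxes = overlapping_subdomains(2, 1.0 / 64.0)
```

With this ordering the first box corresponds to D1 and the last to D4; only the sides of each extended box that lie inside the unit square are artificial boundaries.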
We compute the finite element solutions on sub-domains $\Omega_j$ ($j = 1,2,3,4$) independently by using Algorithms 1 and 2, respectively. The coarse grid nonlinear problem is solved by the Newton iterative method, and convergence is achieved when the relative $L^2$-error of successive velocity iterates is within a fixed tolerance of $10^{-6}$, i.e., the following condition is satisfied:
\[
\frac{\|u_H^{n+1} - u_H^{n}\|_{0,\Omega}}{\|u_H^{n+1}\|_{0,\Omega}} < 10^{-6}, \tag{4.1}
\]
where $u_H^{n+1}$ is the $(n+1)$-th iterative solution. In our new Algorithm 2, the stopping criterion for the local nonlinear correction problems on $\Omega_j$ ($j = 1, 2, \ldots, m$) is
\[
\frac{\|e_{h,j}^{n+1} - e_{h,j}^{n}\|_{0,\Omega_j}}{\|e_{h,j}^{n+1}\|_{0,\Omega_j}} < 10^{-6}. \tag{4.2}
\]
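Both stopping tests compare the norm of the increment between successive iterates with the norm of the newest iterate. A minimal sketch of such a test follows, assuming a discrete Euclidean norm in place of the $L^2$ norm and a toy Newton update for component-wise square roots; neither is taken from the paper, and the names `relative_increment` and `iterate` are hypothetical.

```python
import math


def l2_norm(v):
    """Discrete Euclidean norm, standing in for the L2 norm of a FE function."""
    return math.sqrt(sum(x * x for x in v))


def relative_increment(u_new, u_old):
    """||u_new - u_old|| / ||u_new||; assumes u_new is not the zero vector."""
    diff = [a - b for a, b in zip(u_new, u_old)]
    return l2_norm(diff) / l2_norm(u_new)


def iterate(step, u0, tol=1e-6, max_iter=50):
    """Apply `step` until the relative increment drops below `tol`.

    Returns the last iterate and the number of iterations performed.
    """
    u_old = list(u0)
    for n in range(1, max_iter + 1):
        u_new = step(u_old)
        if relative_increment(u_new, u_old) < tol:
            return u_new, n
        u_old = u_new
    raise RuntimeError("nonlinear iteration did not converge")


if __name__ == "__main__":
    targets = [2.0, 3.0]
    # Toy Newton update for the decoupled equations x_i^2 = targets[i].
    newton_step = lambda u: [0.5 * (x + a / x) for x, a in zip(u, targets)]
    solution, count = iterate(newton_step, [1.0, 1.0])
    print(solution, count)
```

The iteration count returned by such a loop plays the role of the counts reported for the coarse and fine grid problems.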
We set $\nu = 0.1$ and compute the finite element solutions with fine meshes of size $h = n^{-3}$ ($n = 3,4,5$) and corresponding coarse meshes of size $H$ satisfying $H^3 \sim h^2$. The numerical results are listed in Tables 1 and 2, respectively, where the CPU time is the maximum of the CPU times taken by the algorithms over the four overlapping sub-domains, which includes the mesh generation time, the time spent on solving the problems on both the coarse and fine grids, and the error computing time. itC stands for the nonlinear iteration count satisfying the stopping criterion (4.1) for the coarse grid
[Fig. 8 plots: u1-velocity against y and u2-velocity against x at Re = 10 000. Curves: Present (H = 1/32, h = 1/64), Present (H = 1/64, h = 1/128), Ghia et al. (h = 1/256).]
Fig. 8. Comparison of u1-velocity profiles along the vertical centerline (top) and u2-velocity profiles along the horizontal centerline (bottom) for lid-driven cavity flow at Re = 10 000: (a) 2 × 2 sub-domains; (b) 4 × 4 sub-domains.
[Fig. 9 panels: streamline and isobar contour plots on the unit square, axes X and Y.]
Fig. 9. Computed streamlines (top) and isobars (bottom) for lid-driven cavity flow at Re = 1000: (a) 2 × 2 sub-domains; (b) 4 × 4 sub-domains.
[Fig. 10 panels: streamline and isobar contour plots on the unit square, axes X and Y.]
Fig. 10. Computed streamlines (top) and isobars (bottom) for lid-driven cavity flow at Re = 5000: (a) 2 × 2 sub-domains; (b) 4 × 4 sub-domains.
problem, while itF is the maximum of the iteration counts satisfying the stopping criterion (4.2) for the fine grid local nonlinear correction problems of our new algorithm. $I_{p^h}$ is defined by (3.2). The convergence rates with respect to the mesh parameter $h$ are computed by the formula $\log(E_i/E_{i+1})/\log(h_i/h_{i+1})$, where $E_i$ and $E_{i+1}$ are the relative errors $(|||\nabla(u-u^h)|||_{0,\Omega} + |||p-p^h|||_{0,\Omega})/(\|\nabla u\|_{0,\Omega} + \|p\|_{0,\Omega})$ corresponding to the fine meshes of sizes $h_i$ and $h_{i+1}$, respectively.
According to the mixed finite element spaces we choose and the relationship between the mesh sizes $H$ and $h$, i.e., $H = O(h^{2/3})$, from Theorems 3.1 and 3.2, we have
\[
|||\nabla(u-u^h)|||_{0,\Omega} + |||p-p^h|||_{0,\Omega} \approx c h^2 .
\]
The results shown in Tables 1 and 2 support the above estimate both for Algorithm 1 and our new Algorithm 2; see Fig. 2.
However, from Table 1 we can see that the pressure computed by Algorithm 1 is inaccurate. Although both the coarse grid solution $p_H$ and the local fine grid corrections $\eta_{h,j}$ ($j = 1,2,3,4$) have vanishing average on $\Omega$ and $\Omega_j$ ($j = 1,2,3,4$), respectively, the accuracy of the pressure is very poor and the values of $I_{p^h}$ are far from zero; this is predicted by our analysis in Section 3.1. From Table 2, in contrast, we can see that with a homogeneous condition on the artificial boundaries of the sub-domains for the pressure corrections and several nonlinear iterations for the local correction problems, our new algorithm yields a reasonable approximate solution.
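The convergence-rate formula $\log(E_i/E_{i+1})/\log(h_i/h_{i+1})$ used for the tables is straightforward to evaluate. The sketch below applies it to synthetic errors that decay exactly like $h^2$; the mesh sizes and error values are illustrative assumptions, not data from the paper, and `observed_rates` is a hypothetical helper name.

```python
import math


def observed_rates(mesh_sizes, errors):
    """Observed order between consecutive meshes: log(E_i/E_{i+1}) / log(h_i/h_{i+1})."""
    rates = []
    for i in range(len(mesh_sizes) - 1):
        rates.append(math.log(errors[i] / errors[i + 1])
                     / math.log(mesh_sizes[i] / mesh_sizes[i + 1]))
    return rates


if __name__ == "__main__":
    # Synthetic data: errors behaving like 0.5 * h^2 (not taken from the paper).
    hs = [1.0 / 8.0, 1.0 / 16.0, 1.0 / 32.0]
    es = [0.5 * h ** 2 for h in hs]
    print(observed_rates(hs, es))
```

For measured errors the computed rates fluctuate around the theoretical order; values close to 2 are what the estimate above predicts for the relative error in the tables.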
To investigate the contributions of the two modification strategies (i.e., the zero restriction of the pressure on the artificial boundaries and the nonlinear version of the corrections) to the improvement in the pressure approximation, we computed the finite element solutions with each strategy separately. The numerical results listed in Table 3 show that the improvement in the pressure approximation mainly results from the zero restriction of the pressure on the artificial boundaries, which confirms our analysis in Section 3.1.
Comparing Table 1 with Table 2, we can see that our new algorithm performs much better than Algorithm 1. As for the CPU time, our new algorithm takes a little more than Algorithm 1. However, compared to the classical finite element method, our new algorithm saves a large amount of computational time while achieving comparable accuracy; see Tables 2 and 4 and Fig. 2.
4.2. Lid-driven cavity flow
For this test case, we consider the 2D lid-driven cavity flow, which is a well-known benchmark problem that has been investigated numerically by many researchers (cf. [61–63]). This problem is defined in the unit square. With zero external force, the velocity is zero on all boundaries except the top one (the lid), which has the driving horizontal velocity set to unity; see Fig. 3. The Reynolds number for this problem is defined as $Re = UL/\nu$, where $U$ is the velocity of the top lid and $L$ is the length of the side wall.
For the 2D lid-driven cavity flow problem, it is well documented that to ensure the convergence of the iterative method used for the nonlinear Navier–Stokes system so as to generate an
[Fig. 11 panels: streamline and isobar contour plots on the unit square, axes X and Y.]
Fig. 11. Computed streamlines (top) and isobars (bottom) for lid-driven cavity flow at Re = 7500: (a) 2 × 2 sub-domains; (b) 4 × 4 sub-domains.
approximate solution, fine enough meshes are necessary as the Reynolds number increases. For example, based on the velocity–pressure formulation of the Navier–Stokes equations, Layton et al. [64] reported that at $Re = 3200$, the classical finite element method combined with a continuation method failed to converge on a $31 \times 31$ grid. Using the classical finite element method,
[Fig. 12 panels: streamline and isobar contour plots on the unit square, axes X and Y.]
Fig. 12. Computed streamlines (top) and isobars (bottom) for lid-driven cavity flow at Re = 10 000: (a) 2 × 2 sub-domains; (b) 4 × 4 sub-domains.
[Fig. 13 schematic: channel with corners (0, -0.5), (0, 0.5), (30, -0.5) and (30, 0.5); inlet u1 = 24y(0.5 - y), u2 = 0; walls and step face u1 = u2 = 0; outlet -p + ν ∂u1/∂x = 0, u2 = 0.]
Fig. 13. Schematic diagram of the backward-facing step flow.
[Fig. 14 plots: u1-velocity, u2-velocity and pressure against y. Curves: Present (x = 7), Present (x = 15), Gartling (x = 7), Gartling (x = 15).]
Fig. 14. Comparison of u1-velocity (left), u2-velocity (middle) and pressure (right) at various downstream locations for backward-facing step flow at Re = 800.
Wang [65] was only able to compute the solution at Reynolds numbers up to $Re = 5000$ on an $81 \times 81$ uniform grid. Based on the stream function–vorticity formulation of the Navier–Stokes equations, using pseudo-time derivatives and a finite difference method, Erturk et al. [62] reported that they could not obtain a steady solution at $Re = 7500$ on a $129 \times 129$ grid, while using a finer $257 \times 257$ grid they were able to obtain steady solutions at Reynolds numbers up to 12 500.
In our algorithm, a nonlinear Navier–Stokes problem needs to be solved on both the coarse and fine grids. In view of the above remarks, to ensure that a coarse grid solution can be obtained at high Reynolds numbers, we combine our parallel method with the defect-correction method (cf. [64,65]), which can yield an approximate solution on a relatively coarse grid compared to the classical finite element method. The defect-correction method consists of an initial defect step followed by several correction
[Fig. 15 plots: wall pressure and wall shear stress against x. Curves: Lower wall, Upper wall.]
Fig. 15. Pressure (left) and shear stress (right) profiles along upper and lower channel walls for backward-facing step flow at Re = 800.
Table 5
Comparison of the normalized (by the step height) length ($L_m$) of the main recirculation region downstream of the step, the separation location ($X_s$), the reattachment location ($X_r$) and the length $L_s = X_r - X_s$ of the second recirculation region on the upper wall for the backward-facing step flow at $Re = 800$.

Reference                     L_m     X_s    X_r     L_s = X_r - X_s
Gartling [66]                 12.20   9.70   20.96   11.26
Erturk [67]                   11.83   9.48   20.55   11.07
Barton [68]                   12.03   9.64   20.96   11.32
Keskar and Lyn [69]           12.19   9.71   20.96   11.25
Grigoriev and Dargush [70]    12.18   9.70   20.94   11.24
Present                       12.15   9.67   20.90   11.23
[Fig. 16 panels: Re = 100, Re = 500, Re = 800, Re = 1000.]
Fig. 16. Computed streamlines for backward-facing step flow at various Reynolds numbers.
steps. In the defect step, an artificial viscosity parameter $\alpha = O(h)$ is added to the kinematic viscosity as a stability factor, and the system is then anti-diffused in the correction steps; see, for example, [64,65] for details.
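To make the defect and correction steps concrete, here is a minimal sketch on a linear 1D convection-diffusion model problem, $-\nu u'' + b u' = 1$ on $(0,1)$ with homogeneous Dirichlet conditions, discretized by central differences. It is an analogue chosen for brevity, not the authors' Navier–Stokes implementation: the model problem, the parameter values and all function names are assumptions. The defect step solves the system with viscosity $\nu + \alpha$; each correction step solves the same stabilized system with $\alpha$ times the discrete diffusion operator applied to the previous iterate added to the right-hand side.

```python
def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system without pivoting (lower[0], upper[-1] unused)."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def operator(visc, b, h, n):
    """Central-difference bands of -visc * u'' + b * u' on n interior nodes."""
    lower = [-visc / h ** 2 - b / (2.0 * h)] * n
    diag = [2.0 * visc / h ** 2] * n
    upper = [-visc / h ** 2 + b / (2.0 * h)] * n
    return lower, diag, upper


def apply(op, u):
    """Multiply the tridiagonal operator `op` by the vector u."""
    lower, diag, upper = op
    n = len(u)
    out = []
    for i in range(n):
        value = diag[i] * u[i]
        if i > 0:
            value += lower[i] * u[i - 1]
        if i < n - 1:
            value += upper[i] * u[i + 1]
        out.append(value)
    return out


def defect_correction(nu, b, n, alpha_factor=0.05, corrections=3):
    """Return the defect-step solution followed by each corrected iterate."""
    h = 1.0 / (n + 1)
    alpha = alpha_factor * h               # artificial viscosity, alpha = O(h)
    f = [1.0] * n
    stabilized = operator(nu + alpha, b, h, n)
    diffusion = operator(1.0, 0.0, h, n)   # discrete -u''
    u = thomas(*stabilized, f)             # defect step
    iterates = [u]
    for _ in range(corrections):           # anti-diffusing correction steps
        ku = apply(diffusion, u)
        rhs = [fi + alpha * ki for fi, ki in zip(f, ku)]
        u = thomas(*stabilized, rhs)
        iterates.append(u)
    return iterates


def direct_solution(nu, b, n):
    """Unstabilized discrete solution, used only as a reference."""
    h = 1.0 / (n + 1)
    return thomas(*operator(nu, b, h, n), [1.0] * n)


def max_diff(u, v):
    return max(abs(a - c) for a, c in zip(u, v))


if __name__ == "__main__":
    reference = direct_solution(0.01, 1.0, 63)
    for k, u in enumerate(defect_correction(0.01, 1.0, 63)):
        print(k, max_diff(u, reference))
```

In this linear setting the corrected iterates approach the unstabilized discrete solution, so a few corrections recover most of the accuracy lost to the added diffusion while every solve uses the better-conditioned stabilized operator.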
We compute the solutions by our new parallel algorithm on uniform meshes of sizes $H = 1/32$, $h = 1/64$ and $H = 1/64$, $h = 1/128$, respectively, and compare our computed results with those obtained by the classical finite element method and those of Ghia et al. [61], whose computations were based on the vorticity–stream function formulation of the Navier–Stokes equations and used the coupled strongly implicit multigrid method. The nonlinear problems on both the coarse and fine grids are solved by the Picard iterative method combined with the defect-correction method, where the stability factor $\alpha$ is chosen as $\alpha = 0.05h$ and three correction steps (within the defect-correction method) are performed. The corresponding stopping criterion for the nonlinear iterations is that the relative $L^2$-error of two successive velocity iterates is within a fixed tolerance of $10^{-6}$.
We compute an approximate solution at $Re = 100$, 1000, 5000, 7500 and 10 000 for the lid-driven cavity flow with $2 \times 2$ and $4 \times 4$ sub-domains, respectively, where the overlapping sub-domains are constructed by extending each disjoint sub-domain outward by an extra layer of size $h$. Figs. 4–8 plot the computed $u_1$ component of velocity along the vertical centerline and the $u_2$ component of velocity along the horizontal centerline, compared with those of Ghia et al. [61], where much finer $129 \times 129$ (for $Re = 100, 1000$) and $257 \times 257$ (for $Re = 5000, 7500, 10\,000$) grids were used, and those obtained by the classical finite element method on a uniform mesh of size $h = 1/128$. It is worth mentioning that at $Re = 7500$ and 10 000, the classical finite element method is not able to yield an approximate solution since the iterations for the nonlinear system do not converge. From Figs. 4–8 we can see that the accuracy of the computed solutions is comparable to that of the solutions of Ghia et al. [61] and the classical finite element solutions. As expected, the computed results on grids of sizes $H = 1/64$, $h = 1/128$ are better than those of $H = 1/32$, $h = 1/64$. Figs. 9–12 depict the numerical streamlines and isobars computed by our new algorithm with $H = 1/64$, $h = 1/128$ and $\alpha = 0.08h$.
4.3. Backward-facing step flow
In this example, we consider the 2D backward-facing step flow, which is a significant test problem for validating the robustness of a Navier–Stokes solver. The literature offers many numerical and experimental studies on 2D steady incompressible flows over a backward-facing step. Flow features are known to depend on the Reynolds number, the boundary conditions and the geometrical parameters such as the step height and the channel height.
[Fig. 17 panels: Re = 100, Re = 500, Re = 800, Re = 1000.]
Fig. 17. Computed isobars for backward-facing step flow at various Reynolds numbers.
[Fig. 18 plots: Lm against Re (left) and Ls against Re (right). Curves: Present, Erturk.]
Fig. 18. Normalized length (Lm) of the main recirculation region downstream of the step (left) and the normalized length (Ls) of the second recirculation region on the upper wall (right) with respect to the Reynolds number for the backward-facing step flow.
The problem we consider here is defined on a long channel $[0,30] \times [-0.5,0.5]$, with no-slip conditions imposed on the top and bottom walls, as well as on the lower half of the left boundary. At the inlet boundary, a fully developed parabolic velocity profile $u_1 = 24y(0.5 - y)$ for $0 \le y \le 0.5$ is specified, which leads to a maximum inflow velocity of $u_{\max} = 1.5$ and an average inflow velocity of $u_{ave} = 1.0$. The outlet boundary condition is set as $-p + \nu\,\partial u_1/\partial x = 0$. See Fig. 13 for the detailed geometry and boundary conditions. The Reynolds number for this problem is defined as $Re = U_{ave}L/\nu$, where $U_{ave} = 1$ is the average velocity at the inlet boundary and $L = 1$ is the channel height. An interesting feature of this problem is that the length of the recirculation zone downstream of the step is approximately proportional to the Reynolds number.
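The inlet data quoted above can be checked with a few lines of arithmetic. The sketch below evaluates the parabolic profile $u_1 = 24y(0.5 - y)$ on the inlet $0 \le y \le 0.5$, approximates its mean with the midpoint rule, and converts a Reynolds number into a viscosity through $Re = U_{ave}L/\nu$; the function names and the quadrature resolution are illustrative assumptions.

```python
def inlet_u1(y):
    """Parabolic inlet profile on 0 <= y <= 0.5; zero (no-slip) elsewhere on the left side."""
    if 0.0 <= y <= 0.5:
        return 24.0 * y * (0.5 - y)
    return 0.0


def mean_inlet_velocity(n=1000):
    """Midpoint-rule average of the profile over the inlet 0 <= y <= 0.5."""
    width = 0.5 / n
    return sum(inlet_u1((i + 0.5) * width) for i in range(n)) / n


def viscosity(reynolds, u_ave=1.0, length=1.0):
    """Kinematic viscosity from Re = u_ave * length / nu."""
    return u_ave * length / reynolds


if __name__ == "__main__":
    print(inlet_u1(0.25), mean_inlet_velocity(), viscosity(800.0))
```

The peak value sits at the inlet mid-height $y = 0.25$, and $Re = 800$ with $U_{ave} = L = 1$ corresponds to $\nu = 1/800$.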
We decompose the flow domain into $5 \times 1$ disjoint sub-domains of equal size, and then extend each sub-domain outward by an extra layer of size $h$. The quasi-uniform mesh sizes are set as $H = 1/32$, $h = 1/64$. First, we compute the approximate solution at $Re = 800$ by our new parallel algorithm. In Fig. 14, the computed velocity and pressure across the channel at $x = 7$ and 15 are compared with those of Gartling [66]. From Fig. 14 we can see that for the horizontal velocity and pressure, our numerical results agree well with those of Gartling [66], while for the vertical velocity there is a very small difference at $x = 7$. Note that because of the different ways of uniquely determining the approximate pressure, our computed pressure is not the same as Gartling's [66]; the two differ by a constant. For the sake of comparison, our pressure data presented in Fig. 14 were adjusted by making the computed pressure equal to that of Gartling [66] at the lower channel wall point $(x,y) = (7.0, -0.5)$. Fig. 15 shows the computed pressure and shear stress along the upper and lower channel walls, which are also in perfect agreement with those of Gartling [66].
In Table 5, we compare the normalized (by the step height) length ($L_m$) of the main recirculation region downstream of the step, the separation location ($X_s$), the reattachment location ($X_r$) and the length $L_s = X_r - X_s$ of the second recirculation region on the upper wall obtained by our new algorithm with those in the literature [66–70]. The good agreement indicates the accuracy of our new algorithm.
Figs. 16 and 17 depict the computed streamlines and isobars at different Reynolds numbers, respectively, where the vertical $y$-scale is expanded so that the details can be seen. Fig. 16 clearly shows that the length of the main recirculation region downstream of the step increases as the Reynolds number grows. At $Re = 500$, a second recirculation eddy forms on the upper wall, which becomes
Fig. 20. Nonoverlapping (left) and overlapping (right) domain decomposition for the flow past a circular cylinder.
[Fig. 21 panels: coarse mesh on the full domain (left) and a close-up around the cylinder (right).]
Fig. 21. The coarse grid for the flow past a circular cylinder: full (left) and zoom-in (right) view.
Table 6
Comparison of the separation angle $\theta$ and wake length ($L_w$) for the flow past a circular cylinder at $Re = 10, 20, 40$.

Re   Reference               θ      L_w
10   Dennis and Chang [71]   29.6   0.265
     Ding et al. [72]        30.0   0.252
     Kim et al. [73]         29.5   0.281
     Present                 29.8   0.257
20   Dennis and Chang [71]   43.7   0.94
     Fornberg [74]           -      0.91
     Ding et al. [72]        44.1   0.93
     Kim et al. [73]         43.7   0.91
     Present                 43.7   0.937
40   Dennis and Chang [71]   53.8   2.345
     Fornberg [74]           -      2.24
     Ding et al. [72]        53.5   2.20
     Kim et al. [73]         55.1   2.187
     Present                 53.4   2.258

Fig. 19. Schematic diagram of the flow past a circular cylinder.
longer as the Reynolds number increases further. Fig. 18 depicts the
normalized length (Lm) of the main recirculation region downstream
the step and the normalized length (Ls) of the second recirculation
region on the upper wall with respect to the Reynolds number
compared with those of Erturk [67]. Considering the different grid
meshes, the different outflow locations and boundary conditions, the
results are in good agreement.
4.4. Flow past a circular cylinder
A circular cylinder of radius $r = 0.5$ resides in a rectangular domain $[-5,20] \times [-10,10]$, where the center of the circular cylinder is located at the origin. A uniform flow with free-stream velocity $U = 1$ coming from the left far field passes around the circular cylinder; see Fig. 19. A no-slip boundary condition is specified on the surface of the cylinder,
[Fig. 22 panels: streamline and isobar plots in the neighbourhood of the cylinder.]
Fig. 22. Computed streamlines (left) and isobars (right) for the flow past a circular cylinder at Re = 5, 10, 20, 40 (from top to bottom).
while on the inflow boundary, on the outflow boundary and on the upper and lower wall boundaries, a potential flow velocity
\[
u = (u_1, u_2) = U\left(1 - \frac{r^2}{x^2+y^2} + \frac{2r^2y^2}{(x^2+y^2)^2},\; -\frac{2r^2xy}{(x^2+y^2)^2}\right)
\]
is prescribed. The Reynolds number based on the free-stream velocity $U$ (here $U = 1$) and the cylinder diameter $D$ (here $D = 1$) is defined as $Re = UD/\nu$. It is well known that the stationary and symmetric flow past a circular cylinder becomes unstable for values of the Reynolds number greater than 40, in which case the flow becomes periodic and asymmetric.
We decompose the domain into six disjoint sub-domains, and then enlarge each sub-domain by extending it outward by an extra layer of size 0.5; see Fig. 20. The mesh sizes are $H = 1/2$, $h = 1/4$ with a local refinement around the cylinder; see Fig. 21 for the coarse grid, which involves 5762 vertices. In Table 6, we tabulate the separation angle $\theta$ and the length of the wake behind the cylinder obtained by our new algorithm together with those in the literature [71–74], where good agreement is observed. The computed streamlines and isobars around the cylinder are also plotted in Fig. 22.
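The potential-flow velocity prescribed on the outer boundaries can be evaluated directly. The sketch below assumes the classical form $u_1 = U\,(1 - r^2/(x^2+y^2) + 2r^2y^2/(x^2+y^2)^2)$, $u_2 = -2Ur^2xy/(x^2+y^2)^2$ with $U = 1$ and $r = 0.5$, and checks two of its properties: the normal component vanishes on the cylinder surface, and the field is close to the free stream on the inflow boundary. The function names are hypothetical.

```python
import math


def potential_flow(x, y, U=1.0, r=0.5):
    """Potential-flow velocity around a cylinder of radius r centred at the origin."""
    s = x * x + y * y
    u1 = U * (1.0 - r * r / s + 2.0 * r * r * y * y / (s * s))
    u2 = -2.0 * U * r * r * x * y / (s * s)
    return u1, u2


def max_normal_velocity_on_cylinder(r=0.5, samples=360):
    """Largest |u . n| over sample points on the cylinder surface."""
    worst = 0.0
    for k in range(samples):
        theta = 2.0 * math.pi * k / samples
        nx, ny = math.cos(theta), math.sin(theta)
        u1, u2 = potential_flow(r * nx, r * ny, r=r)
        worst = max(worst, abs(u1 * nx + u2 * ny))
    return worst


if __name__ == "__main__":
    print(max_normal_velocity_on_cylinder())
    print(potential_flow(-5.0, 0.0))  # mid-point of the inflow boundary
```

Potential flow still slips tangentially along the cylinder, which is consistent with using it only on the outer boundaries while the no-slip condition is imposed on the cylinder itself.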
5. Conclusions
In this work we have proposed a new parallel finite element algorithm for the stationary Navier–Stokes equations. It is based on a coarse grid nonlinear problem and local fine grid nonlinear correction problems defined on overlapping sub-domains, and hence allows existing sequential PDE codes to run in a parallel environment without extensive recoding. Numerical simulations of the lid-driven cavity flow, the backward-facing step flow and the flow past a circular cylinder demonstrated the efficiency of the proposed algorithm.
Acknowledgments
The authors thank the editor and reviewers for their valuable comments and suggestions, which led to a large improvement of the paper.
This work was supported by the National Research Foundation
(NRF) Grant funded by the Korean Government (MEST) (No. 2010-
0017532), the Natural Science Foundation of China (No. 11001061,
10971166), the National High Technology Research and Develop-
ment Program of China (863 Program: 2009AA01A135) and the
Ph.D. Research-Starting Foundation of Guizhou Normal University,
China ([2010] Parallel Algorithms for Computational Fluid
Dynamics Problems).
References
[1] A. Toselli, O. Widlund, Domain Decomposition Methods: Algorithms and Theory, Springer, Berlin, 2005.
[2] H. Elman, V.E. Howle, J. Shadid, et al., A taxonomy and comparison of parallel block multi-level preconditioners for the incompressible Navier–Stokes equations, J. Comput. Phys. 227 (2008) 1790–1808.
[3] S. Behara, S. Mittal, Parallel finite element computation of incompressible flows, Parallel Comput. 35 (2009) 195–212.
[4] C.A. Rivera, M. Heniche, R. Glowinski, P.A. Tanguy, Parallel finite element simulations of incompressible viscous fluid flow by domain decomposition with Lagrange multipliers, J. Comput. Phys. 229 (2010) 5123–5143.
[5] Y.Q. Shang, Y.N. He, Parallel finite element algorithms based on full domain partition for stationary Stokes equations, Appl. Math. Mech.-Engl. Ed. 31 (5) (2010) 643–650.
[6] Y.Q. Shang, Y.N. He, Parallel iterative finite element algorithms based on full domain partition for the stationary Navier–Stokes equations, Appl. Numer. Math. 60 (7) (2010) 719–737.
[7] Y.Q. Shang, A parallel two-level linearization method for incompressible flow problems, Appl. Math. Lett. 24 (2011) 364–369.
[8] Y.N. He, L.Q. Mei, Y.Q. Shang, J. Cui, Newton iterative parallel finite element algorithm for the steady Navier–Stokes equations, J. Sci. Comput. 44 (1) (2010) 92–106.
[9] Y.N. He, J.C. Xu, A.H. Zhou, Local and parallel finite element algorithms for the Navier–Stokes problem, J. Comput. Math. 24 (3) (2006) 227–238.
[10] F.Y. Ma, Y.C. Ma, W.F. Wo, Local and parallel finite element algorithms based on two-grid discretization for steady Navier–Stokes equations, Appl. Math. Mech.-Engl. Ed. 28 (1) (2007) 27–35.
[11] J.C. Xu, A.H. Zhou, Local and parallel finite element algorithms based on two-grid discretizations, Math. Comput. 69 (2000) 881–909.
[12] J.C. Xu, A.H. Zhou, Local and parallel finite element algorithms based on two-grid discretizations for nonlinear problems, Adv. Comput. Math. 14 (2001) 293–327.
[13] Y.Q. Shang, Y.N. He, Z.D. Luo, A comparison of three kinds of local and parallel finite element algorithms based on two-grid discretizations for the stationary Navier–Stokes equations, Comput. Fluids 40 (2011) 249–257.
[14] W. Layton, A two level discretization method for the Navier–Stokes equations, Comput. Math. Appl. 5 (26) (1993) 33–38.
[15] W. Layton, H.W.J. Lenferink, A multilevel mesh independence principle for the Navier–Stokes equations, SIAM J. Numer. Anal. 33 (1) (1996) 17–30.
[16] W. Layton, H.K. Lee, J. Peterson, Numerical solution of the stationary Navier–Stokes equations using a multilevel finite element method, SIAM J. Sci. Comput. 20 (1) (1998) 1–12.
[17] X.X. Dai, X.L. Cheng, A two-grid method based on Newton iteration for the Navier–Stokes equations, J. Comput. Appl. Math. 220 (2008) 566–573.
[18] Y.N. He, A.W. Wang, A simplified two-level method for the steady Navier–Stokes equations, Comput. Meth. Appl. Mech. Engrg. 197 (2008) 1568–1576.
[19] H. Abboud, V. Girault, T. Sayah, A second order accuracy for a full discretized time-dependent Navier–Stokes equations by a two-grid scheme, Numer. Math. 114 (2009) 189–231.
[20] J.C. Xu, Two-grid discretization techniques for linear and nonlinear PDEs, SIAM J. Numer. Anal. 33 (5) (1996) 1759–1777.
[21] M. Marion, R. Temam, Nonlinear Galerkin methods, SIAM J. Numer. Anal. 26 (5) (1989) 1139–1157.
[22] M. Marion, R. Temam, Nonlinear Galerkin methods: the finite element case, Numer. Math. 57 (1990) 1–22.
[23] A.A.O. Amni, M. Marion, Nonlinear Galerkin methods and mixed finite element: two-grid algorithms for the Navier–Stokes equations, Numer. Math. 68 (1994) 189–213.
[24] Z.D. Luo, J. Zhu, A nonlinear Galerkin mixed element method and a posteriori error estimator for the stationary Navier–Stokes equations, Appl. Math. Mech.-Engl. Ed. 23 (10) (2002) 1194–1206.
[25] B.F. Smith, P.E. Bjørstad, W. Gropp, Domain Decomposition: Parallel Multilevel Methods for Elliptic Partial Differential Equations, Cambridge University Press, Cambridge, 1996.
[26] A. Quarteroni, A. Valli, Domain Decomposition Methods for Partial Differential Equations, Oxford Science Publications, London, 1999.
[27] F.N. Hwang, X.C. Cai, A parallel nonlinear additive Schwarz preconditioned inexact Newton algorithm for incompressible Navier–Stokes equations, J. Comput. Phys. 204 (2005) 666–691.
[28] F.N. Hwang, X.C. Cai, A class of parallel two-level nonlinear Schwarz preconditioned inexact Newton algorithms, Comput. Meth. Appl. Mech. Engrg. 196 (2007) 1603–1611.
[29] X.C. Cai, D.E. Keyes, L. Marcinkowski, Nonlinear additive Schwarz preconditioners and applications in computational fluid dynamics, Int. J. Numer. Meth. Fluids 40 (2002) 1463–1470.
[30] R. Adams, Sobolev Spaces, Academic Press Inc, New York, 1975.
[31] P.G. Ciarlet, The Finite Element Method for Elliptic Problems, North-Holland, Amsterdam, 1978.
[32] J.G. Heywood, R. Rannacher, Finite element approximation of the nonstationary Navier–Stokes problem I: regularity of solutions and second-order error estimates for spatial discretization, SIAM J. Numer. Anal. 19 (2) (1982) 275–311.
[33] R.B. Kellogg, J.E. Osborn, A regularity result for the Stokes problem in a convex polygon, J. Funct. Anal. 21 (1976) 397–431.
[34] R. Temam, Navier–Stokes Equations: Theory and Numerical Analysis, North-Holland, Amsterdam, 1984.
[35] V. Girault, P.A. Raviart, Finite Element Methods for Navier–Stokes Equations: Theory and Algorithms, Springer-Verlag, Berlin Heidelberg, 1986.
[36] Y.N. He, J.C. Xu, A.H. Zhou, J. Li, Local and parallel finite element algorithms for the Stokes problem, Numer. Math. 109 (3) (2008) 415–434.
[37] Y.Q. Shang, Z.D. Luo, A parallel two-level finite element method for the Navier–Stokes equations, Appl. Math. Mech.-Engl. Ed. 31 (11) (2010) 1429–1438.
[38] Y.Q. Shang, K. Wang, Local and parallel finite element algorithms based on two-grid discretizations for the transient Stokes equations, Numer. Algor. 54 (2) (2010) 195–218.
[39] D.N. Arnold, X. Liu, Local error estimates for finite element discretizations of the Stokes equations, RAIRO M2AN 29 (1995) 367–389.
[40] D.N. Arnold, F. Brezzi, M. Fortin, A stable finite element for the Stokes equations, Calcolo 21 (1984) 337–344.
[41] M. Fortin, Calcul numérique des écoulements fluides de Bingham et des fluides Newtoniens incompressible par des méthodes d'éléments finis, Doctoral Thesis, Université de Paris VI, 1972.
[42] P. Hood, C. Taylor, A numerical solution of the Navier–Stokes equations using the finite element technique, Comput. Fluids 1 (1973) 73–100.
[43] M. Crouzeix, P.-A. Raviart, Conforming and nonconforming finite element methods for solving the stationary Stokes equations, RAIRO Anal. Numer. 7 (R-3) (1973) 33–76.
[44] L. Mansfield, Finite element subspaces with optimal rates of convergence for stationary Stokes problem, RAIRO Anal. Numer. 16 (1982) 49–66.
[45] H.C. Elman, D.J. Silvester, A.J. Wathen, Finite Elements and Fast Iterative Solvers: With Applications in Incompressible Fluid Dynamics, Oxford University Press, Oxford, 2005.
[46] Y.N. He, J. Li, Convergence of three iterative methods based on finite element discretization for the stationary Navier–Stokes equations, Comput. Meth. Appl. Mech. Engrg. 198 (2009) 1351–1359.
[47] A. Klawonn, L.F. Pavarino, Overlapping Schwarz methods for mixed linear elasticity and Stokes problems, Comput. Meth. Appl. Mech. Engrg. 165 (1998) 233–245.
[48] L.F. Pavarino, Indefinite overlapping Schwarz methods for time-dependent Stokes problems, Comput. Meth. Appl. Mech. Engrg. 187 (2000) 35–51.
[49] F.N. Hwang, Some parallel linear and nonlinear Schwarz methods with applications in computational fluid dynamics, Ph.D. Dissertation, University of Colorado, 2004.
[50] M. Ainsworth, J.T. Oden, A posteriori error estimation in finite element analysis, Comput. Meth. Appl. Mech. Engrg. 142 (1997) 1–88.
[51] M. Ainsworth, J.T. Oden, A Posteriori Error Estimation in Finite Element Analysis, John Wiley & Sons, 2000.
[52] T.J. Barth, H. Deconinck, Error Estimation and Adaptive Discretization Methods in Computational Fluid Dynamics, Lecture Notes in Computer Science and Engineering, vol. 25, Springer, 2003.
[53] H. Jin, S. Prudhomme, A posteriori error estimation of steady-state finite element solutions of the Navier–Stokes equations by a subdomain residual method, Comput. Meth. Appl. Mech. Engrg. 159 (1998) 19–48.
[54] L. Machiels, J. Peraire, A.T. Patera, A posteriori finite element output bounds for the incompressible Navier–Stokes equations: application to a natural convection problem, J. Comput. Phys. 172 (2001) 401–425.
[55] M. Farhloul, S. Nicaise, L. Paquet, A priori and a posteriori error estimations for the dual mixed finite element method of the Navier–Stokes problem, Numer. Meth. Part. Diff. Eq. 25 (4) (2009) 843–869.
[56] S. Prudhomme, J.T. Oden, A posteriori error estimation and error control for finite element approximations of the time-dependent Navier–Stokes equations, Finite Elem. Anal. Des. 33 (1999) 247–262.
[57] J. Cao, Application of a posteriori error estimation to finite element simulation of incompressible Navier–Stokes flow, Comput. Fluids 34 (2005) 972–990.
[58] J. Hoffman, C. Johnson, A new approach to computational turbulence modelling, Comput. Meth. Appl. Mech. Engrg. 195 (2006) 2865–2880.
[59] S. Berrone, M. Marro, Space–time adaptive simulations for unsteady Navier–Stokes problems, Comput. Fluids 38 (2009) 1132–1144.
[60] T.A. Davis, Available at: http://www.cise.ufl.edu/research/sparse/umfpack.
[61] U. Ghia, K. Ghia, C. Shin, High-Re solutions for incompressible flow using the Navier–Stokes equations and a multigrid method, J. Comput. Phys. 48 (1982) 387–411.
[62] E. Erturk, T. Corke, C. Gokcol, Numerical solutions of 2-D steady incompressible driven cavity flow at high Reynolds numbers, Int. J. Numer. Meth. Fluids 48 (2005) 747–774.
[63] E. Erturk, Discussions on driven cavity flow, Int. J. Numer. Meth. Fluids 60 (2009) 275–294.
[64] W. Layton, H. Lee, J. Peterson, A defect-correction method for the incompressible Navier–Stokes equations, Appl. Math. Comput. 129 (2002) 1–19.
[65] K. Wang, A new defect correction method for the Navier–Stokes equations at high Reynolds numbers, Appl. Math. Comput. 11 (216) (2010) 3252–3264.
[66] D.K. Gartling, A test problem for outflow boundary conditions-flow over a backward-facing step, Int. J. Numer. Meth. Fluids 11 (1990) 953–967.
[67] E. Erturk, Numerical solution of 2D steady incompressible flow over a backward-facing step, part I: high Reynolds number solutions, Comput. Fluids 37 (2008) 633–655.
[68] I.E. Barton, The entrance effect of laminar flow over a backward-facing step geometry, Int. J. Numer. Meth. Fluids 25 (1997) 633–644.
[69] J. Keskar, D.A. Lyn, Computations of a laminar backward-facing step flow at Re = 800 with a spectral domain decomposition method, Int. J. Numer. Meth. Fluids 29 (1999) 411–427.
[70] M.M. Grigoriev, G.F. Dargush, A poly-region boundary element method for incompressible viscous fluid flows, Int. J. Numer. Meth. Eng. 46 (1999) 1127–1158.
[71] S.C.R. Dennis, G.Z. Chang, Numerical solutions for steady flow past a circular cylinder at Reynolds up to 100, J. Fluid Mech. 42 (1970) 471–489.
[72] H. Ding, C. Shu, K.S. Yeo, D. Xu, Simulation of incompressible viscous flows past a circular cylinder by hybrid FD scheme and meshless least square-based finite difference method, Comput. Meth. Appl. Mech. Engrg. 193 (2004) 727–744.
[73] Y. Kim, D.W. Kim, S. Jun, J.H. Lee, Meshfree point collocation method for the stream-vorticity formulation of 2D incompressible Navier–Stokes equations, Comput. Meth. Appl. Mech. Engrg. 196 (2007) 3095–3109.
[74] B. Fornberg, A numerical study of steady viscous flow past a circular cylinder, J. Fluid Mech. 98 (1980) 819–855.