The Parallel Models of Coronal Polarization Brightness Calculation

DCABES 2009, China University of Geosciences

Jiang Wenqian

Outline

Introduction

pB Calculation Formula

Serial pB Calculation Process

Parallel pB Calculation Models

Conclusion

Part Ⅰ. Introduction

Space weather forecasting needs an accurate solar wind model covering the solar atmosphere and interplanetary space. A global model of the corona and heliosphere is the basis of numerical space weather forecasting, and the basis for explaining the relations found in observations.

Three-dimensional magnetohydrodynamic (MHD) simulation is one of the most common numerical methods for studying the corona and the solar wind.

Converting the coronal electron density produced by an MHD model into coronal polarization brightness (pB) is the key way to compare the model with observations, and is therefore important for validating MHD models.

Because the data volume is large and the pB model is complex, the computation on a single CPU (or core) takes too long to visualize the pB data in near real time.

Considering the characteristics of CPU and GPU computing environments, we analyze the pB conversion algorithm, implement two parallel models of pB calculation with MPI and CUDA, and compare the efficiency of the two models.

Part Ⅱ. pB Calculation Formula

pB is derived from photospheric radiation scattered by electrons. It can be used to invert the coronal electron density and to validate numerical models. Taking limb darkening into account, the pB of a small coronal volume element is calculated as follows:

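$$I_t - I_r = I_0 \frac{N_e \pi \sigma}{2} \sin^2\chi \, \left[(1-u)A + uB\right] \tag{1}$$

$$A = \cos\Omega \, \sin^2\Omega \tag{2}$$

$$B = -\frac{1}{8}\left[ 1 - 3\sin^2\Omega - \frac{\cos^2\Omega}{\sin\Omega}\left(1 + 3\sin^2\Omega\right) \ln\frac{1+\sin\Omega}{\cos\Omega} \right] \tag{3}$$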
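As an illustration of Equations (1) to (3), the following Python sketch evaluates A, B and the pB contribution of one volume element. It is not the authors' code. It assumes the usual reading of the symbols: Ω is the angle subtended by the solar radius at the scattering point (so sin Ω = 1/r with r in solar radii), χ is the angle between the line of sight and the radial direction, and u is the limb-darkening coefficient. The names `coeff_a`, `coeff_b` and `pb_element`, the parameter `scale` (standing for the constant factor I_0 π σ / 2) and the default u = 0.63 are hypothetical choices made for the example.

```python
import math


def coeff_a(omega):
    """Equation (2): A = cos(omega) * sin(omega)^2."""
    return math.cos(omega) * math.sin(omega) ** 2


def coeff_b(omega):
    """Equation (3); valid for 0 < omega < pi/2."""
    s, c = math.sin(omega), math.cos(omega)
    log_term = math.log((1.0 + s) / c)
    return -0.125 * (1.0 - 3.0 * s**2 - (c**2 / s) * (1.0 + 3.0 * s**2) * log_term)


def pb_element(n_e, r, chi, u=0.63, scale=1.0):
    """Equation (1) for one volume element at distance r > 1 (solar radii).

    scale stands for I_0 * pi * sigma / 2 and u = 0.63 is an assumed
    limb-darkening coefficient; both are placeholders for the example.
    """
    omega = math.asin(1.0 / r)  # assumed geometry: sin(omega) = R_sun / r
    weight = (1.0 - u) * coeff_a(omega) + u * coeff_b(omega)
    return scale * n_e * math.sin(chi) ** 2 * weight


if __name__ == "__main__":
    # One element in the plane of the sky (chi = 90 degrees) at r = 2 solar radii.
    print(pb_element(n_e=1.0e6, r=2.0, chi=math.pi / 2))
```

The contribution is linear in the electron density and vanishes when the line of sight is radial (χ = 0), which is what Equation (1) implies.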

A polarization brightness image for comparison with coronagraph observations is generated by integrating the electron density along the line of sight.

Figure: The density integration process of pB calculation
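The line-of-sight integration can be sketched as a cumulative sum over sample points along z for one image pixel. This is a minimal sketch under stated assumptions, not the authors' implementation: the density comes from a hypothetical analytic profile `toy_density` instead of interpolation on the (r, θ, φ) grid, lengths are in solar radii, sin χ is taken as ρ/r where ρ is the projected distance of the pixel from the disk center, pixels on the disk are simply set to zero, and constant factors are dropped. The names `pb_weight`, `pb_pixel`, `z_max` and `n_steps` are made up for the example.

```python
import math


def pb_weight(r, rho, u=0.63):
    """Geometric factor sin^2(chi) * [(1 - u) A + u B] at distance r > 1."""
    s = 1.0 / r                      # sin(omega), assumed equal to R_sun / r
    c = math.sqrt(1.0 - s * s)       # cos(omega)
    a = c * s * s
    b = -0.125 * (1.0 - 3.0 * s * s
                  - (c * c / s) * (1.0 + 3.0 * s * s) * math.log((1.0 + s) / c))
    sin_chi_sq = (rho / r) ** 2      # assumed geometry: sin(chi) = rho / r
    return sin_chi_sq * ((1.0 - u) * a + u * b)


def toy_density(r):
    """Hypothetical electron density profile, used only for illustration."""
    return r ** -6


def pb_pixel(x, y, density=toy_density, z_max=5.0, n_steps=481):
    """Sum density * weight * dz along the line of sight through pixel (x, y)."""
    rho = math.hypot(x, y)
    if rho <= 1.0:
        return 0.0                   # pixel on the solar disk: skipped in this sketch
    dz = 2.0 * z_max / (n_steps - 1)
    total = 0.0
    for k in range(n_steps):
        z = -z_max + k * dz
        r = math.hypot(rho, z)
        total += density(r) * pb_weight(r, rho) * dz
    return total


if __name__ == "__main__":
    for rho in (1.5, 2.0, 3.0):
        print(rho, pb_pixel(rho, 0.0))
```

Each call to `pb_pixel` touches only its own line of sight, so pixels can be computed in any order.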

Part Ⅲ. Serial pB Calculation Process

The steps of the serial pB calculation on the CPU, using the experimental data, are shown below.

Figure: The serial process of pB calculation

We implemented the serial process above in two environments: G95 on Linux and Visual Studio 2005 on Windows XP.

Timing each step shows that the calculation of the pB values is by far the most time-consuming part of the program, accounting for 98.05% and 99.05% of the total running time in the two environments, respectively.
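One way to read these percentages is through Amdahl's law, which is brought in here as an illustration and is not part of the original analysis. Assuming the pB calculation is perfectly parallelizable and everything else stays serial, the sketch below estimates the best overall speedup for a given number of workers. The function name `amdahl_speedup` is hypothetical.

```python
def amdahl_speedup(parallel_fraction, n_workers):
    """Overall speedup when only parallel_fraction of the run time scales with n_workers."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_workers)


if __name__ == "__main__":
    # Fractions of total time spent in the pB calculation, as measured above.
    for label, fraction in (("G95 on Linux", 0.9805), ("Visual Studio 2005", 0.9905)):
        on_8_cores = amdahl_speedup(fraction, 8)
        upper_bound = 1.0 / (1.0 - fraction)  # limit for unlimited workers
        print(f"{label}: 8 workers -> {on_8_cores:.2f}, upper bound -> {upper_bound:.1f}")
```

Under these assumptions the remaining serial part caps the achievable speedup, however many workers are used.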

Therefore, to meet the demand for obtaining coronal polarization brightness in near real time, the part to optimize is the calculation of the pB values.

Because the density integration along the line of sight is independent for each point above the solar limb, pB calculation is well suited to parallel computation.

Part Ⅳ. Parallel pB Calculation Models

Currently, parallelized MHD numerical calculation is mainly based on MPI.

With the development of high-performance computing, GPU architectures show clear advantages for compute-intensive problems.

Implementing parallel MHD numerical calculation on the GPU is therefore a promising parallel solution.

We implement two parallel models, one based on MPI and one based on CUDA.

Experiment Environment: Experimental Data

42×42×82 (r, θ, φ) density data (den)

321×321×481 (x, y, z) Cartesian coordinate grid

321×321 pB values will be generated.

Experiment Environment: Hardware

Intel(R) Xeon(R) CPU E5405 @ 2.00GHz (8 CPUs), 1GB memory

NVIDIA Quadro FX 4600 GPU graphics card, 760MB GDDR3 SDRAM global memory (G80 architecture, 12 MPs and 128 SPs)

Experiment Environment: Compiling Environment

CUDA-based parallel model: Visual Studio 2005 on Windows XP, CUDA 1.1 SDK

MPI-based parallel model: G95 on Linux, MPICH2

MPI-based Parallelized Implementation

In the MPI environment, the experiment decomposes the computing domain into sub-domains as shown below.
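The exact decomposition used in the experiment is given only in the figure, so the following is a generic sketch and not the authors' scheme. It assumes a one-dimensional block decomposition of the 321 image rows over the MPI processes, with the remainder spread over the first ranks. The function `block_range` is hypothetical; in an MPI program the values of `rank` and `n_ranks` would come from the communicator.

```python
def block_range(n_items, n_ranks, rank):
    """Return the half-open range [start, stop) of items owned by one rank."""
    base, extra = divmod(n_items, n_ranks)
    start = rank * base + min(rank, extra)          # earlier ranks absorb the remainder
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


if __name__ == "__main__":
    n_rows, n_ranks = 321, 8                        # 321 image rows, 8 processes
    for rank in range(n_ranks):
        start, stop = block_range(n_rows, n_ranks, rank)
        print(f"rank {rank}: rows {start}..{stop - 1} ({stop - start} rows)")
```

With 321 rows and 8 processes, one rank gets 41 rows and the others 40, so the load stays nearly even if every row costs about the same.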

The final result shows that the MPI-based parallel model reaches a speedup of 5.8. Because the experiment runs on a platform with 8 CPU cores, this speedup is reasonably close to the theoretical maximum of 8.

This also indicates that the MPI-based solution balances processor utilization against inter-processor communication.

CUDA-based Parallelized Implementation

Following the serial pB calculation process and the CUDA architecture, the calculation part is placed in the kernel function to implement the parallel program.

Because the density interpolation and the cumulative sum involved in each pB value are independent of those for every other value, the pB calculation can be spread over many CUDA threads, with each thread calculating one pB value.

However, the number of pB values to be calculated is much larger than the number of threads available on the GPU, so each thread must calculate multiple pB values. Under the experimental conditions, the number of threads per block is set to 256 to make the fullest use of the computing resources.

The number of blocks depends on the ratio of the number of pB values to the number of threads. In addition, because global memory access is slow, some independent data are placed in shared memory to reduce data access time.
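The index arithmetic behind "each thread calculates multiple pB values" can be sketched as a grid-stride mapping. The sketch is written in Python to show the arithmetic only; in a CUDA kernel the same roles are played by `blockIdx.x`, `threadIdx.x`, `blockDim.x` and `gridDim.x`. The function names and the choice of 12 blocks in the demo are assumptions for illustration, not the configuration reported in the experiment.

```python
def blocks_needed(n_values, threads_per_block=256):
    """Blocks required if every thread handled exactly one value (ceiling division)."""
    return (n_values + threads_per_block - 1) // threads_per_block


def indices_for_thread(block_idx, thread_idx, n_blocks, threads_per_block, n_values):
    """pB indices handled by one thread when the grid is smaller than the data."""
    first = block_idx * threads_per_block + thread_idx  # global thread index
    stride = n_blocks * threads_per_block               # total threads in the grid
    return range(first, n_values, stride)


if __name__ == "__main__":
    n_pb = 321 * 321                                     # pB values in the image
    print("blocks for one value per thread:", blocks_needed(n_pb))
    n_blocks = 12                                        # hypothetical grid size
    work = indices_for_thread(0, 0, n_blocks, 256, n_pb)
    print("values handled by thread 0 of block 0:", len(work))
```

Because consecutive threads handle consecutive indices in each pass, neighbouring threads read neighbouring pixels, which is the access pattern that usually suits global memory best.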

The data placed in shared memory take about 7KB, less than the 16KB provided by the GPU, so the parallel solution is feasible.

Moreover, the data-length array is read-only and is accessed very frequently, so as a further optimization it is moved from shared memory into constant memory to improve access efficiency.

The CUDA-based parallel calculation process is shown below.

Experiment results

The pB calculation times of the two models are shown in Table 1.

Table 1. pB calculation time of the serial and parallel models and their speed-up ratios

| | MPI (G95) | CUDA (Visual Studio 2005) |
| --- | --- | --- |
| pB calculation time of serial model (s) | 32.403 | 48.938 |
| pB calculation time of parallel model (s) | 5.053 | 1.536 |
| Speed-up ratio | 6.41 | 31.86 |

The total running times of the two models are shown in Table 2.

Table 2. Total running time of the serial and parallel models, and the ratio of MPI (G95) running time to CUDA (Visual Studio 2005) running time

| | MPI (G95) (s) | CUDA (Visual Studio 2005) (s) | Running-time ratio, MPI / CUDA |
| --- | --- | --- | --- |
| Serial models | 33.05 | 49.406 | 0.67 |
| Parallel models | 5.70 | 2.004 | 2.84 |
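The ratios in Tables 1 and 2 follow directly from the measured times. The short check below recomputes them from the tabulated values. The dictionary layout and the helper name `ratio` are choices made for this example only.

```python
# Measured times in seconds, copied from Tables 1 and 2.
pb_time = {
    "MPI": {"serial": 32.403, "parallel": 5.053},
    "CUDA": {"serial": 48.938, "parallel": 1.536},
}
total_time = {
    "MPI": {"serial": 33.05, "parallel": 5.70},
    "CUDA": {"serial": 49.406, "parallel": 2.004},
}


def ratio(numerator, denominator):
    """Ratio of two times, rounded to two decimals as in the tables."""
    return round(numerator / denominator, 2)


if __name__ == "__main__":
    for model in ("MPI", "CUDA"):
        t = pb_time[model]
        print(model, "pB speed-up:", ratio(t["serial"], t["parallel"]))
        t = total_time[model]
        print(model, "total speed-up:", ratio(t["serial"], t["parallel"]))
    for kind in ("serial", "parallel"):
        print(kind, "MPI / CUDA total time:",
              ratio(total_time["MPI"][kind], total_time["CUDA"][kind]))
```

The last column of Table 2 compares the two platforms with each other, while the speed-up ratios in Table 1 compare each parallel model with its own serial baseline.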

Finally, the calculated data are used to draw the coronal polarization brightness image shown below.

Conclusion

In the same environment, the pB calculation takes 5.053 seconds with the MPI-based parallel model and 32.403 seconds with the serial model, a speedup of 6.41.

The pB calculation takes 1.536 seconds with the CUDA-based parallel model and 48.938 seconds with the serial model, a speedup of 31.86.

In total running time, the CUDA-based model is 2.84 times as fast as the MPI-based model (2.004 seconds versus 5.70 seconds).

We find that the CUDA-based parallel model is better suited to pB calculation, and that it offers a better solution for post-processing and visualizing MHD numerical results.

Thank you!!!