A hybrid MPI/OpenMP parallelization of the adaptive integral method for multi-core clusters
A hybrid of message passing and shared memory techniques is presented for scalable parallelization of the adaptive integral method (AIM), an FFT based algorithm, on clusters of identical multi-core processors. The proposed hybrid MPI/OpenMP parallelization scheme is based on a nested one-dimensional (1-D) slab decomposition of the 3-D auxiliary uniform grid and the associated AIM calculations: If there are M processors and T cores per processor, the scheme (i) divides the uniform grid into M slabs and MT sub-slabs, (ii) assigns each slab/sub-slab and the associated operations to one of the processors/cores, and (iii) uses MPI for inter-processor data communication and OpenMP for intra-processor data exchange. The MPI/OpenMP parallel AIM is used to accelerate the MOM solution of combined-field integral equations pertinent to the analysis of scattering from perfectly conducting surfaces. The scalability and efficiency of the implementation are investigated theoretically and verified numerically by solving benchmark scattering problems on a (near) petaflop supercomputing cluster of quad-core processors. The timing and speedup results on up to 1024 processors show that the proposed hybrid MPI/OpenMP parallelization exhibits better strong scalability (fixed problem size speedup) compared to pure MPI parallelization when multiple cores are used on each processor.