The field of nanophotonics studies and exploits the way light interacts with matter structured at the nanoscale. As light is squeezed into nanometer-scale volumes, field enhancement effects occur, giving rise to new optical phenomena that can be exploited to challenge existing technological limits and deliver superior photonic devices. Nanophotonics encompasses a wide variety of topics, including metamaterials, plasmonics, high-resolution imaging, quantum nanophotonics, and functional photonic materials. Previously viewed as a largely academic field, nanophotonics is now entering the mainstream and will play an increasing role in the development of new products, ranging from high-efficiency solar cells to personalised health monitoring devices able to detect the chemical composition of molecules at ultralow concentrations.
Developed by Inria, DIOGENeS is a software suite dedicated to computational nanophotonics and is one of the EPEEC applications. This software suite integrates several variants of the Discontinuous Galerkin (DG) method, which is a blend of finite element and finite volume methods. A DG method relies on an arbitrarily high-order polynomial interpolation of the field unknowns within the cells of an unstructured mesh. Such a mesh can be locally adapted to the peculiarities of irregularly shaped structures, material interfaces with complex topography, geometrical singularities, etc. As a consequence, a DG method is particularly well suited to dealing accurately and efficiently with the multiscale characteristics of nanoscale light-matter interaction problems. The numerical kernels of the DIOGENeS core library were initially adapted to high performance computing through a classical SPMD strategy implemented with the MPI message-passing programming standard. The DGTD (Discontinuous Galerkin Time-Domain) solver considered in EPEEC is one of the simulators built on top of the DIOGENeS core library.
This DGTD solver is a very good candidate for massively parallel heterogeneous computing systems, in particular systems that leverage accelerators. The motivation for moving to exascale in the context of nanophotonics is twofold. On the one hand, the underlying computational mesh may involve a large number of cells because of the multiscale nature of the considered problem. On the other hand, the high-order polynomial expansion of the electromagnetic field components within each cell of the mesh translates into a total number of degrees of freedom that may be very large (from several tens of millions to a few billion).
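To make the order of magnitude concrete, the following is a minimal sketch of how the number of degrees of freedom grows with the interpolation order. It assumes a tetrahedral mesh, a nodal basis spanning the full polynomial space of degree k in each cell (which has (k+1)(k+2)(k+3)/6 basis functions), and six field components (three for the electric field, three for the magnetic field). The function names and the mesh size of one million cells are hypothetical and chosen for illustration only; additional unknowns needed for dispersive materials are not counted.

```python
def dofs_per_cell(order: int) -> int:
    """Number of basis functions of a degree-`order` polynomial space on a tetrahedron."""
    return (order + 1) * (order + 2) * (order + 3) // 6


def total_dofs(num_cells: int, order: int, num_components: int = 6) -> int:
    """Total unknowns: cells x basis functions per cell x field components.

    num_components defaults to 6 (Ex, Ey, Ez, Hx, Hy, Hz); this is an
    assumption of the sketch, not a statement about the DIOGENeS data layout.
    """
    return num_cells * dofs_per_cell(order) * num_components


if __name__ == "__main__":
    hypothetical_num_cells = 1_000_000  # illustrative mesh size, not from the source
    for order in (1, 2, 3, 4):
        print(
            f"P{order}: {dofs_per_cell(order)} basis functions per cell, "
            f"{total_dofs(hypothetical_num_cells, order):,} degrees of freedom"
        )
```

Under these assumptions, going from second-order to fourth-order interpolation multiplies the number of unknowns per cell by 3.5, which is why high-order simulations on fine meshes quickly reach hundreds of millions of degrees of freedom.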
In the context of EPEEC, teams from Inria and BSC collaborated to assess the advantages and limitations of using tasks with data dependencies to extend the existing distributed-memory parallelization.
The hybrid MPI+OmpSs parallelization of the new version of this DGTD solver, developed in the context of EPEEC, has been used to simulate light propagation in a waveguide consisting of a chain of gold nanospheres (see Figure 1). A sample of performance figures for the strong scalability assessment of the hybrid MPI+OmpSs parallelization is shown in Figure 2. The problem considered here is challenging from the parallel scalability viewpoint because the presence of metallic (gold) nanospheres induces a computational load imbalance. Indeed, modeling the response of metallic nanostructures at optical frequencies requires taking into account a set of ordinary differential equations, which are coupled to the system of Maxwell equations but are solved only in the mesh cells that discretize the nanospheres. In this context, a fine-grained task-based parallelization helps to mitigate this load imbalance to some extent. The preliminary results presented in Figure 2 are encouraging and have been used to define a roadmap for the further optimization of the hybrid MPI+OmpSs parallelization on the one hand, and for the development of a GASPI+OmpSs parallelization before the end of the project on the other. In particular, with the use of the GASPI programming model, the point-to-point communication scheme currently used with MPI will be replaced by an RDMA-based one-sided communication scheme, which is expected to further improve the overall scalability of the DGTD solver.
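The load-balance argument above can be illustrated with a toy scheduling model. This is only a sketch and does not reproduce the OmpSs runtime or the DGTD solver: it assumes that cells inside the nanospheres cost three times as much as the other cells (a hypothetical ratio), ignores task dependencies and communication, and compares a static split into equal-sized contiguous blocks with small tasks handed out to whichever worker becomes free first. All names and numbers are illustrative.

```python
import heapq


def static_makespan(costs, workers):
    """Finish time when cells are split into equal-sized contiguous blocks."""
    block = -(-len(costs) // workers)  # ceiling division
    return max(sum(costs[i:i + block]) for i in range(0, len(costs), block))


def dynamic_makespan(costs, workers, chunk):
    """Finish time when small tasks of `chunk` cells go to the earliest-free worker."""
    finish_times = [0.0] * workers
    heapq.heapify(finish_times)
    for i in range(0, len(costs), chunk):
        earliest = heapq.heappop(finish_times)  # worker that becomes idle first
        heapq.heappush(finish_times, earliest + sum(costs[i:i + chunk]))
    return max(finish_times)


if __name__ == "__main__":
    # Hypothetical workload: 200 "metallic" cells (cost 3.0, extra ODEs)
    # followed by 800 ordinary cells (cost 1.0).
    costs = [3.0] * 200 + [1.0] * 800
    ideal = sum(costs) / 4
    print("ideal       :", ideal)
    print("static split:", static_makespan(costs, workers=4))
    print("small tasks :", dynamic_makespan(costs, workers=4, chunk=10))
```

In this toy setting the static split leaves one worker with most of the expensive cells, while the fine-grained tasks spread them across all workers. A real task-based runtime must additionally respect data dependencies between tasks, which is why the mitigation is only partial in practice.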
Figure 1: propagation of light in a waveguide consisting of a chain of gold nanospheres. Left: unstructured tetrahedral mesh of the computational domain. Right: snapshot of the modulus of the electric field.
Figure 2: propagation of light in a waveguide consisting of a chain of gold nanospheres. Strong scalability assessment of the hybrid MPI+OmpSs parallelization. DGTD-P2 refers to the DGTD solver with second-order polynomial interpolation of the electromagnetic field within each mesh cell; simulations are performed on 2 MareNostrum 4 nodes (2 MPI processes) with 1 to 24 tasks. DGTD-P4 refers to the DGTD solver with fourth-order polynomial interpolation of the electromagnetic field within each mesh cell; simulations are performed on 16 MareNostrum 4 nodes (16 MPI processes) with 1 to 24 tasks.