The EPEEC project aims to offer application developers a productive programming environment for parallel jobs. This can be done in two ways: transparently, by using OmpSs@cluster with the underlying ArgoDSM library, which automatically ports applications written in OmpSs@cluster to a multi-node system; or by combining the parallel GASPI programming model with OmpSs. The second approach uses the Task-Aware GASPI (TAGASPI) library.
TAGASPI, which was developed in the INTERTWinE project, will be deployed with applications of the EPEEC project. The TAGASPI library extends the functionality of the standard GASPI library by providing new mechanisms for improving the interoperability between parallel task-based programming models, such as OpenMP or OmpSs-2, and RDMA GASPI operations. The TAGASPI approach targets hybrid applications that taskify their RDMA communications. GASPI provides explicit communication primitives to exploit inter-process parallelism, while OmpSs exploits parallelism within a process (always intra-node). TAGASPI thus allows communication tasks to run in parallel with computation tasks and to overlap with them.
OmpSs-2 is a programming model developed by BSC which extends the OpenMP tasking model with richer support for asynchronous parallelism (tasks) and an alternative approach to the use of accelerator devices (such as GPUs and FPGAs) based on leveraging existing native kernels (such as CUDA and OpenCL). The GASPI standard and its implementation GPI-2, on the other hand, promote the use of one-sided communication, where one side (the initiator) has all the relevant information for performing the data movement. They also promote weak synchronization, provided in the form of notifications, which allows the remote process to be notified upon the completion of an operation.
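The one-sided write with notification described above can be illustrated with a small toy model. This is only a sketch in plain Python, not the GASPI or GPI-2 API: the names `Segment`, `write_notify` and `notify_test_and_reset` are hypothetical, and the convention that a notification value of 0 means "not set" is an assumption of the model.

```python
class Segment:
    """Toy stand-in for a memory segment with notification slots (hypothetical)."""

    def __init__(self, size, num_notifications=4):
        self.data = bytearray(size)
        # In this model, 0 means "no notification has arrived".
        self.notifications = [0] * num_notifications


def write_notify(local, local_off, remote, remote_off, size, notif_id, notif_value):
    """One-sided write: the initiator alone supplies every argument."""
    if notif_value == 0:
        raise ValueError("notification value must be non-zero in this model")
    if local_off + size > len(local.data) or remote_off + size > len(remote.data):
        raise ValueError("transfer does not fit inside the segments")
    # Move the payload first, then raise the notification on the remote side,
    # so that a visible notification implies the data are already in place.
    remote.data[remote_off:remote_off + size] = local.data[local_off:local_off + size]
    remote.notifications[notif_id] = notif_value


def notify_test_and_reset(segment, notif_id):
    """Return the current notification value and clear the slot."""
    value = segment.notifications[notif_id]
    segment.notifications[notif_id] = 0
    return value


# Usage: the receiver never posts a matching receive; it only checks a slot.
sender = Segment(8)
receiver = Segment(8)
sender.data[0:4] = b"heat"
write_notify(sender, 0, receiver, 4, 4, notif_id=1, notif_value=7)
```

The receiving side takes no part in the transfer itself. It only inspects the notification slot, and a non-zero value tells it that the corresponding part of its buffer is valid for reading.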
For the coupling of GASPI and task-based runtimes such as OmpSs, the tasks need to know that the data transfer has finished successfully on both the receiving and the sending side. Only then is it possible to decide on buffer validity on the receiving side (which allows a local read) and buffer reusability on the sending side (which allows a local overwrite). TAGASPI follows a local completion approach: it adds functionality to GPI-2 for waiting on the local, i.e. sender-side, completion of a GASPI write, which confirms that the data have been sent from the sender buffer and that the buffer can be reused. To this end, the native notification mechanism of GPI-2 is used, while OmpSs-2 registers the tasks (i.e. the sender and the receiver) with a polling service which checks and waits for completion of the communication. This requires extending the GASPI specification so that communication requests are tagged: each request takes a unique tag ID as an additional parameter.
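The interplay between tagged requests and the polling service can be sketched as follows. This is a simplified model under stated assumptions, not the TAGASPI or OmpSs-2 implementation: the class `PollingService`, its methods, and the representation of tasks as plain strings are all hypothetical, and completions are injected by hand instead of coming from the network.

```python
from collections import deque


class PollingService:
    """Toy polling service that releases tasks when their tagged request completes."""

    def __init__(self):
        self.waiting = {}           # tag -> task blocked on that request
        self.completions = deque()  # tags reported complete (simulated network)
        self.released = []          # tasks whose communication has finished

    def register(self, tag, task):
        """Bind a task to the unique tag of the request it issued."""
        if tag in self.waiting:
            raise ValueError(f"tag {tag} is already in use")
        self.waiting[tag] = task

    def report_completion(self, tag):
        """Simulate the communication layer reporting that a request finished."""
        self.completions.append(tag)

    def poll(self):
        """Drain reported completions and release the matching tasks."""
        while self.completions:
            tag = self.completions.popleft()
            task = self.waiting.pop(tag, None)
            if task is not None:
                self.released.append(task)
        return list(self.released)


# Usage: the sender task waits for local completion (buffer reusable),
# the receiver task waits for its notification (buffer valid).
service = PollingService()
service.register(1, "send_task")
service.register(2, "recv_task")
first = service.poll()       # nothing has completed yet
service.report_completion(2)
second = service.poll()      # only the receiver task is released
service.report_completion(1)
third = service.poll()       # now both tasks are released
```

The unique tag is what lets the service map a completion event back to exactly one task, which is why the model rejects a tag that is already registered.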
TAGASPI requires OmpSs-2 and the version of GPI-2 extended with the low-level operations, as well as the typical GNU configuration tools (autoconf, automake, etc.), C and C++ compilers and the Boost library. TAGASPI follows the OmpSs-2 programming model and, in general, calls to the basic communication routines of the extended GASPI follow the same conventions as in the original GPI-2, which makes its use largely transparent to the user. After configuration and installation, an application only needs to be compiled with the Mercurium compiler, as usual, and linked against the TAGASPI and GPI-2 libraries.
A detailed example for the 2D heat equation can be found at https://pm.bsc.es/gitlab/ompss-2/examples/heat.git