The purpose of their participation was to improve the performance of AVBP, a parallel CFD code that solves the three-dimensional compressible Navier-Stokes equations on unstructured and hybrid grids. The code is one of five applications that EPEEC is developing as technology demonstrators to showcase an interdisciplinary co-design approach. The workshop offered a unique opportunity to be among the first to test the AMD architecture on the first European system of its kind, and to work alongside Atos and AMD engineers to envision future optimizations.
The hackathon was a success. The team ported the AVBP solver to the AMD Rome system available at GENCI-TGCC (IRENE Joliot Curie). Characterizing the application on this architecture showed that its performance depends about one third on bandwidth and two thirds on compute. Strong scaling up to 130k cores was measured with Open MPI and provided an acceleration of 75% without optimisations. Weak scaling up to 32k MPI ranks suggested that decimating the processes by a factor of 2 improves computational efficiency by up to 30%. This in turn suggests that, when MPI imbalance exceeds 30%, trading imbalance against decimation can improve time to solution.
The workshop enabled the team to implement a thread-based parallel model, built on access-based coloring, within the usually flat-MPI AVBP code, and to test it on both Skylake and AMD processors. The coloring completely removes the need for any synchronization between threads, a key feature for performance. The final code shows only a 10% overhead compared to full MPI, both with 36 threads against 36 MPI ranks on Intel Skylake and with 4 MPI ranks of 32 threads each against 128 MPI ranks on AMD Rome.
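To illustrate the general idea behind coloring, here is a minimal Python sketch of a generic greedy element coloring. It is not AVBP's implementation: how AVBP builds and schedules its colors is not described here, and the mesh, the function names (`color_elements`, `group_by_color`, `scatter_by_color`) and the unit contribution are all hypothetical. The point it shows is that elements of the same color never touch the same node, so a thread-parallel loop over one color can accumulate into node arrays without locks or atomics.

```python
from collections import defaultdict


def color_elements(elements):
    """Greedy coloring: two elements that share a node never share a color."""
    node_colors = defaultdict(set)  # node -> colors of elements touching it
    colors = []
    for nodes in elements:
        # Colors already taken by any neighbor (an element sharing a node).
        used = set().union(*(node_colors[n] for n in nodes))
        color = 0
        while color in used:
            color += 1
        colors.append(color)
        for n in nodes:
            node_colors[n].add(color)
    return colors


def group_by_color(colors):
    """Return the element indices of each color, ordered by color."""
    groups = defaultdict(list)
    for element, color in enumerate(colors):
        groups[color].append(element)
    return [groups[c] for c in sorted(groups)]


def scatter_by_color(elements, groups, n_nodes):
    """Accumulate element contributions into nodes, one color at a time."""
    residual = [0.0] * n_nodes
    for group in groups:      # colors are processed one after another
        for e in group:       # in a threaded code, this loop would be shared
            for n in elements[e]:
                residual[n] += 1.0  # placeholder for a real contribution
    return residual


# Hypothetical mesh: four triangles on six nodes.
elements = [(0, 1, 3), (1, 4, 3), (1, 2, 4), (2, 5, 4)]
colors = color_elements(elements)
groups = group_by_color(colors)
residual = scatter_by_color(elements, groups, n_nodes=6)
print(colors, groups, residual)
```

In this sketch the first and last triangles share no node, so they receive the same color and could be handled by different threads at the same time. The sketch runs serially; it only demonstrates that writes within a color are conflict-free.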
You may see the final presentation of the AVBP team here: https://ecfd.coriacfd.fr/images/b/b1/Ecfd3_final_project1.pdf