Hybrid Parallel Compressible RANS Solvers by pcarruscag · Pull Request #861 · su2code/SU2
Proposed Changes
Extend the MPI+Threads framework to CEulerSolver, CNSSolver and the CTurb***Solver(s).
I can finally start showing what this effort is about in pictures. This is how the relatively small benchmark case from #716 scales on Imperial's cluster:

C is for cores, R for MPI ranks, and T for OpenMP threads per rank.
It may not look very close to ideal scaling, but on 192 cores there are only 2.7k cells per core, and at different levels of "hybridisation" this is what you get:

As mentioned in #789, this behaviour is due to the worse quality / quantity of coarse grids that can be generated as the domain is decomposed further and further.
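To make the C/R/T labels concrete, here is a minimal sketch that lists the rank/thread layouts filling a given core count (C = R x T) and the approximate mesh size implied by a cells-per-core figure. `HybridLayouts` and `ApproxTotalCells` are hypothetical helper names, not part of SU2, and the sketch ignores real constraints such as the number of cores per node.

```cpp
#include <utility>
#include <vector>

// Hypothetical helper (not SU2 code): every (ranks, threadsPerRank) pair
// whose product equals the total core count, i.e. C = R * T.
std::vector<std::pair<int, int>> HybridLayouts(int cores) {
  std::vector<std::pair<int, int>> layouts;
  for (int threads = 1; threads <= cores; ++threads) {
    if (cores % threads == 0) {
      layouts.emplace_back(cores / threads, threads);  // (R, T)
    }
  }
  return layouts;
}

// Hypothetical helper: rough total cell count implied by a cells-per-core
// figure, assuming the mesh is evenly distributed over all cores.
long ApproxTotalCells(int cores, long cellsPerCore) {
  return static_cast<long>(cores) * cellsPerCore;
}
```

For 192 cores the layouts run from pure MPI (192 ranks, 1 thread each) to a single rank with 192 threads, and 2.7k cells per core corresponds to roughly half a million cells in total.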
EDIT
This is finished for now. Apart from the odd routine that I will document below (and all the boundary conditions), every method called in SingleGrid_Iteration and MultiGrid_Iteration (including prolongations, smoothings and whatnot) is hybrid parallel.
Threads are started in those functions, which means that for a RANS case we enter parallel regions only twice per iteration; this should give minimal overhead. I will update the scalability results once they converge (statistically :) ).
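The idea of starting threads once in the driver and sharing work inside the routines it calls can be sketched with OpenMP "orphaned" work-sharing loops. This is an illustration only, not the code in this PR: `ComputeResidual`, `UpdateSolution` and `Iterate` are hypothetical stand-ins, and the arithmetic is a placeholder.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a solver routine (not an SU2 function).
// The "omp for" is orphaned: it opens no parallel region of its own and
// only divides the loop among the threads of the team that calls it.
void ComputeResidual(const std::vector<double>& u, std::vector<double>& res) {
  assert(u.size() == res.size());
  const long n = static_cast<long>(u.size());
  #pragma omp for schedule(static)
  for (long i = 0; i < n; ++i) res[i] = -0.5 * u[i];
  // Implicit barrier here: all of res is written before any thread goes on.
}

// Hypothetical stand-in for a second routine that uses the same team.
void UpdateSolution(std::vector<double>& u, const std::vector<double>& res) {
  assert(u.size() == res.size());
  const long n = static_cast<long>(u.size());
  #pragma omp for schedule(static)
  for (long i = 0; i < n; ++i) u[i] += res[i];
}

// Hypothetical driver: the only place where a parallel region is entered.
// Every thread executes the k-loop and meets the same work-sharing loops in
// the same order, so the thread start-up cost is paid once per call instead
// of once per routine.
void Iterate(std::vector<double>& u, int nInner) {
  std::vector<double> res(u.size(), 0.0);
  #pragma omp parallel
  {
    for (int k = 0; k < nInner; ++k) {
      ComputeResidual(u, res);
      UpdateSolution(u, res);
    }
  }
}
```

Built with OpenMP enabled (for example `-fopenmp` with GCC or Clang), `Iterate` forks threads once regardless of `nInner`; without it the pragmas are ignored and the same code runs serially.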
I will document the changes by "themes" below.
Related Work
Another snapshot of #824
PR Checklist
- I am submitting my contribution to the develop branch.
- My contribution generates no new compiler warnings (try with the '-Wall -Wextra -Wno-unused-parameter -Wno-empty-body' compiler flags).
- My contribution is commented and consistent with SU2 style.
- I have added a test case that demonstrates my contribution, if necessary.
- I have updated appropriate documentation (Tutorials, Docs Page, config_template.cpp), if necessary.