Improved Tape Statistics by jblueh · Pull Request #2235 · su2code/SU2

Proposed Changes

Prior to this PR, tape statistics were collected and printed only for the tape of thread 0 of rank 0. The one exception was the memory usage of thread 0, which was reduced across MPI processes. This PR also accounts for the tapes used in OpenMP parallel regions (threads other than thread 0) and reduces all statistics, not only the used memory, across MPI processes.
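The two-level reduction described above can be sketched in plain C++. This is only an illustration, not the code in this PR: the `TapeStats` struct, its fields, and the functions `combine`, `reduceThreads`, and `reduceRanks` are hypothetical names, summation is assumed as the reduction operator, and ordinary loops stand in for the OpenMP and MPI reductions (for example `MPI_Allreduce` with `MPI_SUM`).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-tape counters; not the actual AD tool or SU2 data layout.
struct TapeStats {
  std::size_t statements;
  std::size_t arguments;
  double memoryUsedMB;
};

// Combine two sets of counters. Summation is an assumption made for this sketch.
TapeStats combine(const TapeStats& a, const TapeStats& b) {
  return {a.statements + b.statements, a.arguments + b.arguments,
          a.memoryUsedMB + b.memoryUsedMB};
}

// Step 1: reduce over the tapes of all threads of one rank.
// A plain loop stands in for an OpenMP reduction here.
TapeStats reduceThreads(const std::vector<TapeStats>& threads) {
  TapeStats total{};  // value-initialized: all counters start at zero
  for (const TapeStats& t : threads) {
    assert(t.memoryUsedMB >= 0.0);  // sanity check on the input
    total = combine(total, t);
  }
  return total;
}

// Step 2: reduce the per-rank totals over all ranks.
// A plain loop stands in for an MPI reduction here.
TapeStats reduceRanks(const std::vector<std::vector<TapeStats>>& ranks) {
  TapeStats total{};
  for (const std::vector<TapeStats>& threads : ranks) {
    total = combine(total, reduceThreads(threads));
  }
  return total;
}
```

Calling `reduceRanks` with one inner vector per rank, each holding one entry per thread, yields totals that cover every thread of every rank rather than only thread 0 of rank 0.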

Related Work

Any prior work on hybrid AD.

PR Checklist

  • I am submitting my contribution to the develop branch.
  • My contribution generates no new compiler warnings (try with --warnlevel=3 when using meson).
  • My contribution is commented and consistent with SU2 style (https://su2code.github.io/docs_v7/Style-Guide/).
  • I used the pre-commit hook to prevent dirty commits and used pre-commit run --all to format old commits.
  • I have added a test case that demonstrates my contribution, if necessary.
  • I have updated appropriate documentation (Tutorials, Docs Page, config_template.cfg), if necessary.