Improved Tape Statistics by jblueh · Pull Request #2235 · su2code/SU2
Proposed Changes
Prior to this PR, tape statistics were collected and printed only for the tape of thread 0 of rank 0. The one exception was the memory usage of thread 0, which was reduced across MPI processes. This PR also accounts for the tapes used in OpenMP parallel regions (threads other than thread 0) and reduces all statistics, not only used memory, across MPI processes.
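The two-level reduction described above can be sketched in plain C++ as follows. This is only an illustration, not the code in this PR: `TapeStats`, `Reduce`, and `ReduceAll` are hypothetical names, the MPI and OpenMP layers are simulated with nested vectors so the snippet runs without either library, and summation is an assumed reduction operation (the actual operation used for each statistic may differ).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical per-tape counters; the field names are illustrative only
// and are not taken from SU2 or its AD tool.
struct TapeStats {
  std::size_t statements = 0;
  std::size_t memoryBytes = 0;
};

// Sum the statistics of a collection of tapes. The same helper serves both
// levels: the threads of one rank, or the per-rank totals of a run.
// Summation is an assumption made for this sketch.
TapeStats Reduce(const std::vector<TapeStats>& parts) {
  TapeStats total;
  for (const auto& p : parts) {
    total.statements += p.statements;
    total.memoryBytes += p.memoryBytes;
  }
  return total;
}

// Two-level reduction: first over the threads of each rank (the OpenMP
// level), then over the per-rank totals (the MPI level).
// ranks[r][t] holds the statistics of the tape of thread t on rank r.
TapeStats ReduceAll(const std::vector<std::vector<TapeStats>>& ranks) {
  std::vector<TapeStats> perRank;
  perRank.reserve(ranks.size());
  for (const auto& threads : ranks) perRank.push_back(Reduce(threads));
  return Reduce(perRank);
}
```

In a real hybrid run, the inner step would combine the tapes of the threads in an OpenMP region and the outer step would be an MPI reduction. The previous behaviour corresponds to reporting only `ranks[0][0]`, apart from the memory figure.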
Related Work
any prior work on hybrid AD
PR Checklist
- I am submitting my contribution to the develop branch.
- My contribution generates no new compiler warnings (try with --warnlevel=3 when using meson).
- My contribution is commented and consistent with SU2 style (https://su2code.github.io/docs_v7/Style-Guide/).
- I used the pre-commit hook to prevent dirty commits and used pre-commit run --all to format old commits.
- I have added a test case that demonstrates my contribution, if necessary.
- I have updated appropriate documentation (Tutorials, Docs Page, config_template.cpp), if necessary.