Hybrid Parallel AD Performance Improvements by jblueh · Pull Request #2039 · su2code/SU2
Proposed Changes
When adjoints are accessed with bounds checking, CoDiPack acquires an internal lock on every access. This turned out to be a bottleneck in InitializeAdjoint and IterateDiscreteAdjoint. Therefore, we ensure that the adjoint vector is sufficiently large up front and skip the bounds checking in these routines.
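The idea can be illustrated with a minimal, self-contained sketch. It is not the SU2 or CoDiPack code: `AdjointVector`, `getChecked`, `ensureSize`, `unchecked`, and `seedAdjoints` are hypothetical names chosen for illustration, and the lock is modeled with a plain `std::mutex`. The sketch only shows the pattern of paying for one locked resize up front so that the hot loop can use lock-free, unchecked access.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical stand-in for a shared adjoint vector (not CoDiPack's interface).
class AdjointVector {
 public:
  // Bounds-checked read: may grow the storage, so it takes the lock on every call.
  double getChecked(std::size_t i) {
    std::lock_guard<std::mutex> guard(mutex_);
    if (i >= data_.size()) data_.resize(i + 1, 0.0);
    return data_[i];
  }

  // One locked resize that guarantees indices 0..maxIndex are valid afterwards.
  void ensureSize(std::size_t maxIndex) {
    std::lock_guard<std::mutex> guard(mutex_);
    if (maxIndex >= data_.size()) data_.resize(maxIndex + 1, 0.0);
  }

  // Unchecked access: no lock and no resize. The caller must have called
  // ensureSize() with a large enough index beforehand.
  double& unchecked(std::size_t i) {
    assert(i < data_.size());  // debug-only safety net
    return data_[i];
  }

  std::size_t size() const { return data_.size(); }

 private:
  std::vector<double> data_;
  std::mutex mutex_;
};

// Seeds the given adjoint entries. The storage is sized once, then the loop
// uses unchecked access only. In a hybrid parallel setting, a loop like this
// could be split among threads, as long as each thread writes distinct indices.
void seedAdjoints(AdjointVector& adjoints, const std::vector<std::size_t>& indices,
                  double seed) {
  if (indices.empty()) return;
  adjoints.ensureSize(*std::max_element(indices.begin(), indices.end()));
  for (std::size_t i : indices) adjoints.unchecked(i) = seed;
}
```

In this sketch, the lock is taken once per call to `seedAdjoints` instead of once per adjoint access, which is the effect the change aims for in InitializeAdjoint and IterateDiscreteAdjoint.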
Related Work
Previous work on hybrid parallel AD, such as #1214, #1284, and #1294.
PR Checklist
- I am submitting my contribution to the develop branch.
- My contribution generates no new compiler warnings (try with --warnlevel=3 when using meson).
- My contribution is commented and consistent with SU2 style (https://su2code.github.io/docs_v7/Style-Guide/).
- I have added a test case that demonstrates my contribution, if necessary.
- I have updated appropriate documentation (Tutorials, Docs Page, config_template.cpp), if necessary.