LU_SGS Preconditioner GPU Port by areenraj · Pull Request #2539 · su2code/SU2
Proposed Changes
Addition of Graph Partitioning Algorithms
A class of algorithms now exists for dividing the Jacobian matrix into partitions that can be executed in parallel during forward or backward substitution (triangular solves). The implementation is structured like the CMatrixVectorProduct class, with the new CGraphPartitioning class as its analogue. A child class can be derived for each algorithm, with its own member functions for carrying out the partition, reorder, or modify operations.
Currently, only level scheduling (from Pedro's old commits) has been implemented. We can look into adding multicoloring algorithms as well. The method is selected through user-defined flags and currently defaults to level scheduling.
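For reviewers unfamiliar with level scheduling, here is a minimal host-side sketch of the idea. It is not the code in this PR: the function names ComputeLevels and GroupByLevel and the plain CSR arrays are assumptions made for illustration. A row's level is one more than the highest level among the lower-triangular entries it depends on, so all rows that share a level can be solved in parallel during the forward substitution.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not the CGraphPartitioning API): assigns each row of a
// CSR sparsity pattern to a level for the forward (lower-triangular) solve.
// Row i depends on every column j < i present in its row, so
// level[i] = 1 + max(level[j]) over those j, or 0 if there are none.
std::vector<std::size_t> ComputeLevels(const std::vector<std::size_t>& rowPtr,
                                       const std::vector<std::size_t>& colInd) {
  assert(!rowPtr.empty());
  const std::size_t nRows = rowPtr.size() - 1;
  std::vector<std::size_t> level(nRows, 0);
  for (std::size_t i = 0; i < nRows; ++i) {
    for (std::size_t k = rowPtr[i]; k < rowPtr[i + 1]; ++k) {
      const std::size_t j = colInd[k];
      // Rows are visited in ascending order, so level[j] is final for j < i.
      if (j < i) level[i] = std::max(level[i], level[j] + 1);
    }
  }
  return level;
}

// Groups row indices by level. Rows inside one group have no mutual
// dependencies and can be processed concurrently; groups run in order.
std::vector<std::vector<std::size_t>> GroupByLevel(
    const std::vector<std::size_t>& level) {
  std::size_t nLevels = 0;
  for (std::size_t l : level) nLevels = std::max(nLevels, l + 1);
  std::vector<std::vector<std::size_t>> groups(nLevels);
  for (std::size_t i = 0; i < level.size(); ++i) groups[level[i]].push_back(i);
  return groups;
}
```

The backward substitution can use the same idea with the dependency test reversed (columns j > i, rows visited in descending order).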
LU_SGS Preconditioner
The LU_SGS preconditioner has now been fully ported to the GPU. This port is still preliminary, however, and will need some optimization. It is divided into two parts, the first and second symmetric iterations, each of which is its own kernel. A device Gaussian elimination function is also introduced to perform the forward and backward substitution.
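As a reference for what the two sweeps compute, here is a serial host-side sketch, assuming the usual LU-SGS form M = (D + L) D^{-1} (D + U) applied to a block-CSR matrix. The BlockCSR struct and the function names are hypothetical and do not mirror the kernels in this PR. The sketch also assumes every row stores its diagonal block and that the diagonal blocks can be factored without pivoting.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical block-CSR container: dense n x n row-major blocks, back to back.
struct BlockCSR {
  std::size_t n;                            // block size (variables per point)
  std::vector<std::size_t> rowPtr, colInd;  // block-level CSR pattern
  std::vector<double> val;                  // n*n values per stored block
};

// Solves the dense n x n system A x = b (b is overwritten with x) by Gaussian
// elimination without pivoting; assumes every pivot is nonzero.
void GaussElimination(std::vector<double> A, std::vector<double>& b, std::size_t n) {
  for (std::size_t p = 0; p < n; ++p) {  // forward elimination
    assert(A[p * n + p] != 0.0);
    for (std::size_t i = p + 1; i < n; ++i) {
      const double w = A[i * n + p] / A[p * n + p];
      for (std::size_t j = p; j < n; ++j) A[i * n + j] -= w * A[p * n + j];
      b[i] -= w * b[p];
    }
  }
  for (std::size_t i = n; i-- > 0;) {  // back substitution
    for (std::size_t j = i + 1; j < n; ++j) b[i] -= A[i * n + j] * b[j];
    b[i] /= A[i * n + i];
  }
}

// out += sign * (blk * x) for one dense n x n block.
void AddBlockProduct(const double* blk, const double* x, double sign,
                     std::vector<double>& out, std::size_t n) {
  for (std::size_t r = 0; r < n; ++r)
    for (std::size_t c = 0; c < n; ++c) out[r] += sign * blk[r * n + c] * x[c];
}

// Applies z = M^{-1} r with M = (D + L) D^{-1} (D + U).
std::vector<double> ApplyLUSGS(const BlockCSR& M, const std::vector<double>& r) {
  const std::size_t n = M.n, nRows = M.rowPtr.size() - 1;
  assert(r.size() == nRows * n);
  std::vector<double> z(r.size(), 0.0), rhs(n), diag(n * n);

  // First sweep, rows ascending: D y_i = r_i - sum_{j<i} A_ij y_j (z holds y).
  for (std::size_t i = 0; i < nRows; ++i) {
    rhs.assign(r.begin() + i * n, r.begin() + (i + 1) * n);
    for (std::size_t k = M.rowPtr[i]; k < M.rowPtr[i + 1]; ++k) {
      const std::size_t j = M.colInd[k];
      const double* blk = &M.val[k * n * n];
      if (j == i) diag.assign(blk, blk + n * n);
      else if (j < i) AddBlockProduct(blk, &z[j * n], -1.0, rhs, n);
    }
    GaussElimination(diag, rhs, n);
    for (std::size_t c = 0; c < n; ++c) z[i * n + c] = rhs[c];
  }

  // Second sweep, rows descending: D z_i = D y_i - sum_{j>i} A_ij z_j.
  for (std::size_t i = nRows; i-- > 0;) {
    rhs.assign(n, 0.0);
    for (std::size_t k = M.rowPtr[i]; k < M.rowPtr[i + 1]; ++k) {
      const std::size_t j = M.colInd[k];
      const double* blk = &M.val[k * n * n];
      if (j == i) {
        diag.assign(blk, blk + n * n);
        AddBlockProduct(blk, &z[i * n], 1.0, rhs, n);  // z_i still holds y_i
      } else if (j > i) {
        AddBlockProduct(blk, &z[j * n], -1.0, rhs, n);  // z_j already updated
      }
    }
    GaussElimination(diag, rhs, n);
    for (std::size_t c = 0; c < n; ++c) z[i * n + c] = rhs[c];
  }
  return z;
}
```

Each sweep is sequential in this form because row i needs the results of earlier rows. A partitioning such as level scheduling is what exposes rows that can run concurrently within a sweep.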
MVP Update
The matrix-vector product algorithm now has fully coalesced memory access for our row-major matrices.
Minor Changes
The matrix variables have all been gathered into a separate structure, matrixParam. This allows us to pass the multiple variables a kernel function needs through a single structure. It also contains helper functions to check for illegal memory accesses. The CUDA block size can now be specified as a config variable and defaults to 1024.
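To illustrate the pattern, here is a minimal host-side sketch of a parameter bundle of this kind. The struct name MatrixParamsSketch, its fields, and its helpers are assumptions made for illustration and are not the actual members of matrixParam; only the default block size of 1024 comes from the description above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a kernel parameter bundle; field names are
// illustrative only and are not taken from the PR.
struct MatrixParamsSketch {
  std::size_t totalRows;        // number of block rows in the matrix
  std::size_t blockRowSize;     // rows per dense block
  std::size_t blockColSize;     // columns per dense block
  std::size_t threadsPerBlock;  // CUDA block size, configurable

  MatrixParamsSketch(std::size_t rows, std::size_t blockRows,
                     std::size_t blockCols, std::size_t threads = 1024)
      : totalRows(rows), blockRowSize(blockRows), blockColSize(blockCols),
        threadsPerBlock(threads) {}

  // Number of scalar entries in one dense block.
  std::size_t BlockElems() const { return blockRowSize * blockColSize; }

  // Guard against out-of-range rows. The index is unsigned, so only the
  // upper bound needs checking.
  bool ValidRow(std::size_t row) const { return row < totalRows; }

  // Guard against out-of-range entries inside one block.
  bool ValidBlockEntry(std::size_t r, std::size_t c) const {
    return r < blockRowSize && c < blockColSize;
  }

  // Number of thread blocks needed to cover nWorkItems (ceiling division).
  std::size_t GridSize(std::size_t nWorkItems) const {
    assert(threadsPerBlock > 0);
    return (nWorkItems + threadsPerBlock - 1) / threadsPerBlock;
  }
};
```

A kernel would then take one such object by value and call the guard at the top, returning early for threads whose index falls past the last row.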
Related Work
Continuation of the previous PR that was merged into develop: #2346
PR Checklist
Put an X by all that apply. You can fill this out after submitting the PR. If you have any questions, don't hesitate to ask! We want to help. These are a guide for you to know what the reviewers will be looking for in your contribution.
A LOT of warnings are thrown when compiling in debug mode, but they all come from unnecessary comparisons of an unsigned integer with zero. I'm not sure which line of code is causing these warnings and would appreciate any help.
- I am submitting my contribution to the develop branch.
- My contribution generates no new compiler warnings (try with --warnlevel=3 when using meson).
- My contribution is commented and consistent with SU2 style (https://su2code.github.io/docs_v7/Style-Guide/).
- I used the pre-commit hook to prevent dirty commits and used pre-commit run --all to format old commits.
- I have added a test case that demonstrates my contribution, if necessary.
- I have updated appropriate documentation (Tutorials, Docs Page, config_template.cpp), if necessary.