fix: Destroy cuda graphs before setting weight streaming by keehyuna · Pull Request #3461 · pytorch/TensorRT

Description

When CUDA graphs and weight streaming are used together, the CUDA graph was destroyed only after the weight streaming setting had been applied.
Setting weight streaming recreates the context and loads a new module, so destroying the CUDA graph through the old reference crashed the application. The fix moves the CUDA graph reset before the weight streaming setting.
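
The ordering problem can be illustrated with a small pure-Python sketch. It does not use the Torch-TensorRT or TensorRT APIs: FakeContext, FakeGraph, FakeEngine, and their methods are hypothetical stand-ins invented for illustration, and the RuntimeError stands in for the crash described above.

```python
class FakeContext:
    """Hypothetical stand-in for a context; not a real TensorRT object."""

    def __init__(self):
        self.alive = True


class FakeGraph:
    """Hypothetical stand-in for a captured CUDA graph bound to one context."""

    def __init__(self, context):
        self.context = context

    def destroy(self):
        # In the real bug this was a crash; here it is a Python exception.
        if not self.context.alive:
            raise RuntimeError("graph destroyed after its context was replaced")


class FakeEngine:
    """Hypothetical module that owns one context and at most one graph."""

    def __init__(self):
        self.context = FakeContext()
        self.graph = None
        self.budget = None

    def capture(self):
        self.graph = FakeGraph(self.context)

    def reset_graph(self):
        if self.graph is not None:
            self.graph.destroy()
            self.graph = None

    def _recreate_context(self, budget):
        # Applying the setting invalidates the old context and builds a new one.
        self.context.alive = False
        self.context = FakeContext()
        self.budget = budget

    def set_weight_streaming_buggy(self, budget):
        # Old ordering: the graph still references the invalidated context.
        self._recreate_context(budget)
        self.reset_graph()

    def set_weight_streaming_fixed(self, budget):
        # New ordering: release the graph while its context is still valid.
        self.reset_graph()
        self._recreate_context(budget)


def buggy_ordering_fails():
    engine = FakeEngine()
    engine.capture()
    try:
        engine.set_weight_streaming_buggy(1024)
    except RuntimeError:
        return True
    return False


engine = FakeEngine()
engine.capture()
engine.set_weight_streaming_fixed(1024)
```

In this sketch the fixed path leaves no graph attached to the engine and a fresh, valid context in place, while the buggy path fails as soon as the stale graph is destroyed.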

The timing of __del__ is not entirely predictable in Python. The CUDA graph reset logic has moved from __del__ to a dedicated reset_cudagraph method, which is called when exiting the CudaGraphsTorchTensorRTModule context block.
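
A minimal sketch of this pattern, assuming a context-manager style module, is shown here. GraphModuleSketch is a hypothetical class written for illustration and is not the actual CudaGraphsTorchTensorRTModule implementation; only the method name reset_cudagraph is taken from the description above, and the captured graph is represented by a plain Python object.

```python
class GraphModuleSketch:
    """Hypothetical module that releases its graph deterministically on exit."""

    def __init__(self):
        self.cudagraph = None
        self.reset_calls = 0

    def capture(self):
        # Placeholder for graph capture; a real module would record GPU work.
        self.cudagraph = object()

    def reset_cudagraph(self):
        # Idempotent: only does work while a graph is actually held.
        if self.cudagraph is not None:
            self.cudagraph = None
            self.reset_calls += 1

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs at a known point, unlike __del__, whose timing depends on
        # garbage collection and interpreter shutdown.
        self.reset_cudagraph()
        return False  # do not suppress exceptions raised inside the block


with GraphModuleSketch() as module:
    module.capture()
    held_inside_block = module.cudagraph is not None

# Calling reset again after the block is harmless in this sketch.
module.reset_cudagraph()
```

Because the reset runs in __exit__, the graph is released as soon as the with block ends, including when the block exits through an exception.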

Fixes #3460

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Checklist:

  • My code follows the style guidelines of this project (You can use the linters)
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas and hacks
  • I have made corresponding changes to the documentation
  • I have added tests to verify my fix or my feature
  • New and existing unit tests pass locally with my changes
  • I have added the relevant labels to my PR so that relevant reviewers are notified