perf(fp-history): batch false positive history processing by valentijnscholten · Pull Request #14449 · DefectDojo/django-DefectDojo
Replaces the N+1 query pattern in false positive history with a single product-scoped DB query per batch, and switches per-finding save() calls to QuerySet.update() to eliminate redundant signal overhead. Changes: - Extract _fp_candidates_qs() as the single algorithm-dispatch helper shared by both single-finding and batch lookup paths - Add do_false_positive_history_batch() which fetches all FP candidates in one query and marks findings with a single UPDATE - do_false_positive_history() now delegates to the batch function - post_process_findings_batch (import/reimport) calls the batch function instead of a per-finding loop - _bulk_update_finding_status_and_severity (bulk edit) groups findings by (product, dedup_alg) and calls the batch function once per group; retroactive reactivation also batched the same way - Fix dead-code bug in process_false_positive_history: the condition finding.false_p and not finding.false_p was always False because form.save(commit=False) mutates the finding in place; fixed by capturing old_false_p before the form save - Replace all per-finding save()/save_no_options() in FP history paths with QuerySet.update() (bypasses signals identically to the old calls) - Move all FP history helpers from dojo/utils.py to dojo/finding/deduplication.py alongside the matching dedupe helpers All update() calls carry a comment explaining the signal-bypass equivalence with the previous save(skip_validation=True) calls. Adds 4 unit tests covering: batch single-query behaviour, retroactive batch FP marking, retroactive reactivation (previously dead code), and the no-reactivation guard.
Limit _fetch_fp_candidates_for_batch to only the fields actually read from candidate objects (id, false_p, active, hash_code, unique_id_from_tool, title, severity), avoiding loading unused columns. Correct update() comments to clarify that .only() does not constrain QuerySet.update() — Django generates UPDATE SQL independently — so the sync requirement is only for fields *read* from candidate objects.
assertNumQueries(7) on both batch tests covers: System_Settings, 4 lazy-load chain (test/engagement/product/test_type from findings[0]), candidates SELECT with .only(), and the bulk UPDATE — fixed regardless of batch size or number of retroactively marked findings.
tejas0077 pushed a commit to tejas0077/django-DefectDojo that referenced this pull request
Mar 30, 2026…#14449) * perf(fp-history): batch false positive history processing Replaces the N+1 query pattern in false positive history with a single product-scoped DB query per batch, and switches per-finding save() calls to QuerySet.update() to eliminate redundant signal overhead. Changes: - Extract _fp_candidates_qs() as the single algorithm-dispatch helper shared by both single-finding and batch lookup paths - Add do_false_positive_history_batch() which fetches all FP candidates in one query and marks findings with a single UPDATE - do_false_positive_history() now delegates to the batch function - post_process_findings_batch (import/reimport) calls the batch function instead of a per-finding loop - _bulk_update_finding_status_and_severity (bulk edit) groups findings by (product, dedup_alg) and calls the batch function once per group; retroactive reactivation also batched the same way - Fix dead-code bug in process_false_positive_history: the condition finding.false_p and not finding.false_p was always False because form.save(commit=False) mutates the finding in place; fixed by capturing old_false_p before the form save - Replace all per-finding save()/save_no_options() in FP history paths with QuerySet.update() (bypasses signals identically to the old calls) - Move all FP history helpers from dojo/utils.py to dojo/finding/deduplication.py alongside the matching dedupe helpers All update() calls carry a comment explaining the signal-bypass equivalence with the previous save(skip_validation=True) calls. Adds 4 unit tests covering: batch single-query behaviour, retroactive batch FP marking, retroactive reactivation (previously dead code), and the no-reactivation guard. * perf(fp-history): add .only() to candidate fetch, fix update() comments Limit _fetch_fp_candidates_for_batch to only the fields actually read from candidate objects (id, false_p, active, hash_code, unique_id_from_tool, title, severity), avoiding loading unused columns. Correct update() comments to clarify that .only() does not constrain QuerySet.update() — Django generates UPDATE SQL independently — so the sync requirement is only for fields *read* from candidate objects. * test(fp-history): assert exact query count in batch tests assertNumQueries(7) on both batch tests covers: System_Settings, 4 lazy-load chain (test/engagement/product/test_type from findings[0]), candidates SELECT with .only(), and the bulk UPDATE — fixed regardless of batch size or number of retroactively marked findings. * test(fp-history): assert query count stays flat with N affected findings New test creates 5 pre-existing findings and asserts the batch still uses exactly 7 queries regardless — proving the old O(N) per-finding save loop is gone and a single bulk UPDATE covers all affected rows.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters