Implement intelligent test selection in CI by Copilot · Pull Request #3046 · DeepLabCut/DeepLabCut

Copilot AI changed the title from "[WIP] Optimize test suite runtime based on PR changes" to "Implement intelligent test selection system to reduce CI runtime to ~5 minutes"

Jul 5, 2025

Copilot stopped work on behalf of MMathisLab due to an error

July 5, 2025 21:09

MMathisLab

Stop forcing a full test run when more than two non-doc categories are matched. The early escalation (setting lanes.full, adding reasons, and returning a SelectorResult) was removed so selection will continue to aggregate pytest paths and functional scripts instead. Also add comments to test_selector_config.py documenting the structure and purpose of CATEGORY_RULES.
Migrate selector tests to the new lane-based API: replace Plan enum assertions with res.lanes.<lane> boolean checks, and update reasons to use the "category:<name>" format. Add assertions for lane_reasons and provenance mappings where appropriate, verify minimal pytest mapping for CI workflows, and ensure pytest/functional outputs remain deduplicated and sorted. Rename/clarify some tests and update comments to reflect the new behavior; no production logic changes, only test updates.
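The aggregation behavior described in these two commits (no early escalation, "category:<name>" reasons, deduplicated and sorted outputs) might look roughly like the sketch below. The `aggregate` function and the dict-based rule shape are assumptions made for illustration, and the paths are hypothetical; the selector's real code is not shown in this thread.

```python
from typing import Dict, List


def aggregate(matched_rules: List[dict]) -> Dict[str, List[str]]:
    """Merge the selections of every matched category (hypothetical helper).

    Nothing here escalates to the full lane based on how many categories
    matched; the function only unions, deduplicates, and sorts.
    """
    pytest_paths = sorted({p for r in matched_rules for p in r["pytest_paths"]})
    scripts = sorted({s for r in matched_rules for s in r["functional_scripts"]})
    reasons = [f"category:{r['name']}" for r in matched_rules]
    return {
        "pytest_paths": pytest_paths,
        "functional_scripts": scripts,
        "reasons": reasons,
    }


# Three matched categories with overlapping, hypothetical paths.
matched = [
    {"name": "core", "pytest_paths": ["tests/test_b.py", "tests/test_a.py"],
     "functional_scripts": []},
    {"name": "multianimal", "pytest_paths": ["tests/test_a.py"],
     "functional_scripts": ["examples/script_x.py"]},
    {"name": "3d", "pytest_paths": ["tests/test_c.py"],
     "functional_scripts": ["examples/script_x.py"]},
]
result = aggregate(matched)
```

With three categories matched, the result still carries a targeted selection: each path appears once, in sorted order.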
Remove legacy example testscript.py and testscript_multianimal.py and switch to engine-specific example names (testscript_tensorflow_single_animal.py, testscript_tensorflow_multi_animal.py, etc.). Update GitHub Actions workflow to run the renamed example scripts. Adjust documentation to reference the new TensorFlow example names and comment out the old TestScript outputs section. Update tests and test selector config: change expected functional script paths in tests, and reorganize CATEGORY_RULES to add separate categories for TensorFlow, PyTorch, 3D, and related functional scripts.
Also undo the problematic .gitignore patterns that excluded the functional test scripts.

@C-Achard changed the title from "Implement intelligent test selection for github actions" to "Implement intelligent test selection in CI"

Mar 13, 2026

@deruyter92

Introduce a validated CategoryRule pydantic model and validation helpers in tools/test_selector_config.py, convert CATEGORY_RULES to a list of CategoryRule instances, and expose CATEGORY_RULE_BY_NAME. Add path/field validators for rule names, matchers, pytest paths and functional scripts. Update tools/test_selector.py to import CATEGORY_RULE_BY_NAME and to access rule data via attributes (e.g. .name, .match_any, .pytest_paths, .functional_scripts) instead of dict lookups, and remove the file-local CATEGORY_RULE_BY_NAME construction. This refactors rule representation for stronger validation and safer attribute access throughout the selector logic.
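A minimal sketch of the validated rule shape follows. The commit uses a pydantic model; this stand-in uses a stdlib dataclass so it runs without extra dependencies. The field names come from the commit message, while the validation details, the `_check_repo_relative` and `index_by_name` helpers, and the example paths are assumptions.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath
from typing import Callable, Dict, List

Matcher = Callable[[str], bool]


def _check_repo_relative(path: str) -> None:
    """Reject empty, absolute, or parent-escaping paths (assumed rule)."""
    parts = PurePosixPath(path).parts
    if not path or path.startswith("/") or ".." in parts:
        raise ValueError(f"not a repo-relative path: {path!r}")


@dataclass(frozen=True)
class CategoryRule:
    name: str
    match_any: List[Matcher]
    pytest_paths: List[str] = field(default_factory=list)
    functional_scripts: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not self.name.strip():
            raise ValueError("rule name must be non-empty")
        if not self.match_any or not all(callable(m) for m in self.match_any):
            raise ValueError(f"rule {self.name!r} needs callable matchers")
        for path in [*self.pytest_paths, *self.functional_scripts]:
            _check_repo_relative(path)


def index_by_name(rules: List[CategoryRule]) -> Dict[str, CategoryRule]:
    """Build the name lookup and reject duplicate rule names."""
    by_name: Dict[str, CategoryRule] = {}
    for rule in rules:
        if rule.name in by_name:
            raise ValueError(f"duplicate rule name: {rule.name!r}")
        by_name[rule.name] = rule
    return by_name


CATEGORY_RULES = [
    CategoryRule(name="docs", match_any=[lambda p: p.startswith("docs/")]),
    CategoryRule(
        name="multianimal",
        match_any=[lambda p: "multianimal" in p],
        pytest_paths=["tests/test_example.py"],  # hypothetical path
    ),
]
CATEGORY_RULE_BY_NAME = index_by_name(CATEGORY_RULES)
```

Selector code can then read `CATEGORY_RULE_BY_NAME["docs"].match_any` as an attribute rather than indexing a dict, so a misspelled field fails immediately instead of silently returning nothing.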
Introduce an assert_lanes helper to DRY repeated lane assertions and add imports (Path, ValidationError). Replace inline lane checks across selector tests with the helper for clarity. Add extensive tests validating CategoryRule and CATEGORY_RULES: pydantic validation for empty/invalid names, empty or non-callable matchers, invalid repo-relative paths, duplicate rule names, and cross-rule/config sanity checks (typed models, unique names, required rules present, matchers exist). Also verify that pytest_paths and functional_scripts referenced in CATEGORY_RULES exist in the repository. These changes improve test maintainability and ensure selector configuration correctness.
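One plausible shape for such a helper is sketched below. The keyword-argument signature and the `Lanes` stand-in class are assumptions; only the helper's name and the four lane names appear in this thread.

```python
from dataclasses import dataclass


@dataclass
class Lanes:
    """Stand-in for the selector's lane flags (assumed structure)."""
    skip: bool = False
    docs: bool = False
    fast: bool = False
    full: bool = False


def assert_lanes(lanes: Lanes, **expected: bool) -> None:
    """Check each named lane against its expected boolean value."""
    for lane, want in expected.items():
        got = getattr(lanes, lane)
        assert got is want, f"lane {lane!r}: expected {want}, got {got}"


# One call replaces four inline assertions in a test body.
assert_lanes(Lanes(fast=True), skip=False, docs=False, fast=True, full=False)
```

Reporting the offending lane name in the failure message keeps test output readable when several lanes are checked at once.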

@C-Achard

@C-Achard marked this pull request as ready for review

March 16, 2026 09:49

@C-Achard

Introduce _details_open(summary, add_blank=True) and _details_close() helpers and replace repeated literal "<details>..." / "</details>" strings in _render_decision_markdown with those helpers. This reduces duplication, clarifies structure, and keeps blank-line handling consistent; no change to rendered content.

@C-Achard

Revamp the tools/README.md section for the test selector: replace the single `plan` concept with orthogonal workflow `lanes` (skip, docs, fast, full) and document emitted fields (lanes, pytest_paths, functional_scripts, provenance, selected_workflows, lane_reasons, diff_mode, schema_version). Add design principles (fail-safe, deterministic, validated, auditable), explain rule configuration and predicate helpers, enumerate diff modes and report artifacts, show CLI examples (including --write-summary and manual --base-sha/--head-sha), and add testing and troubleshooting guidance. These changes clarify routing behavior, auditing, and how to debug unexpected full-suite selections.
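To make the emitted fields concrete, the sketch below builds a report with the field names listed in this commit. Every value, including the nesting of `provenance` and `lane_reasons`, the workflow name, the diff mode string, and the paths, is a hypothetical placeholder rather than real selector output.

```python
import json

# Field names are taken from the commit message; all values are illustrative.
report = {
    "schema_version": 1,                       # hypothetical value
    "diff_mode": "hypothetical-mode",          # real mode names are not listed here
    "lanes": {"skip": False, "docs": False, "fast": True, "full": False},
    "lane_reasons": {"fast": ["category:multianimal"]},
    "pytest_paths": ["tests/test_example.py"],
    "functional_scripts": [],
    "provenance": {"tests/test_example.py": ["multianimal"]},
    "selected_workflows": ["hypothetical-workflow"],
}

serialized = json.dumps(report, indent=2, sort_keys=True)
```

Sorting keys when serializing keeps the report byte-for-byte stable across runs, which is convenient when comparing artifacts between two selector invocations.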
Update GitHub Actions workflows to use actions/upload-artifact@v6. Changed the upload-artifact version in .github/workflows/build-book.yml (Upload built site artifact) and .github/workflows/intelligent-testing.yml (Upload selector report) to pick up the newer action release.
Delete the "Design principles" subsection from tools/README.md, removing bullets describing Fail-safe, Deterministic, Strictly validated, and Auditable. This change cleans up outdated or redundant documentation and does not modify code or behavior.

deruyter92

Update GitHub Actions workflow to install editable package with the "tf" extras (pip install -e ".[tf]" --group dev). Improve the targeted pytest step by adding an explicit check for empty PYTEST_PATHS: if no paths are selected the job prints a message and exits successfully instead of running pytest across the repo. Also includes minor formatting/whitespace cleanup in the step script.
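The workflow step itself is a shell script, but the guard's logic can be sketched in Python. The sketch assumes `PYTEST_PATHS` is a whitespace-separated string; `build_pytest_command` is a hypothetical name.

```python
from typing import List, Optional


def build_pytest_command(pytest_paths_env: str) -> Optional[List[str]]:
    """Return the pytest command to run, or None when nothing was selected.

    Without this guard, an empty selection would invoke bare `pytest`,
    which collects and runs tests across the whole repository.
    """
    paths = pytest_paths_env.split()
    if not paths:
        return None
    return ["python", "-m", "pytest", *paths]


nothing_selected = build_pytest_command("   ")
targeted = build_pytest_command("tests/test_a.py tests/test_b.py")
```

A caller would print a short message and exit with status 0 when the result is `None`, so the job passes without running any tests.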
Add three tests to tests/tools/test_selector/test_selector_decision.py to cover selection behavior:

- test_lint_only_changes_select_skip_lane: ensures lint-only changes select the skip lane and produce no pytest or functional script selections.
- test_validate_selected_paths_escalates_to_full_on_missing: verifies missing selected pytest/script files cause escalation from fast to full, clear selections, and record missing-path reasons.
- test_validate_selected_paths_keeps_fast_when_paths_exist: ensures fast lane remains when selected paths are present.
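The escalation behavior exercised by the last two tests could be sketched as follows. The function signature, the dict-based lanes, and the `missing_path:` reason format are assumptions; only the behavior (escalate from fast to full, clear the selections, record the missing paths) is taken from the test descriptions.

```python
import tempfile
from pathlib import Path
from typing import Dict, List, Tuple

Selection = Tuple[Dict[str, bool], List[str], List[str], List[str]]


def validate_selected_paths(
    repo_root: str,
    lanes: Dict[str, bool],
    pytest_paths: List[str],
    functional_scripts: List[str],
) -> Selection:
    """Escalate from fast to full if any selected file is missing."""
    selected = [*pytest_paths, *functional_scripts]
    missing = sorted(p for p in selected if not (Path(repo_root) / p).exists())
    if not missing:
        return lanes, pytest_paths, functional_scripts, []
    escalated = {**lanes, "fast": False, "full": True}
    reasons = [f"missing_path:{p}" for p in missing]  # hypothetical format
    # Selections are cleared because the full lane runs everything anyway.
    return escalated, [], [], reasons


with tempfile.TemporaryDirectory() as root:
    (Path(root) / "tests").mkdir()
    (Path(root) / "tests" / "test_a.py").write_text("")
    fast = {"fast": True, "full": False}
    kept = validate_selected_paths(root, fast, ["tests/test_a.py"], [])
    escalated = validate_selected_paths(root, fast, ["tests/test_gone.py"], [])
```

Falling back to the full lane is the safe direction here: a stale path in the configuration costs CI time but cannot cause tests to be skipped silently.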

Also add a file header comment.
Remove strict check that res.functional_scripts == [] in tests/tools/test_selector/test_selector_decision.py and replace it with a commented note. The functional_scripts list can be non-empty and the test does not need to specify exact scripts, so this prevents brittle failures while keeping other expectations intact.
Update test selector config to broaden category coverage: add extra pytest paths and functional scripts to superanimal_modelzoo (including a new modelzoo test and an enabled example testscript plus commented TODOs), add additional pytest files for multianimal, reorganize example scripts between categories, and introduce a dedicated pose_estimation_tensorflow CategoryRule with several pytest paths (dataset augmentation, imgaug, predict, evaluate) and its functional scripts. Also include minor formatting/whitespace adjustments and TODO comments for follow-up cleanup.
Relax strict equality checks in tests/tools/test_selector/test_selector_decision.py: replace assertions that expected exactly ['multianimal'] with membership assertions that check 'multianimal' is present. This makes the test resilient to additional provenance labels or ordering changes.
Update references to the new TensorFlow test scripts across docs and examples. CONTRIBUTING.md now mentions testscript_tensorflow_single_animal.py and testscript_tensorflow_multi_animal.py. examples/test.sh runs the tensorflow single- and multi-animal test scripts. examples/testscript_3d.py and examples/testscript_pretrained_models.py were updated to reference the new single-animal testscript for error messages and basepath resolution. This aligns documentation and example runners with the renamed TensorFlow test scripts.