Implement fast bazel target lookup with index caching by titusfortner · Pull Request #16974 · SeleniumHQ/selenium

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
Shell command injection

Description: The bazel:affected_targets task builds a shell command using backticks with
base_rev/head_rev derived from the user-supplied commit_range (git diff --name-only
#{base_rev} #{head_rev}), which can enable command injection if a malicious range is
provided (e.g., including shell metacharacters).
Rakefile [1658-1669]

Referred Code
task :affected_targets, %i[commit_range index_file] do |_task, args|
  range = args[:commit_range] || 'HEAD'
  index_file = args[:index_file] || 'build/bazel-test-target-index.json'
  base_rev, head_rev = if range.include?('..')
                         range.split('..', 2)
                       else
                         ["#{range}^", range]
                       end

  puts "Commit range: #{base_rev}..#{head_rev}"

  changed_files = `git diff --name-only #{base_rev} #{head_rev}`.split("\n").map(&:strip).reject(&:empty?)
Bazel query injection

Description: Bazel query strings are constructed by interpolating the repository file path (via rel)
into attr(srcs, '#{rel}', ...), which could permit Bazel query injection or query breakage
if a crafted filename contains quotes/special characters (e.g., foo' ) + kind(".*",
//...), especially when filenames are sourced from git diff.
Rakefile [1793-1806]

Referred Code
query = test_files.filter_map { |f|
  pkg = find_bazel_package(f)
  next if pkg.nil?

  # Bazel srcs often use paths relative to the package, not basenames.
  rel = f.sub(%r{\A#{Regexp.escape(pkg)}/}, '')
  "attr(srcs, '#{rel}', //#{pkg}:*)"
}.join(' + ')
return [] if query.empty?

targets = []
Bazel.execute('query', ['--output=label'], query) do |out|
  targets = out.lines.map(&:strip).select { |l| l.start_with?('//') }
end
Cache path traversal

Description: The workflow uses inputs.cache-name directly to form a filesystem path (${{
inputs.cache-name }}.gz) for cache restore, which could allow unintended file
overwrite/path traversal within the workspace if an untrusted caller supplies a value like
../somefile to a workflow_call-exposed input.
bazel.yml [108-114]

Referred Code
- name: Restore cache
  if: inputs.cache-name != ''
  uses: actions/cache/restore@v4
  with:
    path: ${{ inputs.cache-name }}.gz
    key: ${{ inputs.cache-name }}-
    restore-keys: ${{ inputs.cache-name }}-
Ticket Compliance
🎫 No ticket provided
  • Create ticket/issue
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

🔴
Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
Unhandled failures: The new tasks do not robustly handle failures from git/file/JSON/Bazel operations (e.g.,
missing/invalid refs or malformed index) and will raise or silently proceed without
actionable error context.

Referred Code
task :affected_targets, %i[commit_range index_file] do |_task, args|
  range = args[:commit_range] || 'HEAD'
  index_file = args[:index_file] || 'build/bazel-test-target-index.json'
  base_rev, head_rev = if range.include?('..')
                         range.split('..', 2)
                       else
                         ["#{range}^", range]
                       end

  puts "Commit range: #{base_rev}..#{head_rev}"

  changed_files = `git diff --name-only #{base_rev} #{head_rev}`.split("\n").map(&:strip).reject(&:empty?)
  puts "Changed files: #{changed_files.size}"

  targets = if index_file && File.exist?(index_file)
              affected_targets_with_index(changed_files, index_file)
            else
              puts 'No index found, using directory-based fallback'
              affected_targets_fallback(changed_files)
            end



 ... (clipped 130 lines)

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status:
Command injection risk: The commit_range argument is interpolated directly into a backtick shell command (git diff
--name-only #{base_rev} #{head_rev}) without validation/escaping, allowing potential
command injection if an attacker can influence the task arguments.

Referred Code
task :affected_targets, %i[commit_range index_file] do |_task, args|
  range = args[:commit_range] || 'HEAD'
  index_file = args[:index_file] || 'build/bazel-test-target-index.json'
  base_rev, head_rev = if range.include?('..')
                         range.split('..', 2)
                       else
                         ["#{range}^", range]
                       end

  puts "Commit range: #{base_rev}..#{head_rev}"

  changed_files = `git diff --name-only #{base_rev} #{head_rev}`.split("\n").map(&:strip).reject(&:empty?)
  puts "Changed files: #{changed_files.size}"

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status:
Unstructured logging: The new tasks emit multiple free-form puts logs (including target labels and file counts)
rather than structured logs, and it is unclear from the diff whether this output could
include sensitive repo/internal path information in some environments.

Referred Code
  puts "Commit range: #{base_rev}..#{head_rev}"

  changed_files = `git diff --name-only #{base_rev} #{head_rev}`.split("\n").map(&:strip).reject(&:empty?)
  puts "Changed files: #{changed_files.size}"

  targets = if index_file && File.exist?(index_file)
              affected_targets_with_index(changed_files, index_file)
            else
              puts 'No index found, using directory-based fallback'
              affected_targets_fallback(changed_files)
            end

  if targets.empty?
    puts 'No test targets affected'
    File.write('bazel-targets.txt', '')
  else
    puts "Found #{targets.size} affected test targets"
    File.write('bazel-targets.txt', targets.sort.join(' '))
    targets.sort.each { |t| puts t }
  end
end


 ... (clipped 42 lines)

Learn more about managing compliance generic rules or creating your own custom rules

  • Update
Compliance status legend 🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label