npow/github-roast: Detect PR farming and rank OSS contributors for program admissions in minutes

Rank and vet OSS contributors in minutes — detects PR farming, analyzes real engagement, and gives you a scored report.

The problem

Evaluating GitHub profiles for GSoC, hiring, or program admissions takes hours per candidate. Applicants game the process: submitting dozens of trivial PRs across unrelated repos in the weeks before a deadline inflates their apparent activity without demonstrating any real skill. Eyeballing PR counts and commit graphs doesn't catch this, and neither do automated tools that measure quantity rather than quality.

Quick start

# Install
git clone https://github.com/npow/github-roast && cd github-roast && uv sync

# Analyze any GitHub user
github-roast torvalds

# Rank a labeled cohort in any repo
github-roast --repo owner/repo --label your-label --output report.md

Install

git clone https://github.com/npow/github-roast
cd github-roast
uv sync

Requires the gh CLI authenticated with a GitHub account (gh auth login) and agent-relay running locally.

Usage

Analyze any GitHub user — no repo required:

github-roast torvalds
github-roast torvalds --format json

Deep-dive with target repo — adds in-depth PR analysis for a specific repo:

github-roast npow --repo owner/repo

Bulk ranking — rank all contributors who opened a PR with a given label:

github-roast --repo owner/repo --label your-label --output report.md

Output includes a ranked table with merge rate, PRs/week, burst ratio, and trivial-PR rate — the key farming signals — followed by detailed profiles with actual PR discussion excerpts.

Web UI — browser-based interface with live progress streaming:

uv run uvicorn webapp:app --reload
# Visit http://localhost:8000

How it works

For each contributor, github-roast:

1. Fetches the last 90 days of public GitHub events (commits, PRs, reviews, comments)
2. Pulls their top repos and reads actual README content, file trees, and language stats
3. Samples up to 12 cross-repo PRs (one per org) and fetches the actual discussion threads
4. Computes farming signals: merge rate, PRs/week, 90-day burst ratio, trivial-PR rate, reviewer engagement
5. Runs a per-PR LLM classification on target-repo PRs (substantive vs. manufactured)
6. Generates a holistic LLM assessment that leads with the computed signals, preventing the LLM from being fooled by high PR counts or long account age
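
The signal computation in the steps above can be illustrated with a minimal sketch. The README does not define the signals precisely, so everything here is an assumption for illustration: the `PullRequest` record and its fields are hypothetical, "burst ratio" is taken to be the share of in-window PRs opened in the most recent 14 days, and a "trivial" PR is taken to be one changing at most 5 lines. The actual definitions in github-roast may differ.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class PullRequest:
    # Hypothetical record; field names are illustrative, not github-roast's schema.
    created_at: datetime
    merged: bool
    lines_changed: int


def farming_signals(prs, now, window_days=90, burst_days=14, trivial_max_lines=5):
    """Compute illustrative farming signals over PRs opened in the last window_days.

    The burst and trivial thresholds are assumptions, not values from the tool.
    """
    window_start = now - timedelta(days=window_days)
    recent = [pr for pr in prs if window_start <= pr.created_at <= now]
    if not recent:
        return {"merge_rate": 0.0, "prs_per_week": 0.0,
                "burst_ratio": 0.0, "trivial_rate": 0.0}

    total = len(recent)
    burst_start = now - timedelta(days=burst_days)
    in_burst = sum(1 for pr in recent if pr.created_at >= burst_start)
    return {
        # Share of in-window PRs that were merged.
        "merge_rate": sum(1 for pr in recent if pr.merged) / total,
        # Average PRs opened per week across the whole window.
        "prs_per_week": total / (window_days / 7),
        # Share of in-window PRs opened in the final burst_days (assumed definition).
        "burst_ratio": in_burst / total,
        # Share of in-window PRs at or below the trivial-size threshold (assumed).
        "trivial_rate": sum(1 for pr in recent
                            if pr.lines_changed <= trivial_max_lines) / total,
    }


if __name__ == "__main__":
    now = datetime(2024, 6, 1, tzinfo=timezone.utc)
    sample = [
        PullRequest(now - timedelta(days=1), True, 2),
        PullRequest(now - timedelta(days=40), True, 200),
    ]
    print(farming_signals(sample, now))
```

Under these assumed definitions, a candidate with a high burst ratio and a high trivial rate looks like a deadline-driven farmer even when the merge rate is good, which is why the signals are read together rather than individually.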

Results are cached in SQLite (6h for GitHub API calls, 24h for LLM results) so re-runs are fast.
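
To make the caching behavior concrete, here is a minimal sketch of a SQLite-backed cache with per-lookup expiry. Only the two TTL values come from the README; the `TTLCache` class, the table layout, and the key format are hypothetical and are not taken from github-roast's source.

```python
import json
import sqlite3
import time

GITHUB_TTL = 6 * 3600   # 6 hours for GitHub API calls, per the README
LLM_TTL = 24 * 3600     # 24 hours for LLM results, per the README


class TTLCache:
    """Hypothetical key/value cache; the schema is an assumption for illustration."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS cache ("
            "key TEXT PRIMARY KEY, value TEXT NOT NULL, stored_at REAL NOT NULL)"
        )

    def get(self, key, ttl, now=None):
        """Return the cached value, or None if it is missing or older than ttl seconds."""
        now = time.time() if now is None else now
        row = self.db.execute(
            "SELECT value, stored_at FROM cache WHERE key = ?", (key,)
        ).fetchone()
        if row is None or now - row[1] > ttl:
            return None
        return json.loads(row[0])

    def set(self, key, value, now=None):
        """Store a JSON-serializable value, replacing any existing entry."""
        now = time.time() if now is None else now
        self.db.execute(
            "INSERT OR REPLACE INTO cache (key, value, stored_at) VALUES (?, ?, ?)",
            (key, json.dumps(value), now),
        )
        self.db.commit()


if __name__ == "__main__":
    cache = TTLCache()
    cache.set("github:events:torvalds", {"count": 3})
    print(cache.get("github:events:torvalds", GITHUB_TTL))
```

Passing the TTL at read time, rather than storing an expiry with each row, lets one table serve both the 6-hour and 24-hour policies.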

Configuration

| Env var | Default | Description |
| --- | --- | --- |
| `ANTHROPIC_BASE_URL` | `http://localhost:18082` | Anthropic API endpoint |
| `MAINTAINERS` | (empty) | Comma-separated usernames to exclude from cohort ranking |

The gh CLI handles GitHub auth — no tokens to manage.
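
As a rough illustration of how the two environment variables above might be consumed, the sketch below reads them with their documented defaults and filters a cohort. The function names `load_config` and `exclude_maintainers` are hypothetical, and the case-insensitive matching is an assumption based on GitHub usernames being case-insensitive, not something the README states.

```python
import os


def load_config(env=None):
    """Read settings from the environment, falling back to the documented defaults."""
    env = os.environ if env is None else env
    raw = env.get("MAINTAINERS", "")
    return {
        "anthropic_base_url": env.get("ANTHROPIC_BASE_URL", "http://localhost:18082"),
        # Split on commas, trim whitespace, drop empty entries.
        # Lowercasing is an assumption so that matching is case-insensitive.
        "maintainers": {name.strip().lower() for name in raw.split(",") if name.strip()},
    }


def exclude_maintainers(usernames, maintainers):
    """Drop maintainers from a cohort while preserving the original order."""
    return [user for user in usernames if user.lower() not in maintainers]


if __name__ == "__main__":
    config = load_config({"MAINTAINERS": "alice, Bob"})
    print(exclude_maintainers(["alice", "carol", "bob"], config["maintainers"]))
```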

Development

git clone https://github.com/npow/github-roast
cd github-roast
uv sync
gh auth login   # if not already authenticated
github-roast youruser

License

Apache 2.0 — see LICENSE