Autoscaler
The Crow CI Autoscaler dynamically provisions cloud servers to execute pipelines, then terminates them when idle.
sequenceDiagram
participant Queue as Build Queue
participant AS as Autoscaler
participant Cloud as Cloud Provider
participant Agent as Agent (VM)
participant Server as Crow Server
Queue->>AS: Pending build
AS->>Cloud: Provision VM
Cloud->>Agent: VM ready
Agent->>Server: Register & connect
Agent->>Agent: Execute pipeline
Note over AS,Agent: Idle timeout
AS->>Cloud: Terminate VM
Supported Providers
| Provider | Configuration Reference |
|---|---|
| AWS | flags.go |
| Hetzner Cloud | flags.go |
| Linode | flags.go |
| Scaleway | flags.go |
| Vultr | flags.go |
Additional providers with a Go SDK can be added — contributions welcome!
- Deploy alongside the server: the autoscaler listens for build triggers.
- Configure the server connection: provide the server address and authentication tokens.
- Configure scaling limits: set min/max agents and workflows per agent.
- Configure gRPC: remote agents need a secure gRPC connection to the server.
- Configure the cloud provider: set provider credentials and instance settings.
services:
  crow-autoscaler:
    image: codefloe.com/crowci/crow-autoscaler:<version>
    restart: always
    depends_on:
      - crow-server
    environment:
      # Server connection
      - CROW_SERVER=crow-server:9000
      - CROW_TOKEN=${CROW_TOKEN} # Admin API token
      - CROW_AUTOSCALER_TOKEN=${CROW_AUTOSCALER_TOKEN}
      # Scaling limits
      - CROW_MIN_AGENTS=0
      - CROW_MAX_AGENTS=2
      - CROW_WORKFLOWS_PER_AGENT=5
      # gRPC (for remote agents)
      - CROW_GRPC_ADDR=grpc.crow.example.com
      - CROW_GRPC_SECURE=true
      # Timeouts
      - CROW_AGENT_IDLE_TIMEOUT=10m
      - CROW_AGENT_SERVER_CONNECTION_TIMEOUT=10m
      # Cloud provider (Hetzner example)
      - CROW_PROVIDER=hetznercloud
      - CROW_HETZNERCLOUD_API_TOKEN=${HETZNER_TOKEN}
      - CROW_HETZNERCLOUD_LOCATION=fsn1
      - CROW_HETZNERCLOUD_SERVER_TYPE=cax41
      - CROW_HETZNERCLOUD_IMAGE=ubuntu-24.04
      - CROW_HETZNERCLOUD_NETWORKS=my-network
      - CROW_HETZNERCLOUD_SSH_KEYS=my-key
      - CROW_HETZNERCLOUD_FIREWALLS=my-firewall
      # Agent image (optional; auto-detected from server version if omitted)
      # - CROW_AGENT_IMAGE=codefloe.com/crowci/crow-agent:v5.3.2
      # Optional: agent environment
      - CROW_AGENT_ENV=CROW_LOG_LEVEL=debug,CROW_HEALTHCHECK=false
Configuration Reference
Server Connection
| Variable | Description |
|---|---|
| CROW_SERVER | Server address (internal or public URL) |
| CROW_TOKEN | Admin API token for agent management |
| CROW_AUTOSCALER_TOKEN | Registration token for autoscaler |
Scaling Limits
| Variable | Default | Description |
|---|---|---|
| CROW_MIN_AGENTS | 0 | Minimum agents always running |
| CROW_MAX_AGENTS | 1 | Maximum concurrent agents |
| CROW_WORKFLOWS_PER_AGENT | 1 | Parallel workflows per agent |
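To make the interplay of these three limits concrete, here is a minimal sketch of how a target agent count could be derived from the number of pending workflows. The function name `target_agents` and the ceiling-division rule are assumptions made for illustration; the autoscaler's actual sizing logic is not described here and may differ.

```python
import math


def target_agents(
    pending_workflows: int,
    min_agents: int = 0,
    max_agents: int = 1,
    workflows_per_agent: int = 1,
) -> int:
    """Hypothetical sizing rule: cover the queue, then clamp to [min, max]."""
    if workflows_per_agent < 1:
        raise ValueError("workflows_per_agent must be at least 1")
    if min_agents > max_agents:
        raise ValueError("min_agents must not exceed max_agents")
    # Agents needed if each agent runs workflows_per_agent workflows in parallel.
    needed = math.ceil(max(pending_workflows, 0) / workflows_per_agent)
    # Never drop below the always-on minimum, never exceed the configured cap.
    return max(min_agents, min(max_agents, needed))
```

Under this rule, with CROW_MIN_AGENTS=0, CROW_MAX_AGENTS=2, and CROW_WORKFLOWS_PER_AGENT=5, twelve pending workflows would call for three agents but the result is capped at two.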
Timeouts
| Variable | Default | Description |
|---|---|---|
| CROW_AGENT_IDLE_TIMEOUT | 10m | Time before idle agent is terminated |
| CROW_AGENT_SERVER_CONNECTION_TIMEOUT | 10m | Max time without server connection |
gRPC
Remote agents require secure gRPC to connect back to the server.
| Variable | Description |
|---|---|
| CROW_GRPC_ADDR | Public gRPC address (no protocol prefix) |
| CROW_GRPC_SECURE | Set true for TLS connection |
Agent Image
| Variable | Default | Description |
|---|---|---|
| CROW_AGENT_IMAGE | auto | Container image for spawned agents |
When CROW_AGENT_IMAGE is not set, the autoscaler queries the Crow server’s /version endpoint and uses the matching agent image automatically — for example, if the server reports version v5.3.2, the autoscaler uses codefloe.com/crowci/crow-agent:v5.3.2.
Set this variable explicitly only if you need to pin a specific agent version or use a custom image.
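The fallback described above can be sketched as a small pure function. This is an illustration only: `resolve_agent_image` is a hypothetical name, and the HTTP call to the /version endpoint is left out because its response format is not described here. The sketch assumes the version string has already been obtained.

```python
from typing import Optional

# Repository taken from the example above; the default tag mirrors the server version.
AGENT_REPOSITORY = "codefloe.com/crowci/crow-agent"


def resolve_agent_image(configured_image: Optional[str], server_version: str) -> str:
    """Return the pinned image if one is set, otherwise derive it from the server version."""
    if configured_image:
        # An explicit CROW_AGENT_IMAGE always wins (pinned version or custom image).
        return configured_image
    version = server_version.strip()
    if not version:
        raise ValueError("server version is empty; set CROW_AGENT_IMAGE explicitly")
    return f"{AGENT_REPOSITORY}:{version}"
```

Calling `resolve_agent_image(None, "v5.3.2")` yields `codefloe.com/crowci/crow-agent:v5.3.2`, matching the example in the text.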
Agent Configuration
| Variable | Default | Description |
|---|---|---|
| CROW_AGENT_ENV | none | Environment variables passed to spawned agents (comma-separated KEY=value pairs) |
| CROW_FILTER_LABELS | none | Only count queued tasks matching this label (key=value) toward scaling decisions. Required for multiple autoscalers. |
Example agent environment:
CROW_AGENT_ENV=CROW_AGENT_LABELS=tier=heavy,CROW_LOG_LEVEL=debug,CROW_HEALTHCHECK=false
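Because a value such as `tier=heavy` itself contains an equals sign, each pair has to be split at its first `=` only. The sketch below shows one way to read the format; `parse_agent_env` is a hypothetical helper written for this page, not the autoscaler's own parser.

```python
def parse_agent_env(raw: str) -> dict[str, str]:
    """Split 'K1=v1,K2=v2' into a dict, cutting each pair at its first '='."""
    env: dict[str, str] = {}
    for pair in raw.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate empty segments such as a trailing comma
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected KEY=value, got {pair!r}")
        env[key] = value  # the value keeps any further '=' characters
    return env
```

For the example line above, this returns `CROW_AGENT_LABELS` mapped to `tier=heavy`, `CROW_LOG_LEVEL` to `debug`, and `CROW_HEALTHCHECK` to `false`. A value that itself contains a comma cannot be expressed with this simple split; how the autoscaler treats that case is not covered here.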
Remote agents need a TLS-secured gRPC endpoint. Configure your reverse proxy to forward to the server’s gRPC port (default: 9000).
Nginx:
server {
    listen 443 ssl http2;
    server_name grpc.crow.example.com;
    ssl_certificate /etc/ssl/certs/crow.crt;
    ssl_certificate_key /etc/ssl/private/crow.key;
    location / {
        grpc_pass grpc://crow-server:9000;
    }
}

Caddy:
grpc.crow.example.com {
    reverse_proxy h2c://crow-server:9000
}

Traefik:
http:
  routers:
    crow-grpc:
      rule: Host(`grpc.crow.example.com`)
      service: crow-server
      tls:
        certResolver: letsencrypt
  services:
    crow-server:
      loadBalancer:
        servers:
          - url: h2c://crow-server:9000
Combine static agents (always-on) with autoscaled agents (on-demand) for cost efficiency.
| Agent Type | Use Case |
|---|---|
| Static | Fast, lightweight builds; always available |
| Autoscaled | Resource-intensive builds; cost-optimized |
Example: Run a small static agent alongside the server for quick jobs. The autoscaler provisions powerful VMs only when the static agent is at capacity.
Use labels to route workflows:
Static agent configuration:
CROW_AGENT_LABELS=tier=standard
A workflow targets autoscaled agents by declaring, under labels: in its .crow.yaml, a label that only those agents carry (for example, tier: heavy).
The autoscaler checks for available agents before provisioning. If a static agent can handle the workload, no new VM is created.
Multiple Autoscalers
A single Crow server can use multiple autoscalers simultaneously. Each autoscaler runs as an independent process with its own registration token, provider configuration, and scaling limits.
Why Use Multiple Autoscalers
Multiple autoscalers let you target different cloud providers from a single server, for example, Hetzner for Linux builds and Azure for Windows builds.
You can also provision different instance sizes, using small VMs for unit tests and large VMs for integration tests.
Multi-region setups are possible too, placing agents in eu-west for European teams and us-east for US teams.
For cost optimization, non-urgent work can run on spot or preemptible instances while time-sensitive builds use on-demand capacity.
Finally, you can serve different architectures by provisioning amd64 agents from one provider and arm64 agents from another.
Each autoscaler reports its capabilities to the server via heartbeat. The server uses two mechanisms to route workflows to the right autoscaler:
- Agent labels (CROW_AGENT_LABELS inside CROW_AGENT_ENV): the autoscaler reports these to the server, which uses them to determine whether the autoscaler can provision agents for a given workflow. A workflow’s labels: must match an autoscaler’s reported labels for that autoscaler to handle it.
- Filter labels (CROW_FILTER_LABELS): the autoscaler uses these locally to decide which queued tasks count toward its scaling decisions. Without this, every autoscaler would see all pending tasks and try to scale up for work meant for a different autoscaler.
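The filtering step can be illustrated with a short sketch. It assumes each queued task exposes its workflow labels as a plain dictionary and that the filter is a single key=value string; both function names are hypothetical and the real matching rules may be richer.

```python
from typing import Optional


def matches_filter(task_labels: dict[str, str], filter_label: Optional[str]) -> bool:
    """Return True if a queued task should count toward this autoscaler's scaling."""
    if not filter_label:
        return True  # no filter configured: every queued task counts
    key, sep, value = filter_label.partition("=")
    if not sep or not key:
        raise ValueError(f"expected key=value, got {filter_label!r}")
    return task_labels.get(key) == value


def pending_for_autoscaler(queue: list[dict[str, str]], filter_label: Optional[str]) -> int:
    """Count the queued tasks that this autoscaler is responsible for."""
    return sum(1 for labels in queue if matches_filter(labels, filter_label))
```

With a queue holding one `tier: standard` task, two `tier: heavy` tasks, and one unlabeled task, an autoscaler filtering on `tier=heavy` would count two tasks, while an autoscaler without a filter would count all four. The second case is the over-scaling problem the filter is meant to prevent.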
Example: Dual-Provider Setup
Register two autoscalers on the server and note their tokens.
services:
  # Small instances for standard builds
  autoscaler-standard:
    image: codefloe.com/crowci/crow-autoscaler:<version>
    restart: always
    environment:
      - CROW_SERVER=crow-server:9000
      - CROW_TOKEN=${CROW_TOKEN}
      - CROW_AUTOSCALER_TOKEN=${AUTOSCALER_TOKEN_STANDARD}
      - CROW_GRPC_ADDR=grpc.crow.example.com
      - CROW_GRPC_SECURE=true
      - CROW_MAX_AGENTS=4
      - CROW_WORKFLOWS_PER_AGENT=3
      - CROW_FILTER_LABELS=tier=standard
      - CROW_AGENT_ENV=CROW_AGENT_LABELS=tier=standard
      - CROW_PROVIDER=hetznercloud
      - CROW_HETZNERCLOUD_API_TOKEN=${HETZNER_TOKEN}
      - CROW_HETZNERCLOUD_SERVER_TYPE=cax21
      # ... other Hetzner settings
  # Large instances for heavy builds
  autoscaler-heavy:
    image: codefloe.com/crowci/crow-autoscaler:<version>
    restart: always
    environment:
      - CROW_SERVER=crow-server:9000
      - CROW_TOKEN=${CROW_TOKEN}
      - CROW_AUTOSCALER_TOKEN=${AUTOSCALER_TOKEN_HEAVY}
      - CROW_GRPC_ADDR=grpc.crow.example.com
      - CROW_GRPC_SECURE=true
      - CROW_MAX_AGENTS=2
      - CROW_WORKFLOWS_PER_AGENT=1
      - CROW_FILTER_LABELS=tier=heavy
      - CROW_AGENT_ENV=CROW_AGENT_LABELS=tier=heavy
      - CROW_PROVIDER=aws
      - CROW_AWS_INSTANCE_TYPE=c5.2xlarge
      # ... other AWS settings
Workflows select their tier with labels:
# .crow.yaml (lightweight job)
labels:
  tier: standard
steps:
  - name: lint
    image: golangci/golangci-lint
    commands:
      - golangci-lint run

# .crow.yaml (resource-intensive job)
labels:
  tier: heavy
steps:
  - name: integration
    image: golang
    commands:
      - go test -race -count=1 ./...