A Rails application for syncing data from CommCare HQ to external PostgreSQL databases. This tool enables automated data forwarding and scheduled syncing of CommCare case data to destination databases with support for sensitive field hashing and custom table mappings.
Overview
CommCare Utils acts as a bridge between CommCare HQ and external databases, providing:
- Real-time data forwarding via webhooks from CommCare
- Scheduled batch syncing of case data from CommCare export tables
- Sensitive field protection through one-way hashing
- Dynamic schema management with automatic table and column creation
- Token-based API authentication for secure webhook endpoints
Getting Started
Prerequisites
- Ruby 3.4.3
- PostgreSQL
- CommCare HQ account with API access
Installation
- Clone the repository:
git clone <repository-url> cd commcare-utils
- Install dependencies:
- Set up the database:
bin/rails db:create bin/rails db:migrate
- Configure environment variables (create a
.envfile), see.env.examplefor required variables
Running Locally
For local development, use Procfile.dev which includes the web server, CSS compiler, and background worker:
This starts:
- web: Puma web server on port 3000
- css: DartSass CSS compiler in watch mode
- worker: GoodJob background worker for processing async jobs
Deployment
The application is configured for Heroku deployment using the Procfile:
The Procfile includes:
- web: Puma web server
- worker: GoodJob background worker
- release: Automatic database migrations on deploy
Data Models
Destination
Represents a target database and CommCare project configuration.
Key attributes:
name- Friendly name for the destinationproject_name- CommCare project namedatabase_url- PostgreSQL connection string (encrypted)commcare_username- CommCare API usernamecommcare_password- CommCare API key (encrypted)
Relationships:
has_many :destination_sources- Data sources to synchas_many :destination_tokens- API tokens for webhook authentication
Methods:
handle_forwarded_case(case_id)- Process a forwarded case from CommCare webhookcommcare_client- Returns configured CommCare API client
DestinationSource
Represents a specific case type or data export from CommCare that should be synced to a table.
Key attributes:
name- Friendly name for the sourcecase_type- CommCare case type (for case-based syncing)url- CommCare export URL for table-based syncingtable_name- Target table name in destination databasekey_column- Primary key column for upsertssensitive_fields- Comma-separated list of fields to hashscheduled_sync- Enable/disable scheduled batch syncing
Relationships:
belongs_to :destination
Methods:
sync_source- Sync data from CommCare export URL to destination tablesync_case(case_data, owner, opened_by_user, closed_by_user)- Sync a single case with metadata
Includes:
TableWritableconcern - Provides database writing and table management functionality
DestinationToken
API tokens for authenticating webhook requests from CommCare.
Key attributes:
token- UUID token (auto-generated)last_accessed_at- Timestamp of last use
Relationships:
belongs_to :destination
Methods:
self.authenticate(token_string)- Validate and return tokentouch_last_accessed- Update last access timestamp (rate-limited to 1 minute)
User
Application users authenticated via passwordless email login.
Key attributes:
email- User email address (unique)
Authentication:
Uses the passwordless gem for magic link email authentication.
Background Jobs
SyncSourceJob
Syncs a single DestinationSource from CommCare to the destination database.
Usage:
SyncSourceJob.perform_later(destination_source_id)
SyncAllSourcesJob
Queues sync jobs for all DestinationSource records with scheduled_sync: true.
Usage:
SyncAllSourcesJob.perform_later
Can be scheduled via cron or GoodJob's built-in scheduler.
DataForwardJob
Processes real-time case data forwarded from CommCare webhooks.
Usage:
DataForwardJob.perform_later(destination_token_id, payload)
Supports both XML and JSON payload formats from CommCare.
Core Components
TableWritable (Concern)
Provides database table management and data syncing functionality for DestinationSource.
Key features:
- Dynamic table and column creation
- Type inference and coercion
- Upsert operations with conflict resolution
- Sensitive field hashing (SHA256, truncated to 15 chars)
- Automatic cleanup of stale records
- Streaming HTML table parsing from CommCare exports
Methods:
ensure_table(columns, drop_columns)- Create/update table schemaupsert_row(columns, values, time)- Insert or update a rowdelete_rows_updated_before(time)- Remove stale recordsparse_source(&block)- Stream and parse CommCare export HTML
DatabaseWriter
Low-level PostgreSQL database operations for destination databases.
Features:
- Direct PostgreSQL connection management
- Table and column DDL operations
- Type mapping and conversion
- Parameterized queries for safety
Type mapping:
string→textinteger→integerfloat→decimaldatetime→timestamp without time zone
CommcareClient
HTTP client for CommCare HQ API.
Methods:
get_case(case_id)- Fetch case dataget_user(user_id)- Fetch user dataget_location(location_id)- Fetch location data
Architecture
Data Flow
-
Webhook Flow (Real-time):
- CommCare sends case update → Webhook endpoint
- Validates
DestinationToken - Queues
DataForwardJob - Job fetches full case data via API
- Syncs to destination database via
sync_case
-
Scheduled Sync Flow (Batch):
SyncAllSourcesJobruns on schedule- Queues
SyncSourceJobfor each enabled source - Job streams HTML export from CommCare
- Parses and upserts rows to destination database
- Removes stale records
Security Features
- Encrypted credentials (database URLs, CommCare passwords)
- Token-based webhook authentication
- Sensitive field hashing (one-way SHA256)
- Parameterized SQL queries
- HTTP success validation before parsing
Tech Stack
- Framework: Rails 8.1
- Ruby: 3.4.3
- Database: PostgreSQL
- Job Queue: GoodJob (database-backed)
- Authentication: Passwordless (magic link email)
- Frontend: Hotwire (Turbo + Stimulus), Bootstrap 5
- CSS: DartSass
- Deployment: Heroku, Kamal (Docker)