# Transformer Integrations
Transformer integrations are pre-load integrations that modify data as it flows from a source to a destination during a sync. They let you rename tables, remove sensitive columns, filter rows, flatten JSON, and more - without writing custom code or touching your source or destination configuration.
For an overview of how transformations fit into the broader CloudQuery pipeline (including post-load dbt transformations), see Transformations.
## How Transformer Integrations Work
Transformers sit between a source and a destination in the sync pipeline:
- Data Interception: The transformer receives records after they leave the source integration.
- Transformation: It applies configured transformations to each record (schema changes, column modifications, row filtering, etc.).
- Data Delivery: The transformed records are sent to the destination integration for writing.
You wire a transformer into a destination by adding its name to the `transformers` list in the destination spec. The CLI chains them automatically.
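As a minimal sketch (other destination options omitted), the wiring looks like this:

```yaml
kind: destination
spec:
  name: "postgresql"
  # ...other destination options...
  transformers:
    - "basic"   # must match the `name` of a `kind: transformer` spec
```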
## Available Transformer Integrations
There are two official transformer integrations:
| Transformer | Description |
|---|---|
| Basic | Rename tables, add or remove columns, obfuscate sensitive data, normalize casing, drop rows by value, and add timestamps or literal columns. |
| JSON Flattener | Flatten single-level JSON object fields into individually typed destination columns while preserving the original JSON column. |
You can also browse all transformations (including post-load dbt packages) in the CloudQuery Hub.
## Configuration
A transformer integration requires a `kind: transformer` spec in your configuration file, and the destination must reference it by name. Here's a complete example that syncs AWS data through a Basic transformer into PostgreSQL:
```yaml
kind: source
spec:
  name: "aws"
  path: "cloudquery/aws"
  registry: "cloudquery"
  version: "v33.18.0"
  destinations: ["postgresql"]
  tables: ["aws_s3_buckets", "aws_ec2_instances"]
---
kind: destination
spec:
  name: "postgresql"
  path: "cloudquery/postgresql"
  registry: "cloudquery"
  version: "v8.14.6"
  write_mode: "overwrite-delete-stale"
  migrate_mode: forced
  transformers:
    - "basic"
  spec:
    connection_string: "postgresql://user:password@localhost:5432/db_name"
---
kind: transformer
spec:
  name: "basic"
  path: "cloudquery/basic"
  registry: "cloudquery"
  version: "VERSION_TRANSFORMER_BASIC"
  spec:
    transformations:
      - kind: obfuscate_sensitive_columns
      - kind: change_table_names
        tables: ["*"]
        new_table_name_template: "cq_sync_{{.OldName}}"
```

The destination references "basic" by name. The CLI matches this to the transformer spec with `name: "basic"` and routes records through it before writing.
If you plan to modify the schema from a previous sync (e.g. renaming tables or removing columns), set `migrate_mode: forced` on the destination so the schema is recreated.
## Transformer Spec Reference
Available options for the transformer integration spec object:
### name
(string, required)
Name of the integration. Must be unique if you have multiple transformer integrations.
The name is how destinations reference this transformer. For example, if you have two configs for the Basic integration transforming data differently for two destinations, you could name them `basic-1` and `basic-2`. In this case, the `path` option below must be used to specify the download path for the integration.
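A sketch of that setup, with two uniquely named specs pointing at the same integration (the transformation choices here are illustrative):

```yaml
kind: transformer
spec:
  name: "basic-1"            # referenced by the first destination
  path: "cloudquery/basic"   # same integration path in both specs
  registry: "cloudquery"
  version: "VERSION_TRANSFORMER_BASIC"
  spec:
    transformations:
      - kind: obfuscate_sensitive_columns
---
kind: transformer
spec:
  name: "basic-2"            # referenced by the second destination
  path: "cloudquery/basic"
  registry: "cloudquery"
  version: "VERSION_TRANSFORMER_BASIC"
  spec:
    transformations:
      - kind: change_table_names
        tables: ["*"]
        new_table_name_template: "cq_{{.OldName}}"
```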
### registry
(string, optional, default: `cloudquery`, available: `cloudquery`, `local`, `grpc`, `docker`)
- `cloudquery`: Download the integration from the official CloudQuery registry and execute it.
- `local`: Execute the integration from a local filesystem path.
- `grpc`: Connect to an already-running integration via gRPC (useful for debugging).
- `docker`: Run the integration as a Docker container.
### path
(string, required)
Configures how to retrieve the integration. The format depends on the registry value.
- For the `cloudquery` registry: `"cloudquery/<integration-name>"` (e.g. `cloudquery/basic`).
- For the `local` registry: a filesystem path to the integration binary.
- For the `grpc` registry: the host and port of the running integration (e.g. `localhost:7777`).
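Putting the two options together, the `registry`/`path` pairings from the list above look like this inside a transformer spec (the local path is illustrative):

```yaml
# Official registry: download from CloudQuery
registry: "cloudquery"
path: "cloudquery/basic"

# Local binary: point at the integration executable on disk
# registry: "local"
# path: "/path/to/basic"

# Already-running integration: connect over gRPC (handy for debugging)
# registry: "grpc"
# path: "localhost:7777"
```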
### version
(string, required)
Must be a valid SemVer, e.g. `vMajor.Minor.Patch`. You can find official integration versions on the GitHub releases page or on each integration's hub page.
### spec
(object, optional)
Integration-specific configuration. See each transformer's documentation (Basic, JSON Flattener) for the available transformation options.
## Common Use Cases
### Renaming Tables
Add a prefix to all synced tables to avoid naming conflicts:
```yaml
transformations:
  - kind: change_table_names
    tables: ["*"]
    new_table_name_template: "cq_{{.OldName}}"
```

### Obfuscating Sensitive Data
Automatically redact all columns marked sensitive by the source, plus specific columns by name:
```yaml
transformations:
  - kind: obfuscate_sensitive_columns
  - kind: obfuscate_columns
    tables: ["aws_iam_users"]
    columns: ["password_last_used"]
```

### Dropping Rows
Filter out rows where a column matches a specific value:
```yaml
transformations:
  - kind: drop_rows
    tables: ["aws_ec2_instances"]
    columns: ["region"]
    value: "us-west-1"
```

### Removing Columns
Strip columns you don’t need in the destination:
```yaml
transformations:
  - kind: remove_columns
    tables: ["aws_s3_buckets"]
    columns: ["policy"]
```

Transformations are applied sequentially. If you rename tables, subsequent transformations must use the new table names.
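For example, a config that first adds a `cq_` prefix and then removes a column must target the renamed table:

```yaml
transformations:
  - kind: change_table_names
    tables: ["*"]
    new_table_name_template: "cq_{{.OldName}}"
  # This runs after the rename, so it must use the new table name:
  - kind: remove_columns
    tables: ["cq_aws_s3_buckets"]   # not "aws_s3_buckets"
    columns: ["policy"]
```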
## Creating Custom Transformers
Need a transformation that doesn’t exist? Learn how to create your own transformer integration.
## Next Steps
- Source Integrations - Configure data extraction from cloud providers and APIs
- Destination Integrations - Configure where transformed data is stored
- Configuration Guide - Learn how to combine sources, transformers, and destinations
- Transformations (Policies) - Use dbt and SQL transformations for analytics