Transformer Integrations

Transformer integrations are pre-load integrations that modify data as it flows from a source to a destination during a sync. They let you rename tables, remove sensitive columns, filter rows, flatten JSON, and more - without writing custom code or touching your source or destination configuration.

For an overview of how transformations fit into the broader CloudQuery pipeline (including post-load dbt transformations), see Transformations.

How Transformer Integrations Work

Transformers sit between a source and a destination in the sync pipeline:

  1. Data Interception: The transformer receives records after they leave the source integration.
  2. Transformation: It applies configured transformations to each record (schema changes, column modifications, row filtering, etc.).
  3. Data Delivery: The transformed records are sent to the destination integration for writing.

You wire a transformer into a destination by adding its name to the transformers list in the destination spec. The CLI chains them automatically.
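As a minimal sketch (repeating only the relevant keys from the full example in the Configuration section below), the destination side needs just the transformers key:

```yaml
kind: destination
spec:
  name: "postgresql"
  path: "cloudquery/postgresql"
  registry: "cloudquery"
  version: "v8.14.6"
  # Route records through the transformer named "basic" before writing
  transformers:
    - "basic"
  spec:
    connection_string: "postgresql://user:password@localhost:5432/db_name"
```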

Available Transformer Integrations

There are two official transformer integrations:

  • Basic: Rename tables, add or remove columns, obfuscate sensitive data, normalize casing, drop rows by value, and add timestamps or literal columns.
  • JSON Flattener: Flatten single-level JSON object fields into individually typed destination columns while preserving the original JSON column.

You can also browse all transformations (including post-load dbt packages) in the CloudQuery Hub.

Configuration

A transformer integration requires a kind: transformer spec in your configuration file, and the destination must reference it by name. Here’s a complete example that syncs AWS data through a Basic transformer into PostgreSQL:

kind: source
spec:
  name: "aws"
  path: "cloudquery/aws"
  registry: "cloudquery"
  version: "v33.18.0"
  destinations: ["postgresql"]
  tables: ["aws_s3_buckets", "aws_ec2_instances"]
---
kind: destination
spec:
  name: "postgresql"
  path: "cloudquery/postgresql"
  registry: "cloudquery"
  version: "v8.14.6"
  write_mode: "overwrite-delete-stale"
  migrate_mode: forced
  transformers:
    - "basic"
  spec:
    connection_string: "postgresql://user:password@localhost:5432/db_name"
---
kind: transformer
spec:
  name: "basic"
  path: "cloudquery/basic"
  registry: "cloudquery"
  version: "VERSION_TRANSFORMER_BASIC"
  spec:
    transformations:
      - kind: obfuscate_sensitive_columns
      - kind: change_table_names
        tables: ["*"]
        new_table_name_template: "cq_sync_{{.OldName}}"

The destination references "basic" by name. The CLI matches this to the transformer spec with name: "basic" and routes records through it before writing.

If you plan to modify the schema from a previous sync (e.g. renaming tables or removing columns), set migrate_mode: forced on the destination so the schema is recreated.

Transformer Spec Reference

Available options for the transformer integration spec object:

name

(string, required)

Name of the integration. Must be unique if you have multiple transformer integrations.

The name is how destinations reference this transformer. For example, if you have two configs for the Basic integration transforming data differently for two destinations, you could name them basic-1 and basic-2. In this case, the path option below must be used to specify the download path for the integration.
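As an illustration of that setup (the names and transformations here are examples, not requirements), two configs for the same integration might look like:

```yaml
kind: transformer
spec:
  name: "basic-1"
  path: "cloudquery/basic"  # path is required because the name no longer matches it
  registry: "cloudquery"
  version: "VERSION_TRANSFORMER_BASIC"
  spec:
    transformations:
      - kind: obfuscate_sensitive_columns
---
kind: transformer
spec:
  name: "basic-2"
  path: "cloudquery/basic"
  registry: "cloudquery"
  version: "VERSION_TRANSFORMER_BASIC"
  spec:
    transformations:
      - kind: change_table_names
        tables: ["*"]
        new_table_name_template: "cq_{{.OldName}}"
```

Each destination then lists "basic-1" or "basic-2" in its transformers array.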

registry

(string, optional, default: cloudquery, available: cloudquery, local, grpc, docker)

  • cloudquery: Download the integration from the official CloudQuery registry and execute it.
  • local: Execute the integration from a local filesystem path.
  • grpc: Connect to an already-running integration via gRPC (useful for debugging).
  • docker: Run the integration as a Docker container.
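For example, a sketch of the grpc registry used for debugging against an already-running transformer (the address is illustrative, and the version comment reflects an assumption rather than documented behavior):

```yaml
kind: transformer
spec:
  name: "basic"
  registry: "grpc"
  # For grpc, path is the address of the running integration, not a download path
  path: "localhost:7777"
  version: "v1.0.0"  # assumption: still present to satisfy the spec; nothing is downloaded over grpc
  spec:
    transformations:
      - kind: obfuscate_sensitive_columns
```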

path

(string, required)

Configures how to retrieve the integration. The format depends on the registry value.

  • For cloudquery registry: "cloudquery/<integration-name>" (e.g. cloudquery/basic).
  • For local registry: a filesystem path to the integration binary.
  • For grpc registry: the host and port of the running integration (e.g. localhost:7777).
  • For docker registry: the Docker image to run.

version

(string, required)

Must be a valid SemVer version in the form vMAJOR.MINOR.PATCH (e.g. v1.0.0). You can find official integration versions on the GitHub releases page or on each integration’s hub page.

spec

(object, optional)

Integration-specific configuration. See the documentation for each transformer (Basic and JSON Flattener) for its available options.

Common Use Cases

Renaming Tables

Add a prefix to all synced tables to avoid naming conflicts:

transformations:
  - kind: change_table_names
    tables: ["*"]
    new_table_name_template: "cq_{{.OldName}}"

Obfuscating Sensitive Data

Automatically redact all columns marked sensitive by the source, plus specific columns by name:

transformations:
  - kind: obfuscate_sensitive_columns
  - kind: obfuscate_columns
    tables: ["aws_iam_users"]
    columns: ["password_last_used"]

Dropping Rows

Filter out rows where a column matches a specific value:

transformations:
  - kind: drop_rows
    tables: ["aws_ec2_instances"]
    columns: ["region"]
    value: "us-west-1"

Removing Columns

Strip columns you don’t need in the destination:

transformations:
  - kind: remove_columns
    tables: ["aws_s3_buckets"]
    columns: ["policy"]

Transformations are applied sequentially. If you rename tables, subsequent transformations must use the new table names.
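To illustrate the ordering rule, in this sketch the second transformation must target the renamed table produced by the first:

```yaml
transformations:
  - kind: change_table_names
    tables: ["aws_s3_buckets"]
    new_table_name_template: "cq_{{.OldName}}"
  - kind: remove_columns
    tables: ["cq_aws_s3_buckets"]  # the new name; "aws_s3_buckets" would no longer match
    columns: ["policy"]
```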

Creating Custom Transformers

Need a transformation that doesn’t exist? Learn how to create your own transformer integration.

Next Steps