Read from Apache Iceberg to Dataflow

To read from Apache Iceberg to Dataflow, use the managed I/O connector.

Managed I/O supports the following capabilities for Apache Iceberg:

Catalogs
  • Hadoop
  • Hive
  • REST-based catalogs
  • BigQuery metastore (requires Apache Beam SDK 2.62.0 or later if not using Runner v2)
Read capabilities
  • Batch read
Write capabilities

For BigQuery tables for Apache Iceberg, use the BigQueryIO connector with the BigQuery Storage API. The table must already exist; dynamic table creation is not supported.

Dependencies

Add the following dependencies to your project:

Java

<dependency>
  <groupId>org.apache.beam</groupId>
  <artifactId>beam-sdks-java-managed</artifactId>
  <version>${beam.version}</version>
</dependency>

<dependency>
  <groupId>org.apache.beam</groupId>
  <artifactId>beam-sdks-java-io-iceberg</artifactId>
  <version>${beam.version}</version>
</dependency>

Example

The following example reads from an Apache Iceberg table and writes the data to text files.
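
As a minimal sketch of the read side of such a pipeline, the configuration map that the managed I/O connector takes could be built as shown here. It assumes a Hadoop catalog. The table name, catalog name, and warehouse path are placeholder values, and `IcebergReadConfig`, `buildReadConfig`, and `outputPath` are hypothetical names chosen for illustration. The pipeline wiring appears only in comments because it needs the Beam dependencies listed above on the classpath.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: builds the configuration map for a managed Iceberg read.
// The class and method names are hypothetical, not part of any Beam API.
class IcebergReadConfig {

  // Assembles the read configuration for a Hadoop catalog.
  // Nothing here contacts the catalog; it only builds nested maps.
  static Map<String, Object> buildReadConfig(
      String table, String catalogName, String warehouse) {
    // Properties that tell Iceberg which catalog implementation to use
    // and where the warehouse lives.
    Map<String, Object> catalogProperties = new LinkedHashMap<>();
    catalogProperties.put("type", "hadoop");
    catalogProperties.put("warehouse", warehouse);

    // Top-level configuration passed to the managed connector.
    Map<String, Object> config = new LinkedHashMap<>();
    config.put("table", table);
    config.put("catalog_name", catalogName);
    config.put("catalog_properties", Collections.unmodifiableMap(catalogProperties));
    return Collections.unmodifiableMap(config);
  }

  public static void main(String[] args) {
    // Placeholder values; replace them with your own table and bucket.
    Map<String, Object> config =
        buildReadConfig("db.table1", "local", "gs://example-bucket/warehouse");
    System.out.println(config);

    // With the Beam dependencies available, the map would be used roughly
    // like this (outputPath is a hypothetical variable):
    //
    //   pipeline
    //       .apply(Managed.read(Managed.ICEBERG).withConfig(config))
    //       .getSinglePCollection() // PCollection<Row>
    //       .apply(MapElements.into(TypeDescriptors.strings())
    //           .via((Row row) -> row.toString()))
    //       .apply(TextIO.write().to(outputPath).withSuffix(".txt"));
    //   pipeline.run().waitUntilFinish();
  }
}
```

For a Hive or REST-based catalog, the same structure applies, but the entries under `catalog_properties` differ; check the catalog's own documentation for the required keys.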

Last updated 2026-02-18 UTC.