Read from Apache Iceberg to Dataflow
To read from Apache Iceberg to Dataflow, use the managed I/O connector.
Managed I/O supports the following capabilities for Apache Iceberg:
| Catalogs | Hadoop, Hive, REST-based catalogs, BigQuery metastore |
|---|---|
| Read capabilities | Batch read |
| Write capabilities | Batch write, Streaming write, Dynamic destinations, Dynamic table creation |
For BigQuery tables for Apache Iceberg, use the BigQueryIO connector with the BigQuery Storage API. The table must already exist; dynamic table creation is not supported.
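The following is a rough sketch of that alternative, not part of this page's Managed I/O sample. It assumes the beam-sdks-java-io-google-cloud-platform dependency (not listed in the dependencies below), and the project, dataset, and table names are placeholders; DIRECT_READ selects the BigQuery Storage Read API.

```java
import com.google.api.services.bigquery.model.TableRow;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.TypedRead.Method;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.values.PCollection;

public class BigQueryIcebergTableRead {
  public static void main(String[] args) {
    Pipeline pipeline = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

    // Read an existing BigQuery table for Apache Iceberg with the Storage Read API.
    // The table reference is a placeholder; the table must already exist.
    PCollection<TableRow> rows =
        pipeline.apply(
            "ReadIcebergManagedTable",
            BigQueryIO.readTableRows()
                .from("my-project:my_dataset.my_iceberg_table")
                // DIRECT_READ uses the BigQuery Storage Read API instead of an export-based read.
                .withMethod(Method.DIRECT_READ));

    pipeline.run().waitUntilFinish();
  }
}
```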
Dependencies
Add the following dependencies to your project:
Java
```xml
<dependency>
  <groupId>org.apache.beam</groupId>
  <artifactId>beam-sdks-java-managed</artifactId>
  <version>${beam.version}</version>
</dependency>
<dependency>
  <groupId>org.apache.beam</groupId>
  <artifactId>beam-sdks-java-io-iceberg</artifactId>
  <version>${beam.version}</version>
</dependency>
```
Example
The following example reads from an Apache Iceberg table and writes the data to text files.
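A minimal sketch of such a pipeline follows. It assumes a Hadoop catalog and a table with an id (int64) column and a name (string) column; the pipeline option names and the id:name output format are illustrative choices, not requirements of Managed I/O.

```java
import java.util.Map;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.TextIO;
import org.apache.beam.sdk.managed.Managed;
import org.apache.beam.sdk.options.Description;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.MapElements;
import org.apache.beam.sdk.values.PCollectionRowTuple;
import org.apache.beam.sdk.values.Row;
import org.apache.beam.sdk.values.TypeDescriptors;

public class ApacheIcebergRead {

  // Pipeline options for the Iceberg catalog, table, and output location.
  public interface Options extends PipelineOptions {
    @Description("The URI of the Apache Iceberg warehouse location")
    String getWarehouseLocation();

    void setWarehouseLocation(String value);

    @Description("The name of the Apache Iceberg catalog")
    String getCatalogName();

    void setCatalogName(String value);

    @Description("The name of the table to read from")
    String getTableName();

    void setTableName(String value);

    @Description("Path prefix for the output text files")
    String getOutputPath();

    void setOutputPath(String value);
  }

  public static void main(String[] args) {
    Options options = PipelineOptionsFactory.fromArgs(args).withValidation().as(Options.class);
    Pipeline pipeline = Pipeline.create(options);

    // Catalog properties. A Hadoop catalog is assumed here; substitute the
    // properties required by your catalog implementation.
    Map<String, Object> catalogProperties =
        Map.of(
            "type", "hadoop",
            "warehouse", options.getWarehouseLocation());

    // Managed I/O configuration for the Apache Iceberg source.
    Map<String, Object> config =
        Map.of(
            "table", options.getTableName(),
            "catalog_name", options.getCatalogName(),
            "catalog_properties", catalogProperties);

    // Read the table with Managed I/O, format each row, and write text files.
    PCollectionRowTuple.empty(pipeline)
        .apply(Managed.read(Managed.ICEBERG).withConfig(config))
        .getSinglePCollection()
        // Assumes the table has an 'id' (int64) column and a 'name' (string) column.
        .apply(
            MapElements.into(TypeDescriptors.strings())
                .via((Row row) ->
                    String.format("%d:%s", row.getInt64("id"), row.getString("name"))))
        .apply(
            TextIO.write()
                .to(options.getOutputPath())
                .withNumShards(1)
                .withSuffix(".txt"));

    pipeline.run().waitUntilFinish();
  }
}
```

To run the sketch on Dataflow, pass the standard Dataflow options (for example, --runner=DataflowRunner, --project, and --region) together with the custom options defined in the Options interface.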