Build an Apache Iceberg Lakehouse with BigLake
Build an open, managed and high-performance Iceberg lakehouse to enable advanced analytics and data science, with automated data management and built-in governance.
Features
Interoperable across transactional and analytical data
BigLake metastore is a serverless metastore for all your Iceberg tables. Engines such as Apache Spark, BigQuery, and third-party platforms can use it to create and manage tables, giving you a consistent view of your data and unified access controls. BigLake metastore now supports the Apache Iceberg REST catalog for easy integration with open source and third-party engines. Iceberg table data is also accessible in AlloyDB (Preview), providing interoperability between transactional and analytical platforms.
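As a rough illustration of what pointing an engine at an Iceberg REST catalog involves, the sketch below assembles the Spark properties that Apache Iceberg's Spark integration reads for a REST catalog. The catalog name `lakehouse`, the endpoint URI, and the warehouse path are placeholders rather than values from this page, and any authentication settings that BigLake requires are left out.

```python
def rest_catalog_spark_conf(catalog: str, uri: str, warehouse: str) -> dict[str, str]:
    """Return Spark properties that register an Iceberg REST catalog."""
    prefix = f"spark.sql.catalog.{catalog}"
    return {
        # Enables Iceberg's SQL extensions in the Spark session.
        "spark.sql.extensions": "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
        # Registers the catalog and selects the REST catalog implementation.
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.type": "rest",
        f"{prefix}.uri": uri,
        f"{prefix}.warehouse": warehouse,
    }


# Placeholder values (assumptions): substitute your real endpoint and bucket.
conf = rest_catalog_spark_conf(
    catalog="lakehouse",
    uri="https://example.invalid/iceberg/rest",
    warehouse="gs://example-bucket/warehouse",
)

# Print the properties in the form accepted by spark-submit.
for key, value in sorted(conf.items()):
    print(f"--conf {key}={value}")
```

Keeping the properties in a plain dictionary means the same values can be passed to `spark-submit` or applied to a session builder without duplicating them.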
Unified data management and governance
BigLake extends Cloud Storage management capabilities, enabling you to use Autoclass for efficient cold data tiering and apply customer-managed encryption keys (CMEK) to your storage buckets. BigLake metastore integrates natively with Dataplex Universal Catalog, so governance policies defined centrally are enforced consistently across multiple engines, while also enabling semantic search, data lineage, profiling, and quality checks.
High performance analytics, streaming and AI with BigQuery
BigLake tables for Apache Iceberg offer an enterprise-ready, fully managed Iceberg experience when used with BigQuery. By storing Apache Iceberg data in your own Cloud Storage buckets and using BigQuery's highly scalable, real-time metadata management, you get the best of both worlds: the openness and data ownership of Cloud Storage, plus BigQuery's fully managed capabilities on Iceberg data for streaming, advanced analytics, and AI use cases.
How It Works
BigLake offers a native implementation of Apache Iceberg on Cloud Storage, so you can run BigQuery or your choice of open source engines directly on Iceberg data. BigLake metastore simplifies data management and integrates with Dataplex Universal Catalog for unified governance.
Common Uses
Build an open lakehouse with Iceberg
Understanding the Google Cloud components of an open data lakehouse
To build an Iceberg lakehouse with BigLake, start by storing your data in Cloud Storage. Then, define this data using BigLake tables for Apache Iceberg. The BigLake metastore serves as your centralized, serverless catalog for these Iceberg tables, eliminating the need to manage complex infrastructure. This setup allows any Iceberg-compatible engine to consistently access and manage your data, creating a unified, open, and scalable lakehouse environment with ease.
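The "define this data" step can be sketched as a BigQuery DDL statement. The helper below is a minimal sketch that assumes BigQuery's `CREATE TABLE ... WITH CONNECTION ... OPTIONS (...)` form with `table_format = 'ICEBERG'`; check the exact options against the current BigQuery DDL reference. The table name, connection name, columns, and bucket path are hypothetical.

```python
def iceberg_table_ddl(table, columns, connection, storage_uri):
    """Build a CREATE TABLE statement for an Iceberg table stored in Cloud Storage.

    Inputs are interpolated without escaping, so pass trusted values only.
    """
    cols = ",\n  ".join(f"{name} {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE TABLE `{table}` (\n  {cols}\n)\n"
        f"WITH CONNECTION `{connection}`\n"
        "OPTIONS (\n"
        "  file_format = 'PARQUET',\n"
        "  table_format = 'ICEBERG',\n"
        f"  storage_uri = '{storage_uri}'\n"
        ");"
    )


# Hypothetical names (assumptions): replace with your own project resources.
ddl = iceberg_table_ddl(
    table="my-project.lakehouse.transactions",
    columns=[("txn_id", "STRING"), ("amount", "NUMERIC"), ("event_time", "TIMESTAMP")],
    connection="my-project.us.lakehouse-connection",
    storage_uri="gs://example-bucket/iceberg/transactions",
)
print(ddl)
```

The generated statement can be pasted into the BigQuery console or submitted through a client library once the bucket and connection exist.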
Advanced analytics with BigQuery
Provide real-time insights and predictions for financial services
You can use Apache Iceberg for evolving data lake datasets like transactions or market feeds. BigLake enables BigQuery to then query Iceberg tables alongside native storage without data movement. You can ingest real-time streams into BigQuery, combining with historical Iceberg data via BigLake for immediate, comprehensive analysis. BigQuery ML then generates real-time insights like market volatility and fraud detection as well as predictive models like credit risk and customer behavior.
Enabling all data users on a single copy of data
BigLake provides secure, consistent access to a single copy of data in Cloud Storage. Dataplex Universal Catalog then automatically catalogs this data so that all data users and engines can access it. This ensures consistent data definitions, easy discovery, and unified governance, eliminating silos and fostering collaboration on one source of truth.
Pricing
How BigLake pricing works
BigLake pricing is based on table management, metadata storage, and metadata access.

| Services and usage | Description | Price (USD) |
|---|---|---|
| BigLake table management | Compute resources used for automatic table storage optimization. | Starting at $0.12 per DCU-hour |
| BigLake metadata storage | BigLake metastore charges for metadata stored. The free tier includes 1 GiB of metadata storage per month. | Starting at $0.04 per GiB per month |
| BigLake metadata access | Class A operations: charges for write, update, list, create, and config operations. The free tier includes 5,000 operations per month. | Starting at $6.00 per million operations |
| | Class B operations: charges for read, get, and delete operations. The free tier includes 50,000 operations per month. | Starting at $0.90 per million operations |
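To make the arithmetic concrete, here is a minimal estimator built from the "starting at" prices and free tiers listed above. It assumes each free tier is subtracted from monthly usage before billing and that table management has no free tier, neither of which this page states explicitly; regional prices and fees may differ, so treat the result as a rough figure only. The function and constant names are illustrative.

```python
from decimal import Decimal

# "Starting at" list prices from the pricing table (USD).
DCU_HOUR_RATE = Decimal("0.12")
METADATA_GIB_MONTH_RATE = Decimal("0.04")
CLASS_A_PER_MILLION = Decimal("6.00")
CLASS_B_PER_MILLION = Decimal("0.90")

# Monthly free tiers from the pricing table.
FREE_METADATA_GIB = Decimal(1)
FREE_CLASS_A_OPS = Decimal(5000)
FREE_CLASS_B_OPS = Decimal(50000)

MILLION = Decimal(1_000_000)


def _billable(used, free):
    """Usage above the free tier, never negative (assumed billing behavior)."""
    return max(Decimal(str(used)) - free, Decimal(0))


def estimate_monthly_cost(dcu_hours, metadata_gib, class_a_ops, class_b_ops):
    """Return a per-component cost breakdown plus a total, in USD."""
    costs = {
        "table_management": Decimal(str(dcu_hours)) * DCU_HOUR_RATE,
        "metadata_storage": _billable(metadata_gib, FREE_METADATA_GIB) * METADATA_GIB_MONTH_RATE,
        "class_a_access": _billable(class_a_ops, FREE_CLASS_A_OPS) / MILLION * CLASS_A_PER_MILLION,
        "class_b_access": _billable(class_b_ops, FREE_CLASS_B_OPS) / MILLION * CLASS_B_PER_MILLION,
    }
    costs["total"] = sum(costs.values(), Decimal(0))
    return costs


# Example month: 100 DCU-hours, 11 GiB of metadata,
# 1,005,000 Class A operations, 2,050,000 Class B operations.
estimate = estimate_monthly_cost(100, 11, 1_005_000, 2_050_000)
for component, usd in estimate.items():
    print(f"{component}: ${usd:.2f}")
```

`Decimal` is used instead of floats so that small per-unit prices add up without rounding drift.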
Pricing calculator
Estimate your monthly BigLake costs, including region-specific pricing and fees.
Custom quote
Connect with our sales team to get a custom quote for your organization.
BigLake tables for Apache Iceberg
Use the Apache Iceberg REST catalog
Query Apache Iceberg data