BigQuery DataFrames (BigFrames)#
BigQuery DataFrames (also known as BigFrames) provides a Pythonic DataFrame and machine learning (ML) API powered by the BigQuery engine. It provides modules for many use cases, including:
- bigframes.pandas is a pandas API for analytics. Many workloads can be migrated from pandas to bigframes by just changing a few imports.
- bigframes.ml is a scikit-learn-like API for ML.
- bigframes.bigquery.ai is a collection of powerful AI methods, powered by Gemini.
BigQuery DataFrames is an open-source package.
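As a rough illustration of the "changing a few imports" claim for bigframes.pandas, the sketch below writes an aggregation against the shared DataFrame API and runs it on a small local pandas frame. The helper name `top_totals` and the sample data are hypothetical. The assumption is that the same function body would also accept a DataFrame created through `bigframes.pandas`, since it only uses `groupby`, `agg`, `sort_values`, and `head`. BigFrames does not cover every pandas method, so check each call you rely on before migrating.

```python
# Assumption: replacing this import with `import bigframes.pandas as bpd`
# (and `pd.` with `bpd.`) would run the same logic on the BigQuery engine.
import pandas as pd


def top_totals(df, key, value, n=3):
    """Sum `value` per `key` and return the rows with the n largest totals."""
    return (
        df.groupby(key)
        .agg({value: "sum"})
        .sort_values(value, ascending=False)
        .head(n)
    )


# Hypothetical sample data standing in for a BigQuery table.
local_df = pd.DataFrame(
    {
        "name": ["Ada", "Grace", "Ada", "Linus", "Grace", "Ada"],
        "number": [5, 7, 1, 2, 4, 3],
    }
)

result = top_totals(local_df, key="name", value="number", n=2)
print(result)
```

Keeping the logic in a function that only touches DataFrame methods is one way to make such a migration easier to test: the function can be checked locally on a small pandas frame before it is pointed at a large table.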
Getting started with BigQuery DataFrames#
The easiest way to get started is to try the BigFrames quickstart in a notebook in BigQuery Studio.
To use BigFrames in your local development environment:
1. Run pip install --upgrade bigframes to install the latest version.
2. Set up Application Default Credentials for your local development environment.
3. Use the bigframes package to query data.
    import bigframes.pandas as bpd

    bpd.options.bigquery.project = your_gcp_project_id  # Optional in BQ Studio.
    bpd.options.bigquery.ordering_mode = "partial"  # Recommended for performance.
    df = bpd.read_gbq("bigquery-public-data.usa_names.usa_1910_2013")
    print(
        df.groupby("name")
        .agg({"number": "sum"})
        .sort_values("number", ascending=False)
        .head(10)
        .to_pandas()
    )
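Before running a query like this locally, it can help to confirm that Application Default Credentials (ADC) are in place. ADC is commonly set up with `gcloud auth application-default login`, which writes a JSON file to a well-known location. The following is a minimal sketch that uses only the standard library and looks in the default locations. The helper name `find_adc_file` is hypothetical, and the check is deliberately incomplete: it does not detect credentials supplied by a metadata server or a customized gcloud configuration directory.

```python
import os
from pathlib import Path

ADC_FILENAME = "application_default_credentials.json"


def find_adc_file(environ=None, home=None):
    """Return the path of a local ADC credentials file, or None if not found.

    Only default file locations are checked (an assumption of this sketch).
    """
    environ = os.environ if environ is None else environ
    home = Path.home() if home is None else Path(home)

    # 1. An explicit credentials file named by GOOGLE_APPLICATION_CREDENTIALS.
    explicit = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit and Path(explicit).is_file():
        return Path(explicit)

    # 2. The file written by `gcloud auth application-default login`.
    if os.name == "nt":
        appdata = environ.get("APPDATA")
        if not appdata:
            return None
        candidate = Path(appdata) / "gcloud" / ADC_FILENAME
    else:
        candidate = home / ".config" / "gcloud" / ADC_FILENAME
    return candidate if candidate.is_file() else None


if __name__ == "__main__":
    found = find_adc_file()
    print(f"ADC file: {found}" if found else "No local ADC file found.")
```

A `None` result does not prove that authentication will fail; it only means no credentials file was found in the locations this sketch checks.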
Documentation#
To learn more about BigQuery DataFrames, visit these pages
License#
BigQuery DataFrames is distributed with the Apache-2.0 license.
It also contains code derived from the following third-party packages:
For details, see the third_party directory.
Contact Us#
For further help or to provide feedback, email us at bigframes-feedback@google.com.
DataFrames
Data Types
Generative AI
- AI Functions
- AI Functions for Poster Analysis
- AI Forecast
- Use BigQuery DataFrames with Generative AI for code generation
- Define the LLM model
- Read data from Cloud Storage into BigQuery DataFrames
- Generate code using the LLM model
- Manipulate LLM output using a remote function
- Save the results to Cloud Storage
- Summary and next steps
- Use BigQuery DataFrames to cluster and characterize complaints
- Summary and next steps
- LLM Output Schema
- Build a Vector Search application using BigQuery DataFrames (aka BigFrames)
- Summary and next steps
- BigQuery DataFrames ML: Drug Name Generation
- Bulk generation
- Large Language Models
Machine Learning
- ML Cross Validation
- Train a linear regression model with BigQuery DataFrames ML
- Summary and next steps
- Train a linear regression model with BigQuery DataFrames ML
- Compatibility with bigframes.ml
- Summary and next steps
- Train a linear regression model with BigQuery DataFrames ML
- Summary and next steps
- Easy Linear Regression
- Sklearn Linear Regression
- Timeseries Analysis
Visualization
Remote Functions
- Remote Function
- Set Up
- Notes
- Self-contained function
- Function referring to variables outside the function body
- Function referring to imports (built-in) outside the function body
- Function referring to another function outside the function body
- Function requiring external packages
- Function referring to imports (third-party) outside the function body
- Clean Up
- Use BigQuery DataFrames to run Anthropic LLM at scale
- Overview
- Set Up
- Initialize BigQuery DataFrames dataframe
- Use BigQuery DataFrames remote_function
- Clean Up
Kaggle
- BigQuery DataFrames (BigFrames) AI Forecast
- 4. Process the raw result and draw a line plot along with the training data
- Describe product images with BigFrames multimodal DataFrames
- 2. Combine unstructured data with structured data
- 3. Conduct image transformations
- 4. Use LLM models to ask questions and generate embeddings on images
- Vector Search Over National Jukebox