This document describes the data manipulation capabilities available with BigQuery DataFrames. The functions described here are available in the bigframes.bigquery library.

Required roles

To get the permissions that you need to complete the tasks in this document, ask your administrator to grant you the following IAM roles on your project:

For more information about granting roles, see Manage access to projects, folders, and organizations.

You might also be able to get the required permissions through custom roles or other predefined roles.

When you perform end user authentication in an interactive environment like a notebook, Python REPL, or the command line, BigQuery DataFrames prompts for authentication, if needed. Otherwise, see how to set up application default credentials for various environments.

pandas API

A notable feature of BigQuery DataFrames is that the bigframes.pandas API is designed to be similar to APIs in the pandas library. This design lets you employ familiar syntax patterns for data manipulation tasks. Operations defined through the BigQuery DataFrames API are executed server-side, operating directly on data stored within BigQuery and eliminating the need to transfer datasets out of BigQuery.

To check which pandas APIs are supported by BigQuery DataFrames, see Supported pandas APIs.

Inspect and manipulate data

You can use the bigframes.pandas API to perform data inspection and calculation operations. For example, you can inspect the body_mass_g column, calculate the mean body_mass_g, and calculate the mean body_mass_g by species.
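
The following is a minimal sketch of those three operations, not a definitive implementation. The function names summarize_body_mass and load_penguins are hypothetical, and the public table bigquery-public-data.ml_datasets.penguins is an assumption that this document does not name. Because the sketch uses only calls that pandas and bigframes.pandas share, the summary function accepts either kind of DataFrame.

```python
def summarize_body_mass(df):
    """Return (overall mean, per-species means) of the body_mass_g column.

    Uses only calls shared by pandas and bigframes.pandas, so `df` can be
    either kind of DataFrame.
    """
    # Selecting one column yields a Series that you can inspect on its own.
    body_mass = df["body_mass_g"]

    # Mean over the whole column.
    overall_mean = body_mass.mean()

    # Mean per species, heaviest species first.
    mean_by_species = (
        body_mass.groupby(by=df["species"]).mean().sort_values(ascending=False)
    )
    return overall_mean, mean_by_species


def load_penguins():
    """Load the assumed public penguins table as a BigQuery DataFrame."""
    # Imported lazily: requires the bigframes package and Google Cloud
    # credentials with access to BigQuery.
    import bigframes.pandas as bpd

    # Assumption: this table has body_mass_g and species columns.
    return bpd.read_gbq("bigquery-public-data.ml_datasets.penguins")


# Usage (runs queries in BigQuery):
# overall, by_species = summarize_body_mass(load_penguins())
# print(overall)
# print(by_species.head(10))
```

With a BigQuery DataFrame as input, the mean and groupby steps run in BigQuery rather than on the client.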

BigQuery library

The BigQuery library provides BigQuery SQL functions that might not have a pandas equivalent. The following sections present some examples.

Process array values

You can use the bigframes.bigquery.array_agg() function to aggregate values into arrays after a groupby operation.
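
As a hedged sketch, the following groups six integers by parity and collects each group into an array. The function names group_into_arrays and expected_groups are hypothetical. The second function is a pure-Python picture of the grouping for comparison, not part of the library.

```python
def group_into_arrays():
    """Aggregate even and odd values into one array per group."""
    # Imported lazily: requires the bigframes package and Google Cloud
    # credentials with access to BigQuery.
    import bigframes.bigquery as bbq
    import bigframes.pandas as bpd

    values = bpd.Series([0, 1, 2, 3, 4, 5])

    # Group by whether each value is even, then collect every group's
    # values into a single array.
    return bbq.array_agg(values.groupby(values % 2 == 0))


def expected_groups(values):
    """Pure-Python picture of the same grouping: {is_even: [values, ...]}."""
    groups = {}
    for value in values:
        groups.setdefault(value % 2 == 0, []).append(value)
    return groups


# Usage (runs a query in BigQuery):
# print(group_into_arrays())
```

The result of group_into_arrays() is a Series indexed by the group key, with one array value per group.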

You can also use the array_length() and array_to_string() array functions.
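
A small sketch of both functions follows, under the assumption that the input Series holds arrays of strings. The names describe_arrays and expected_arrays are hypothetical, and expected_arrays is only a pure-Python cross-check of what each function is expected to compute per row.

```python
def describe_arrays():
    """Return (array lengths, joined strings) for a Series of string arrays."""
    # Imported lazily: requires the bigframes package and Google Cloud
    # credentials with access to BigQuery.
    import bigframes.bigquery as bbq
    import bigframes.pandas as bpd

    words = bpd.Series([["Hello", "World"], ["Hi"]])

    # Number of elements in each array.
    lengths = bbq.array_length(words)

    # Each array joined into one string, separated by the delimiter.
    joined = bbq.array_to_string(words, delimiter=" ")
    return lengths, joined


def expected_arrays(rows, delimiter=" "):
    """Pure-Python cross-check of the per-row length and joined string."""
    return [len(row) for row in rows], [delimiter.join(row) for row in rows]
```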

Create a struct Series object

You can use the bigframes.bigquery.struct() function to create a new struct Series object with one subfield for each column in a DataFrame.
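
The sketch below packs three measurement columns into one struct Series. The table bigquery-public-data.ml_datasets.penguins and the three column names are assumptions that do not come from this document, and lengths_as_struct is a hypothetical function name.

```python
# Assumed column names in the assumed public penguins table.
LENGTH_COLUMNS = ["culmen_length_mm", "culmen_depth_mm", "flipper_length_mm"]


def lengths_as_struct():
    """Return a struct Series with one subfield per measurement column."""
    # Imported lazily: requires the bigframes package and Google Cloud
    # credentials with access to BigQuery.
    import bigframes.bigquery as bbq
    import bigframes.pandas as bpd

    bq_df = bpd.read_gbq("bigquery-public-data.ml_datasets.penguins")

    # Each row becomes one STRUCT value whose subfields are named after
    # the selected columns.
    return bbq.struct(bq_df[LENGTH_COLUMNS])


# Usage (runs a query in BigQuery):
# print(lengths_as_struct().peek())
```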

Convert timestamps to Unix epochs

You can use the bigframes.bigquery.unix_micros() function to convert timestamps into Unix microseconds.
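
As an illustration, the following sketch converts three UTC timestamps, one day apart and starting at the Unix epoch. The names timestamps_to_micros and expected_unix_micros are hypothetical. The second function is a pure-Python cross-check of the conversion, not part of the library.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def expected_unix_micros(ts):
    """Pure-Python cross-check: whole microseconds since the Unix epoch."""
    return (ts - EPOCH) // timedelta(microseconds=1)


def timestamps_to_micros():
    """Convert three daily UTC timestamps to Unix microseconds."""
    # Imported lazily: requires pandas, the bigframes package, and Google
    # Cloud credentials with access to BigQuery.
    import pandas as pd

    import bigframes.bigquery as bbq
    import bigframes.pandas as bpd

    # Timestamps: 1970-01-01, 1970-01-02, and 1970-01-03 (all UTC).
    timestamps = bpd.Series(
        pd.date_range("1970-01-01", periods=3, freq="D", tz="UTC")
    )
    return bbq.unix_micros(timestamps)
```

For these inputs, the cross-check gives 0, 86400000000, and 172800000000 microseconds, which is what the conversion should return for the three rows.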

You can also use the unix_seconds() and unix_millis() time functions.
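
A brief sketch of the coarser conversions follows, with hypothetical function names. It assumes that these functions follow the documented behavior of the BigQuery UNIX_SECONDS and UNIX_MILLIS SQL functions, which truncate higher precision by rounding down. The pure-Python helper models that rounding with floor division.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def expected_unix(ts, unit):
    """Pure-Python cross-check: whole `unit`s since the epoch, rounded down."""
    return (ts - EPOCH) // unit


def timestamps_to_seconds_and_millis():
    """Convert sub-second UTC timestamps to Unix seconds and milliseconds."""
    # Imported lazily: requires pandas, the bigframes package, and Google
    # Cloud credentials with access to BigQuery.
    import pandas as pd

    import bigframes.bigquery as bbq
    import bigframes.pandas as bpd

    # Two timestamps that start 1.5 seconds after the epoch, one day apart.
    timestamps = bpd.Series(
        pd.date_range("1970-01-01 00:00:01.5", periods=2, freq="D", tz="UTC")
    )
    return bbq.unix_seconds(timestamps), bbq.unix_millis(timestamps)
```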

Use the SQL scalar function

You can use the bigframes.bigquery.sql_scalar() function to evaluate arbitrary SQL syntax that represents a single-column expression.
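
For instance, the sketch below uses the SQL LEAST function to find the smallest of three measurements in each row. The table bigquery-public-data.ml_datasets.penguins and its column names are assumptions that do not come from this document, and shortest_measurement is a hypothetical function name. Each numbered placeholder in the template corresponds to the column at the same position in the columns list.

```python
# SQL expression with one numbered placeholder per input column.
SQL_TEMPLATE = "LEAST({0}, {1}, {2})"

# Assumed column names in the assumed public penguins table.
COLUMNS = ["culmen_depth_mm", "culmen_length_mm", "flipper_length_mm"]


def shortest_measurement():
    """Return a Series holding the smallest of three measurements per row."""
    # Imported lazily: requires the bigframes package and Google Cloud
    # credentials with access to BigQuery.
    import bigframes.bigquery as bbq
    import bigframes.pandas as bpd

    bq_df = bpd.read_gbq("bigquery-public-data.ml_datasets.penguins")

    # The columns are passed in the same order as the placeholders.
    return bbq.sql_scalar(
        SQL_TEMPLATE,
        columns=[bq_df[name] for name in COLUMNS],
    )


# Usage (runs a query in BigQuery):
# print(shortest_measurement().peek())
```

This approach suits expressions that are unsupported or awkward to write with the bigframes.pandas API, at the cost of embedding raw SQL in Python code.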

Last updated 2026-03-30 UTC.