Computing numerical and categorical statistics

You can use Sensitive Data Protection to compute numerical and categorical statistics for individual columns in BigQuery tables. Sensitive Data Protection can calculate the following:

  • The column's minimum value
  • The column's maximum value
  • Quantile values for the column
  • A histogram of value frequencies in the column

Compute numerical statistics

You can determine minimum, maximum, and quantile values for an individual BigQuery column. To calculate these values, you configure a DlpJob, setting the NumericalStatsConfig privacy metric to the name of the column to scan. When you run the job, Sensitive Data Protection computes statistics for the given column and returns the results in the NumericalStatsResult object. Sensitive Data Protection can compute statistics for columns of the following types:

  • integer
  • float
  • date
  • datetime
  • timestamp
  • time

A scan returns the minimum value, the maximum value, and 99 quantile values that partition the set of field values into 100 equal-sized buckets.
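To make the quantile output concrete, the following Python sketch computes the same kind of summary locally with the standard library. It only illustrates what 99 cut points look like. The sample data is made up, and the service's own quantile method may differ from `statistics.quantiles`.

```python
import statistics

# Hypothetical column values; in practice these would live in a BigQuery column.
ages = list(range(18, 90)) * 3

# n=100 requests the 99 cut points that split the data into 100 buckets
# holding roughly equal numbers of values.
cut_points = statistics.quantiles(ages, n=100, method="inclusive")

print("min:", min(ages))
print("max:", max(ages))
print("number of quantile values:", len(cut_points))
print("50th cut point (median):", cut_points[49])
```

The 50th of the 99 cut points is the median, so the quantile list can be read as percentiles 1 through 99.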

Code examples

Sample code that calculates numerical statistics is available for the following languages: C#, Go, Java, Node.js, PHP, and Python.

To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.

To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
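As a rough illustration of the steps described earlier, the following Python sketch builds a risk job that sets the NumericalStatsConfig privacy metric and then reads the NumericalStatsResult. It assumes the `google-cloud-dlp` client library is installed and that Application Default Credentials are configured. The project, dataset, table, and column names are placeholders, and polling with `get_dlp_job` is a simplification chosen for brevity rather than the only way to wait for a job.

```python
import time


def build_numerical_stats_job(project_id, dataset_id, table_id, column_name):
    """Return a risk job config (as a plain dict) for one BigQuery column."""
    return {
        "privacy_metric": {
            "numerical_stats_config": {"field": {"name": column_name}}
        },
        "source_table": {
            "project_id": project_id,
            "dataset_id": dataset_id,
            "table_id": table_id,
        },
    }


def run_numerical_stats(project_id, risk_job, poll_seconds=5, max_polls=60):
    """Submit the job, poll until it finishes, and return its result object."""
    # Imported here so the config builder works without the client library.
    from google.cloud import dlp_v2

    client = dlp_v2.DlpServiceClient()
    parent = f"projects/{project_id}/locations/global"
    job = client.create_dlp_job(request={"parent": parent, "risk_job": risk_job})

    failed_states = (dlp_v2.DlpJob.JobState.FAILED, dlp_v2.DlpJob.JobState.CANCELED)
    for _ in range(max_polls):
        job = client.get_dlp_job(request={"name": job.name})
        if job.state == dlp_v2.DlpJob.JobState.DONE:
            return job.risk_details.numerical_stats_result
        if job.state in failed_states:
            raise RuntimeError(f"Job {job.name} ended in state {job.state.name}")
        time.sleep(poll_seconds)
    raise TimeoutError(f"Job {job.name} did not finish in time")


# Placeholder identifiers; replace them with real ones before submitting.
risk_job = build_numerical_stats_job("my-project", "my_dataset", "my_table", "age")
print(risk_job["privacy_metric"])

# To run against a real project (needs credentials and the client library):
# result = run_numerical_stats("my-project", risk_job)
# print(result.min_value, result.max_value, len(result.quantile_values))
```

Building the configuration as a plain dictionary keeps the request easy to inspect before anything is sent to the service.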

Compute categorical statistics

You can compute categorical statistics for the individual histogram buckets within a BigQuery column, including:

  • Upper bound on value frequency within a given bucket
  • Lower bound on value frequency within a given bucket
  • Size of a given bucket
  • A sample of value frequencies within a given bucket (maximum 20)

To calculate these values, you configure a DlpJob, setting the CategoricalStatsConfig privacy metric to the name of the column to scan. When you run the job, Sensitive Data Protection computes statistics for the given column, returning its results in the CategoricalStatsResult object.
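To show what a value-frequency histogram bucket describes, the following Python sketch builds comparable buckets locally. It is only an illustration under stated assumptions: the data is made up, each bucket covers exactly one frequency (the service chooses its own frequency ranges), and the bucket size is assumed to count the rows covered by the bucket.

```python
from collections import Counter, defaultdict

# Hypothetical categorical column values.
colors = ["red", "red", "red", "blue", "blue", "green", "teal", "gray"]

# Step 1: count how often each distinct value occurs.
value_counts = Counter(colors)

# Step 2: group distinct values by their frequency.
# Assumption: one bucket per exact frequency.
by_frequency = defaultdict(list)
for value, count in value_counts.items():
    by_frequency[count].append(value)

buckets = []
for frequency in sorted(by_frequency):
    values = by_frequency[frequency]
    buckets.append({
        "frequency_lower_bound": frequency,
        "frequency_upper_bound": frequency,
        # Assumption: size counts rows, not distinct values.
        "bucket_size": frequency * len(values),
        # Keep at most 20 sample values per bucket.
        "sample_values": sorted(values)[:20],
    })

for bucket in buckets:
    print(bucket)
```

In this toy data set, the values that occur once land in one bucket, and the more frequent values land in buckets of their own.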

Code examples

Sample code that calculates categorical statistics is available for the following languages: C#, Go, Java, Node.js, PHP, and Python.

To learn how to install and use the client library for Sensitive Data Protection, see Sensitive Data Protection client libraries.

To authenticate to Sensitive Data Protection, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
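The following Python sketch outlines how a categorical statistics job might be configured and how the histogram buckets in a CategoricalStatsResult could be summarized. It assumes the `google-cloud-dlp` client library and Application Default Credentials for the submission step. The identifiers are placeholders, and the stand-in result object exists only so that the formatting helper can be tried without calling the service.

```python
from types import SimpleNamespace


def build_categorical_stats_job(project_id, dataset_id, table_id, column_name):
    """Return a risk job config (as a plain dict) for one BigQuery column."""
    return {
        "privacy_metric": {
            "categorical_stats_config": {"field": {"name": column_name}}
        },
        "source_table": {
            "project_id": project_id,
            "dataset_id": dataset_id,
            "table_id": table_id,
        },
    }


def submit_job(project_id, risk_job):
    """Create the DlpJob; the caller waits for it to finish separately."""
    # Imported here so the other helpers work without the client library.
    from google.cloud import dlp_v2

    client = dlp_v2.DlpServiceClient()
    parent = f"projects/{project_id}/locations/global"
    return client.create_dlp_job(request={"parent": parent, "risk_job": risk_job})


def describe_buckets(result):
    """Format each histogram bucket of a categorical result as one line."""
    lines = []
    for i, bucket in enumerate(result.value_frequency_histogram_buckets):
        lines.append(
            f"bucket {i}: value frequency "
            f"{bucket.value_frequency_lower_bound} to "
            f"{bucket.value_frequency_upper_bound}, "
            f"size {bucket.bucket_size}, "
            f"{len(bucket.bucket_values)} sampled values"
        )
    return lines


risk_job = build_categorical_stats_job("my-project", "my_dataset", "my_table", "color")

# Stand-in for job.risk_details.categorical_stats_result of a finished job.
fake_result = SimpleNamespace(
    value_frequency_histogram_buckets=[
        SimpleNamespace(
            value_frequency_lower_bound=1,
            value_frequency_upper_bound=3,
            bucket_size=5,
            bucket_values=[
                SimpleNamespace(value="red", count=3),
                SimpleNamespace(value="blue", count=2),
            ],
        )
    ]
)
summary = describe_buckets(fake_result)
print("\n".join(summary))
```

With a real job, `describe_buckets` would receive `job.risk_details.categorical_stats_result` once the job reaches the done state.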

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2026-02-19 UTC.