The ML.PRINCIPAL_COMPONENTS function

This document describes the ML.PRINCIPAL_COMPONENTS function, which lets you see the principal components of a principal component analysis (PCA) model. In PCA models, principal components and eigenvectors are the same concept.

Syntax

ML.PRINCIPAL_COMPONENTS(
  MODEL `PROJECT_ID.DATASET.MODEL`
)

Arguments

ML.PRINCIPAL_COMPONENTS takes the following arguments:

PROJECT_ID: your project ID.
DATASET: the BigQuery dataset that contains the model.
MODEL: the name of the model.

Output

ML.PRINCIPAL_COMPONENTS returns the following columns:

principal_component_id: an INT64 value that contains the principal component ID.
feature: a STRING value that contains the feature column name.
numerical_value: a FLOAT64 value. If the column identified by feature is numeric, numerical_value contains the feature value for the principal component that principal_component_id identifies. Otherwise, numerical_value is NULL.
categorical_value: an ARRAY<STRUCT> value that contains information about categorical features. Each struct contains the following fields:
- categorical_value.category: a STRING value that contains the name of each category.
- categorical_value.value: a FLOAT64 value that contains the value of categorical_value.category for the principal component that principal_component_id identifies.
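Because categorical_value is an array of structs, it is often easier to read after flattening. The following query is a minimal sketch of one way to do that, assuming a PCA model named mydataset.mymodel that was trained with at least one categorical feature. The alias cv is illustrative only.

```sql
-- Sketch: one output row per (principal component, feature, category).
-- Assumes `mydataset.mymodel` is a PCA model with categorical features.
-- Rows for numeric features are dropped by the join with UNNEST,
-- because they carry no categorical_value elements.
SELECT
  principal_component_id,
  feature,
  cv.category,
  cv.value
FROM
  ML.PRINCIPAL_COMPONENTS(MODEL `mydataset.mymodel`),
  UNNEST(categorical_value) AS cv
ORDER BY
  principal_component_id,
  feature,
  cv.category
```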

The output is in descending order by the eigenvalues of the principal components, which you can get by using the ML.PRINCIPAL_COMPONENT_INFO function.
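To see each component's eigenvalue next to its feature values, you can join the two functions on principal_component_id. The following query is a sketch under the assumption that mydataset.mymodel is a PCA model. The aliases pc and info are illustrative, and the query keeps only numeric features to stay short.

```sql
-- Sketch: pair each principal component's numeric feature values
-- with the eigenvalue and explained variance ratio of that component.
-- Assumes `mydataset.mymodel` is a PCA model.
SELECT
  pc.principal_component_id,
  info.eigenvalue,
  info.explained_variance_ratio,
  pc.feature,
  pc.numerical_value
FROM
  ML.PRINCIPAL_COMPONENTS(MODEL `mydataset.mymodel`) AS pc
JOIN
  ML.PRINCIPAL_COMPONENT_INFO(MODEL `mydataset.mymodel`) AS info
ON
  pc.principal_component_id = info.principal_component_id
WHERE
  pc.numerical_value IS NOT NULL
ORDER BY
  info.eigenvalue DESC,
  pc.feature
```

The explicit ORDER BY makes the ordering of the joined result deterministic rather than relying on the order in which either function returns its rows.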

Example

The following example retrieves the principal components from the model mydataset.mymodel in your default project:

SELECT
  *
FROM
  ML.PRINCIPAL_COMPONENTS(MODEL `mydataset.mymodel`)
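
A common follow-up is to find which numeric features have the largest values, in absolute terms, within each component. The following query is one possible sketch, assuming the same mydataset.mymodel model. The cutoff of three features per component is an arbitrary choice for illustration.

```sql
-- Sketch: for each principal component, keep the three numeric
-- features with the largest absolute numerical_value.
-- Assumes `mydataset.mymodel` is a PCA model with numeric features.
SELECT
  principal_component_id,
  feature,
  numerical_value
FROM
  ML.PRINCIPAL_COMPONENTS(MODEL `mydataset.mymodel`)
WHERE
  numerical_value IS NOT NULL
QUALIFY
  ROW_NUMBER() OVER (
    PARTITION BY principal_component_id
    ORDER BY ABS(numerical_value) DESC
  ) <= 3
ORDER BY
  principal_component_id,
  ABS(numerical_value) DESC
```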

What's next

For information about model weights support in BigQuery ML, see BigQuery ML model weights overview.
For information about the supported SQL statements and functions for each model type, see End-to-end user journey for each model.