IMPORTANT: Developer documentation for the current main branch.
This content is unreleased and may change before the next Polaris release.
For stable user documentation, see the latest release docs.
BigQuery Metastore Federation
Polaris can federate catalog operations to a BigQuery Metastore catalog. This lets BigQuery Metastore remain the source of truth for Iceberg table metadata while Polaris brokers access, policies, and multi-engine connectivity.
Build-time enablement
The BigQuery factory is packaged as an optional extension and is not baked into default server
builds. Include it when assembling the runtime or container images by setting the NonRESTCatalogs
Gradle property to include BIGQUERY (and any other non-REST backends you need):
./gradlew :polaris-server:assemble :polaris-server:quarkusAppPartsBuild --rerun \
  -PNonRESTCatalogs=BIGQUERY -Dquarkus.container-image.build=true
runtime/server/build.gradle.kts wires the extension in only when this flag is present, so binaries
built without it will reject BigQuery federation requests.
Feature configuration
After building Polaris with BigQuery Metastore support, enable the necessary feature flags in your
application.properties file (or equivalent configuration mechanism such as environment variables
or a Kubernetes ConfigMap):
# Allows BIGQUERY connection type
polaris.features."SUPPORTED_CATALOG_CONNECTION_TYPES"=["BIGQUERY"]

# Allows IMPLICIT authentication, needed for BigQuery Metastore federation
polaris.features."SUPPORTED_EXTERNAL_CATALOG_AUTHENTICATION_TYPES"=["IMPLICIT"]

# Enables the federation feature itself
polaris.features."ENABLE_CATALOG_FEDERATION"=true
For Kubernetes deployments, add these properties to the ConfigMap mounted into the Polaris
container (typically at /deployment/config/application.properties).
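As a quick pre-deployment check, the three flags above can be verified against the text of a properties file. The following is a minimal sketch, not part of Polaris: the helper names `parse_properties` and `missing_federation_flags` are hypothetical, and the parser only handles plain `key=value` lines (no escapes or line continuations), assuming list values are written as JSON arrays as shown above.

```python
import json

# Flags that must list a specific value, taken from the configuration above.
LIST_FLAGS = {
    'polaris.features."SUPPORTED_CATALOG_CONNECTION_TYPES"': "BIGQUERY",
    'polaris.features."SUPPORTED_EXTERNAL_CATALOG_AUTHENTICATION_TYPES"': "IMPLICIT",
}
ENABLE_FLAG = 'polaris.features."ENABLE_CATALOG_FEDERATION"'


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def missing_federation_flags(text):
    """Return a list of problems; an empty list means all three flags look right."""
    props = parse_properties(text)
    problems = []
    for key, needed in LIST_FLAGS.items():
        raw = props.get(key)
        try:
            values = json.loads(raw) if raw is not None else []
        except json.JSONDecodeError:
            values = []
        if not isinstance(values, list) or needed not in values:
            problems.append(f"{key} must include {needed}")
    if props.get(ENABLE_FLAG, "").lower() != "true":
        problems.append(f"{ENABLE_FLAG} must be true")
    return problems
```

Calling `missing_federation_flags(open(path).read())` on the mounted file returns an empty list when all three flags are present. Configuration supplied through environment variables is outside the scope of this sketch.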
Runtime requirements
- BigQuery API access: The Polaris deployment must be able to reach the BigQuery API (bigquery.googleapis.com) over HTTPS.
- Authentication: BigQuery Metastore federation only supports IMPLICIT authentication, meaning Polaris uses Google Application Default Credentials (ADC) resolved at process startup. ADC is resolved from the GOOGLE_APPLICATION_CREDENTIALS environment variable, an attached service account on GCP compute, or local gcloud credentials during development. See Google’s Application Default Credentials documentation for details. Ensure valid credentials are available before starting the server.
- IAM: The identity used by Polaris must have read access to BigQuery datasets in the target project and read access to the GCS warehouse bucket (read/write if Polaris will commit table metadata).
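Because credentials must be in place before the server starts, it can help to check which ADC source the process is likely to pick up. The sketch below is illustrative only: `likely_adc_source` is a hypothetical helper, it assumes the Linux/macOS location of the gcloud credentials file (Windows uses a different directory), and it cannot confirm that an attached service account actually exists, so that case is reported as unverified.

```python
import os
from pathlib import Path


def likely_adc_source(env=None, home=None):
    """Guess which ADC source applies, following Google's documented search order."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    # 1. Explicit credentials file named by the environment variable.
    key_path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if key_path:
        if Path(key_path).is_file():
            return "GOOGLE_APPLICATION_CREDENTIALS"
        return "invalid: GOOGLE_APPLICATION_CREDENTIALS points to a missing file"

    # 2. Local gcloud credentials, typical for development machines.
    gcloud_file = home / ".config" / "gcloud" / "application_default_credentials.json"
    if gcloud_file.is_file():
        return "gcloud"

    # 3. Otherwise ADC falls back to the service account attached to the
    #    compute resource; this sketch does not probe for one.
    return "attached service account (unverified)"
```

Running `print(likely_adc_source())` in the same environment as the Polaris process gives a rough indication of the identity that the IAM grants need to target.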
Creating a federated catalog
Use the Management API to create an external catalog whose connection type is BIGQUERY. The
following request registers a catalog that proxies to BigQuery Metastore in the GCP project
my-gcp-project, using analytics_dataset as the default warehouse:
curl -X POST https://<polaris-host>/management/v1/catalogs \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "type": "EXTERNAL",
    "name": "analytics_bigquery",
    "storageConfigInfo": {
      "storageType": "GCS"
    },
    "properties": {
      "default-base-location": "gs://analytics-bucket/warehouse/"
    },
    "connectionConfigInfo": {
      "connectionType": "BIGQUERY",
      "properties": {
        "gcp.bigquery.project-id": "my-gcp-project",
        "warehouse": "analytics_dataset"
      },
      "authenticationParameters": {
        "authenticationType": "IMPLICIT"
      }
    }
  }'
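For automation, the same request can be assembled in code. This is a minimal sketch using only the Python standard library; `build_create_catalog_request` is a hypothetical helper, and the body simply mirrors the curl payload above rather than any client library's API.

```python
import json
import urllib.request


def build_create_catalog_request(base_url, token, name, project_id, dataset, base_location):
    """Build (but do not send) the POST request that registers the federated catalog."""
    payload = {
        "type": "EXTERNAL",
        "name": name,
        "storageConfigInfo": {"storageType": "GCS"},
        "properties": {"default-base-location": base_location},
        "connectionConfigInfo": {
            "connectionType": "BIGQUERY",
            "properties": {
                "gcp.bigquery.project-id": project_id,
                "warehouse": dataset,
            },
            # IMPLICIT is the only authentication type this federation supports.
            "authenticationParameters": {"authenticationType": "IMPLICIT"},
        },
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/management/v1/catalogs",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends the request. Keeping construction separate from sending makes the payload easy to inspect or log before it reaches the server.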
The connectionConfigInfo.properties map carries BigQuery-specific configuration consumed by
Iceberg’s BigQueryProperties:
- gcp.bigquery.project-id (required): the GCP project that owns the BigQuery datasets.
- warehouse (required): the BigQuery dataset name used as the default warehouse for new tables.
- gcp.bigquery.location (optional): the BigQuery location (region) of the datasets.
- gcp.bigquery.list-all-tables (optional): when true (the default), listTables returns every table in a BigQuery dataset regardless of type. Set to false to filter to BigQuery Metastore Iceberg tables only.
The following optional properties let Polaris assume a different service account for BigQuery calls via service account impersonation:
- gcp.bigquery.impersonate.service-account: target service account email.
- gcp.bigquery.impersonate.lifetime-seconds: lifetime of the impersonated credential.
- gcp.bigquery.impersonate.scopes: OAuth scopes for the impersonated credential.
- gcp.bigquery.impersonate.delegates: chain of delegate service accounts.
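A typo in a property key is easy to miss in a hand-written JSON body. The sketch below checks a properties map against the keys documented on this page. `check_connection_properties` is a hypothetical helper, and an "unrecognized" key is not necessarily invalid, since the underlying Iceberg catalog may accept properties that are not listed here.

```python
# Keys documented above for connectionConfigInfo.properties.
REQUIRED_KEYS = {"gcp.bigquery.project-id", "warehouse"}
OPTIONAL_KEYS = {
    "gcp.bigquery.location",
    "gcp.bigquery.list-all-tables",
    "gcp.bigquery.impersonate.service-account",
    "gcp.bigquery.impersonate.lifetime-seconds",
    "gcp.bigquery.impersonate.scopes",
    "gcp.bigquery.impersonate.delegates",
}


def check_connection_properties(props):
    """Return (missing_required, unrecognized) as sorted lists of keys."""
    keys = set(props)
    missing = sorted(REQUIRED_KEYS - keys)
    unrecognized = sorted(keys - REQUIRED_KEYS - OPTIONAL_KEYS)
    return missing, unrecognized
```

For example, a map that misspells the project key as `gcp.bigquery.project_id` is reported with `gcp.bigquery.project-id` missing and the misspelled key unrecognized.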
Limitations and operational notes
- Single identity: Because only IMPLICIT authentication is permitted, Polaris cannot mix multiple BigQuery identities in a single deployment (BigQueryMetastoreFederatedCatalogFactory rejects other auth types). Plan a deployment topology that aligns the Polaris process identity with the target project.
- Generic tables: The BigQuery extension exposes Iceberg tables registered in BigQuery Metastore. Generic table federation is not implemented (BigQueryMetastoreFederatedCatalogFactory#createGenericCatalog throws UnsupportedOperationException).
- Mixed table types in listings: By default listTables returns every table in a BigQuery dataset, including non-Iceberg tables. loadTable will return 404 for entries that are not Iceberg tables managed by BigQuery Metastore. Set gcp.bigquery.list-all-tables=false to filter.
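When listings are left unfiltered, a client that iterates over them needs to tolerate the 404 responses described in the last item. The following sketch shows one way to do that. It is client-agnostic: `TableNotFound` is a hypothetical stand-in for whatever exception a given client raises on a 404, and `load_table` is any callable supplied by the caller.

```python
class TableNotFound(Exception):
    """Hypothetical stand-in for a client's 'table not found' (HTTP 404) error."""


def partition_loadable(table_names, load_table):
    """Split listed names into those that load and those that return 404."""
    loadable, skipped = [], []
    for name in table_names:
        try:
            load_table(name)
        except TableNotFound:
            # Listed by BigQuery but not an Iceberg table managed by BigQuery Metastore.
            skipped.append(name)
        else:
            loadable.append(name)
    return loadable, skipped
```

This costs one load call per listed table, so setting gcp.bigquery.list-all-tables=false is usually preferable when the listing only needs to contain Iceberg tables.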