Getting Started with Apache Polaris and MinIO
⚠️ Warning
Disclaimer: This guide uses MinIO OSS for local testing only. MinIO OSS is in maintenance mode, and MinIO container images may no longer receive updates or security fixes.
Overview
This example uses MinIO as a storage provider with Polaris.
Spark is used as a query engine. This example assumes a local Spark installation. See the Spark Notebooks Example for a more advanced Spark setup.
Starting the Example
Build the Polaris server image if it’s not already present locally:
./gradlew \
  :polaris-server:assemble \
  :polaris-server:quarkusAppPartsBuild --rerun \
  -Dquarkus.container-image.build=true
Start the docker compose group by running the following command from the root of the repository:
docker compose -f site/content/guides/minio/docker-compose.yml up
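The containers can take a little while to accept connections after `docker compose up` starts. The following is a minimal sketch of a retry helper that could be used to wait for a service before moving on. The function name `retry_until` is hypothetical, and the commented example assumes that `curl` is installed and that MinIO serves its liveness check at `/minio/health/live` on the client endpoint `http://localhost:9000`.

```shell
# Retry a command until it succeeds or the attempt budget is exhausted.
# Usage: retry_until <max_attempts> <delay_seconds> <command> [args...]
# (hypothetical helper, not part of Polaris or MinIO)
retry_until() {
  local max_attempts="$1" delay="$2" attempt=1
  shift 2
  while ! "$@"; do
    # Give up once the last allowed attempt has failed.
    if [ "$attempt" -ge "$max_attempts" ]; then
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  return 0
}

# Example, to be run from a second terminal while the compose group starts
# (assumes curl and the MinIO liveness path mentioned above):
# retry_until 30 2 curl -fsS -o /dev/null http://localhost:9000/minio/health/live
```

The helper returns a non-zero status when the budget runs out, so it can gate later steps in a script with `&&`.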
Connecting From Spark
bin/spark-sql \
  --packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.10.1,org.apache.iceberg:iceberg-aws-bundle:1.10.1 \
  --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions \
  --conf spark.sql.catalog.polaris=org.apache.iceberg.spark.SparkCatalog \
  --conf spark.sql.catalog.polaris.type=rest \
  --conf spark.sql.catalog.polaris.uri=http://localhost:8181/api/catalog \
  --conf spark.sql.catalog.polaris.token-refresh-enabled=false \
  --conf spark.sql.catalog.polaris.warehouse=quickstart_catalog \
  --conf spark.sql.catalog.polaris.scope=PRINCIPAL_ROLE:ALL \
  --conf spark.sql.catalog.polaris.header.X-Iceberg-Access-Delegation=vended-credentials \
  --conf spark.sql.catalog.polaris.credential=root:s3cr3t \
  --conf spark.sql.catalog.polaris.client.region=irrelevant
Note: s3cr3t is defined as the password for the root user in the docker-compose.yml file.
Note: The client.region configuration is required for the AWS S3 client to work, but its value is not used in this example because MinIO does not require a specific region.
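To illustrate what the credential and scope options stand for, here is a minimal sketch of an OAuth client-credentials request made by hand instead of by Spark. It assumes that the token endpoint is `v1/oauth/tokens` under the catalog URI above and that `curl` is installed. The helper names (`credential_id`, `credential_secret`, `request_token`) are hypothetical.

```shell
# Split a "client_id:client_secret" credential, the format used by
# spark.sql.catalog.polaris.credential, into its two parts.
credential_id()     { printf '%s\n' "${1%%:*}"; }   # text before the first colon
credential_secret() { printf '%s\n' "${1#*:}"; }    # text after the first colon

# Request a bearer token (hypothetical helper; the endpoint path is an assumption).
# Usage: request_token <catalog_uri> <credential> <scope>
request_token() {
  local base_uri="$1" credential="$2" scope="$3"
  curl -s -X POST "${base_uri}/v1/oauth/tokens" \
    --data-urlencode "grant_type=client_credentials" \
    --data-urlencode "client_id=$(credential_id "$credential")" \
    --data-urlencode "client_secret=$(credential_secret "$credential")" \
    --data-urlencode "scope=${scope}"
}

# Example call, to be run only while the docker compose group is up:
# request_token http://localhost:8181/api/catalog root:s3cr3t PRINCIPAL_ROLE:ALL
```

Splitting on the first colon only means a secret that itself contains a colon would be passed through intact.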
Running Queries
Run inside the Spark SQL shell:
USE polaris;

CREATE NAMESPACE ns;

CREATE TABLE ns.t1 AS SELECT 'abc';

SELECT * FROM ns.t1;
-- abc
MinIO Endpoints
Note that the catalog configuration defined in the docker-compose.yml contains
different endpoints for the Polaris Server and the client (Spark). Specifically,
the client endpoint is http://localhost:9000, but endpointInternal is http://minio:9000.
This is necessary because clients running on localhost do not normally see service
names (such as minio) that are internal to the docker compose environment.
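The split between the two endpoints can be summarized as a small lookup. This is only an illustrative sketch: the function name `minio_endpoint_for` and the location labels `host` and `compose` are hypothetical, while the two URLs are the ones quoted above.

```shell
# Return the MinIO endpoint that is reachable from a given location.
#   host    = a process on localhost, such as the Spark client
#   compose = a container inside the docker compose network, such as Polaris
# (hypothetical helper for illustration only)
minio_endpoint_for() {
  case "$1" in
    host)    printf '%s\n' "http://localhost:9000" ;;
    compose) printf '%s\n' "http://minio:9000" ;;
    *)       printf 'unknown location: %s\n' "$1" >&2; return 1 ;;
  esac
}

# Show both endpoints side by side.
echo "client endpoint:   $(minio_endpoint_for host)"
echo "endpointInternal:  $(minio_endpoint_for compose)"
```

Both URLs point at the same MinIO service; only the host name differs depending on which network the caller sits in.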