Getting Started with Apache Polaris and Apache Flink

This getting started guide provides a docker-compose file to set up Apache Flink with Apache Polaris. Apache Polaris is configured as an Iceberg REST Catalog in Flink.

  1. Build the Polaris server image if it’s not already present locally:

    ./gradlew \
      :polaris-server:assemble \
      :polaris-server:quarkusAppPartsBuild --rerun \
      -Dquarkus.container-image.build=true
    
  2. Start the docker compose group by running the following command from the root of the repository:

    export S3_ENDPOINT=http://rustfs:9000
    docker compose -f site/content/guides/rustfs/docker-compose.yml -f site/content/guides/flink/docker-compose.yml up --build
    
  3. Open the Flink SQL client inside the running jobmanager container:

    docker exec -it $(docker ps -q --filter name=jobmanager) ./bin/sql-client.sh
    
  4. Register the Polaris catalog and run a few statements. The S3 endpoint and credentials point at the in-network rustfs service:

    CREATE CATALOG polaris WITH (
      'type'                 = 'iceberg',
      'catalog-impl'         = 'org.apache.iceberg.rest.RESTCatalog',
      'uri'                  = 'http://polaris:8181/api/catalog',
      'warehouse'            = 'quickstart_catalog',
      'credential'           = 'root:s3cr3t',
      'scope'                = 'PRINCIPAL_ROLE:ALL',
      'io-impl'              = 'org.apache.iceberg.aws.s3.S3FileIO',
      's3.endpoint'          = 'http://rustfs:9000',
      's3.path-style-access' = 'true',
      's3.access-key-id'     = 'rustfsadmin',
      's3.secret-access-key' = 'rustfsadmin'
    );

    USE CATALOG polaris;
    CREATE DATABASE ns1;
    USE ns1;

    CREATE TABLE table1 (id BIGINT, name STRING);

    SET 'execution.checkpointing.interval' = '10s';
    INSERT INTO table1 VALUES (1, 'a'), (2, 'b');

    SET 'sql-client.execution.result-mode' = 'tableau';
    SELECT * FROM table1;
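
The credential and scope properties in the catalog definition follow the Iceberg REST convention: credential is client_id:client_secret, and both values are sent to the catalog's OAuth2 token endpoint using the client-credentials grant. The sketch below splits the credential and defines a helper that would request a token with the same values. It assumes the compose files publish Polaris port 8181 on localhost, which this guide does not state; request_token is a hypothetical helper name, not part of Polaris or Flink.

```shell
#!/bin/sh
# Split the Iceberg REST 'credential' value (client_id:client_secret).
credential='root:s3cr3t'
client_id=${credential%%:*}      # text before the first colon
client_secret=${credential#*:}   # text after the first colon

# Hypothetical helper: request a token with the OAuth2 client-credentials
# grant. Assumption: Polaris port 8181 is reachable at localhost.
request_token() {
  curl -sS -X POST "http://localhost:8181/api/catalog/v1/oauth/tokens" \
    -d grant_type=client_credentials \
    -d "client_id=${client_id}" \
    -d "client_secret=${client_secret}" \
    -d scope=PRINCIPAL_ROLE:ALL
}

echo "client_id=${client_id}"
```

If the port assumption holds, calling request_token from the host should return a JSON body containing an access_token when the credentials are accepted. That gives a quick way to tell a credentials problem apart from a Flink configuration problem.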
    

Note: s3cr3t is defined as the password for the root user in the rustfs/docker-compose.yml file, and rustfsadmin is the RustFS access key from the same file. The Iceberg sink in Flink commits data when a checkpoint completes, so the inserted rows become visible only after that. Allow about 10 seconds (the checkpoint interval set above) between the INSERT and the SELECT.
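
When the steps above are scripted rather than typed by hand, fixed pauses can be replaced by polling for a condition. The following is a minimal sketch of a generic retry helper for a POSIX shell; wait_for is a hypothetical name introduced here for illustration.

```shell
#!/bin/sh
# wait_for ATTEMPTS DELAY CMD...
# Hypothetical helper: run CMD until it succeeds, at most ATTEMPTS times,
# sleeping DELAY seconds between tries. Returns 0 on success, 1 otherwise.
wait_for() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Sleep only when another attempt follows.
    [ "$i" -lt "$attempts" ] && sleep "$delay"
    i=$((i + 1))
  done
  return 1
}
```

For example, wait_for 30 2 sh -c 'docker ps -q --filter name=jobmanager | grep -q .' would wait up to about a minute for the jobmanager container to appear before the SQL client is opened. The filter is the same one used in the docker exec step.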

Useful URLs