Getting Started with Apache Polaris and Apache Flink

This guide provides a Docker Compose file that sets up Apache Flink together with Apache Polaris. Flink uses Apache Polaris as an Iceberg REST catalog.
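The steps below use the Gradle wrapper and Docker Compose, so it may help to confirm the tools are on the PATH first. The following is a minimal sketch, not part of the guide itself: require_cmd is a hypothetical helper name, and the list of tools to check (docker, java) is an assumption based on the commands used in the steps.

```shell
# require_cmd NAME: succeed if NAME is on PATH; otherwise report it on stderr.
# (Hypothetical helper for illustration, not something shipped with Polaris.)
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    return 0
  fi
  echo "missing required command: $1" >&2
  return 1
}

# Example usage (assumed tool list: the Gradle wrapper needs a JDK,
# and the compose files need Docker with the Compose plugin):
# require_cmd docker && require_cmd java && docker compose version
```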

  1. Build the Polaris server image if it’s not already present locally:

    ./gradlew \
       :polaris-server:assemble \
       :polaris-server:quarkusAppPartsBuild --rerun \
       -Dquarkus.container-image.build=true
    
  2. Start the docker compose group by running the following command from the root of the repository:

    export S3_ENDPOINT=http://rustfs:9000
    docker compose -f site/content/guides/rustfs/docker-compose.yml -f site/content/guides/flink/docker-compose.yml up --build
    
  3. Open the Flink SQL client inside the running jobmanager container:

    docker exec -it $(docker ps -q --filter name=jobmanager) ./bin/sql-client.sh
    
  4. Register the Polaris catalog and run a few statements. The S3 endpoint and credentials point at the in-network rustfs service:

    CREATE CATALOG polaris WITH (
      'type'                 = 'iceberg',
      'catalog-impl'         = 'org.apache.iceberg.rest.RESTCatalog',
      'uri'                  = 'http://polaris:8181/api/catalog',
      'warehouse'            = 'quickstart_catalog',
      'credential'           = 'root:s3cr3t',
      'scope'                = 'PRINCIPAL_ROLE:ALL',
      'io-impl'              = 'org.apache.iceberg.aws.s3.S3FileIO',
      's3.endpoint'          = 'http://rustfs:9000',
      's3.path-style-access' = 'true',
      's3.access-key-id'     = 'rustfsadmin',
      's3.secret-access-key' = 'rustfsadmin'
    );
    
    USE CATALOG polaris;
    CREATE DATABASE ns1;
    USE ns1;
    
    CREATE TABLE table1 (id BIGINT, name STRING);
    
    SET 'execution.checkpointing.interval' = '10s';
    INSERT INTO table1 VALUES (1, 'a'), (2, 'b');
    
    SET 'sql-client.execution.result-mode' = 'tableau';
    SELECT * FROM table1;
    

Note: s3cr3t is the password defined for the root user in the rustfs/docker-compose.yml file. rustfsadmin, used above as both the S3 access key and the secret key, is the RustFS credential from the same file. Iceberg makes the inserted rows visible only after a Flink checkpoint commits, so with the 10s checkpoint interval set above, allow about 10 seconds between the INSERT and the SELECT.
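The root credentials mentioned in the note can also be checked from outside Flink by requesting a token from Polaris. The sketch below assumes the Polaris port 8181 is published on localhost, which this guide does not state. POLARIS_URL, retry, and polaris_token are hypothetical names introduced for illustration.

```shell
# Assumption: Polaris is reachable from the host on port 8181.
POLARIS_URL="${POLARIS_URL:-http://localhost:8181}"

# retry ATTEMPTS DELAY_SECONDS COMMAND...
# Runs COMMAND until it succeeds or ATTEMPTS runs have failed.
retry() {
  retry_attempts="$1"
  retry_delay="$2"
  shift 2
  retry_i=1
  while [ "$retry_i" -le "$retry_attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Sleep between attempts, but not after the last one.
    if [ "$retry_i" -lt "$retry_attempts" ]; then
      sleep "$retry_delay"
    fi
    retry_i=$((retry_i + 1))
  done
  return 1
}

# Requests an OAuth token with the same credential and scope
# that the CREATE CATALOG statement passes to Iceberg.
polaris_token() {
  curl -sf -X POST "${POLARIS_URL}/api/catalog/v1/oauth/tokens" \
    -d 'grant_type=client_credentials' \
    -d 'client_id=root' \
    -d 'client_secret=s3cr3t' \
    -d 'scope=PRINCIPAL_ROLE:ALL'
}

# Example usage, once the compose group is running:
# retry 30 2 polaris_token >/dev/null && echo "Polaris accepted the root credentials"
```

If the call succeeds, the response is a JSON document containing an access token. A failure after all retries usually suggests that Polaris is not up yet or that the port is not published as assumed.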

Useful URLs