Add $snapshot_id hidden column to Iceberg connector #26164

@tdcmeehan

Expected Behavior or Use Case

Add a $snapshot_id hidden column to Iceberg tables that exposes the snapshot ID as a column, enabling snapshot-based filtering.

  -- Filter by specific snapshot
  SELECT * FROM my_table WHERE "$snapshot_id" = 1234567890;

  -- Join with snapshot metadata
  SELECT t.*, s.committed_at
  FROM my_table t
  JOIN "my_table$snapshots" s ON t."$snapshot_id" = s.snapshot_id;

Presto Component, Service, or Connector

Iceberg connector

Possible Implementation

Follow the existing pattern for $path:

  1. Add SNAPSHOT_ID to IcebergMetadataColumn
  2. Populate in IcebergPageSourceProvider from split's snapshot ID
  3. Implement as file-level metadata column (constant per split)
  4. (Not done for $path) Push down predicates such as $snapshot_id >= X in IcebergMetadata#getTableLayoutForConstraint (this maps to FOR VERSION AS OF (X)). Improve further as needed.
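
The steps above could be sketched roughly as follows. This is a minimal, self-contained illustration, not the connector's actual code: `SnapshotIdColumnSketch`, `HiddenColumn`, `Split`, `snapshotIdBlock`, and `resolveSnapshot` are hypothetical stand-ins for the real Presto classes, the file path is made up, and step 4 is simplified to an equality predicate on `$snapshot_id`.

```java
// Hypothetical sketch; none of these types are real Presto classes.
final class SnapshotIdColumnSketch {
    // Step 1: hidden metadata columns, modeled after the existing "$path" pattern.
    enum HiddenColumn {
        PATH("$path"),
        SNAPSHOT_ID("$snapshot_id");

        final String columnName;

        HiddenColumn(String columnName) {
            this.columnName = columnName;
        }
    }

    // A split carries the data file path and the snapshot it was planned against.
    static final class Split {
        final String path;
        final long snapshotId;

        Split(String path, long snapshotId) {
            this.path = path;
            this.snapshotId = snapshotId;
        }
    }

    // Steps 2-3: the value is constant per split, so every row in a page
    // produced from this split gets the same snapshot ID.
    static long[] snapshotIdBlock(Split split, int rowCount) {
        long[] block = new long[rowCount];
        java.util.Arrays.fill(block, split.snapshotId);
        return block;
    }

    // Step 4 (simplified assumption): an equality constraint on "$snapshot_id"
    // pins the snapshot to scan; with no constraint, the current snapshot is used.
    static long resolveSnapshot(long currentSnapshotId, Long equalityConstraint) {
        return equalityConstraint == null ? currentSnapshotId : equalityConstraint;
    }

    public static void main(String[] args) {
        Split split = new Split("data/file-0001.parquet", 1234567890L);
        long[] block = snapshotIdBlock(split, 3);
        System.out.println(HiddenColumn.SNAPSHOT_ID.columnName + " -> "
                + java.util.Arrays.toString(block));
        System.out.println("snapshot to scan: " + resolveSnapshot(42L, 1234567890L));
    }
}
```

Because the value is constant per split, the page source never needs to read it from the data file, which is the same reason `$path` is cheap to expose.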

Context

This would be an undocumented hidden column, since a user-supplied predicate on it could lead to strange results. Its intended consumer is the optimizer: exposing the snapshot ID as a virtual column lets the optimizer create a predicate on that column, which enables optimization techniques that leverage Iceberg's incremental scans.
