Description
Expected Behavior or Use Case
Add a $snapshot_id hidden column to Iceberg tables that exposes the snapshot ID as a column, enabling snapshot-based filtering.
-- Filter by specific snapshot
SELECT * FROM my_table WHERE "$snapshot_id" = 1234567890;
-- Join with snapshot metadata
SELECT t.*, s.committed_at
FROM my_table t
JOIN "my_table$snapshots" s ON t."$snapshot_id" = s.snapshot_id;
Presto Component, Service, or Connector
Iceberg connector
Possible Implementation
Follow the existing pattern for $path:
- Add SNAPSHOT_ID to IcebergMetadataColumn
- Populate in IcebergPageSourceProvider from the split's snapshot ID
- Implement as a file-level metadata column (constant per split)
- (Not done for $path) Push down in IcebergMetadata#getTableLayoutForConstraint for predicates like $snapshot_id >= X (this maps to FOR VERSION AS OF (X)). Improve further as needed.
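To illustrate the "constant per split" idea, here is a minimal standalone sketch. It is not Presto code: FileSplit, pruneSplits and snapshotIdColumn are hypothetical names made up for this example, and the real change would go through IcebergMetadataColumn and IcebergPageSourceProvider as listed above. The point is only that a predicate on a per-split constant can keep or drop a whole split before any rows are read, and that populating the column is just repeating one value.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.LongPredicate;
import java.util.stream.Collectors;

public class SnapshotIdColumnSketch {
    // Hypothetical stand-in for a split: one data file plus the snapshot it was planned from.
    static final class FileSplit {
        final String path;
        final long snapshotId;
        final int rowCount;

        FileSplit(String path, long snapshotId, int rowCount) {
            this.path = path;
            this.snapshotId = snapshotId;
            this.rowCount = rowCount;
        }
    }

    // Because the column value is constant per split, a predicate on it
    // decides the fate of the whole split without touching row data.
    static List<FileSplit> pruneSplits(List<FileSplit> splits, LongPredicate snapshotIdPredicate) {
        return splits.stream()
                .filter(split -> snapshotIdPredicate.test(split.snapshotId))
                .collect(Collectors.toList());
    }

    // Populating the hidden column means emitting the same value for every row of the split.
    static long[] snapshotIdColumn(FileSplit split) {
        long[] values = new long[split.rowCount];
        Arrays.fill(values, split.snapshotId);
        return values;
    }

    public static void main(String[] args) {
        List<FileSplit> splits = Arrays.asList(
                new FileSplit("data/a.parquet", 1234567890L, 3),
                new FileSplit("data/b.parquet", 1234567999L, 2));

        // Equivalent in spirit to: WHERE "$snapshot_id" = 1234567890
        List<FileSplit> kept = pruneSplits(splits, id -> id == 1234567890L);
        for (FileSplit split : kept) {
            System.out.println(split.path + " -> " + Arrays.toString(snapshotIdColumn(split)));
        }
    }
}
```

In the connector, the pruning half would presumably live in the pushdown step and the fill half in the page source, but that split of responsibilities is an assumption based on the list above.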
Example Screenshots (if appropriate):
Context
This would be a hidden, undocumented column, since it could lead to strange results if a user supplied it directly. It would be used by the optimizer to enable techniques that leverage Iceberg's incremental scans: once the snapshot ID exists as a virtual column, a predicate can be expressed on it.
PingLiuPing