feat(postgres): add partitioned table support #297
base: main
e65c749 to 230f109
A few nits:
- We generally use only lowercase queries (this is a standard we adopt at Supabase).
- Let's first merge the PR related to PG 14 support and then rebase this one on top of it to have a clean diff.
etl/src/replication/client.rs
Outdated
"select schemaname, tablename from pg_publication_tables where pubname = {};",
"select pt.schemaname, pt.tablename from pg_publication_tables pt
join pg_class c on c.relname = pt.tablename
join pg_namespace n on n.oid = c.relnamespace AND n.nspname = pt.schemaname
Is there a need to perform this JOIN? It seems we don't really need to validate namespaces.
There wasn't, no. I think this was leftover code that I included attempting to fix the race issue I encountered. I cannot replicate it anymore.
4670481 to c983ec9
This MR has mostly been rewritten, as I discovered there wasn't an easy way to capture new partitions being created. I did discover
Tests are broken again on the partial branch. Can be solved by this patch: uni-intelligence@c0ec81f
96cc8de to 6522d79
67cc277 to 4893b04
4893b04 to 6d5132b
Signed-off-by: Abhi Agarwal <abhi@airspace-intelligence.com>
bf3af88 to 3524085
9b70c19 to 82436de
This is ready for review! Note that I did cherry-pick #361 on top of this since I was running non-bigquery tests open, but happy to revert if desired.
Thanks for the PR. I will go through it once I have time. We are currently busy with other work, so it might take a while.
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
pub_tables as (
select r.prrelid as oid
from pg_publication_rel r
join pg_publication p on p.oid = r.prpubid
where p.pubname = {pub} and (select has from has_rel)
union all
select c.oid
from pg_publication_tables pt
join pg_class c on c.relname = pt.tablename
where pt.pubname = {pub} and not (select has from has_rel)
[P1] Join publication tables without schema
The new recursive query in PgReplicationClient::get_publication_table_ids now joins pg_publication_tables to pg_class only on relname. If a publication includes two tables with the same name in different schemas (e.g. sales.users and archive.users), this join can resolve to the wrong OID and the client will subscribe to or copy data from the wrong table. The previous implementation joined on both schema and table name and avoided this ambiguity.
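To make the ambiguity concrete, here is a small in-memory sketch (not the PR's code) of what a name-only match does compared to a schema-qualified one. The `CatalogEntry` type, the helper functions, and the OIDs are all hypothetical stand-ins for rows of pg_class joined with pg_namespace; only the sales.users / archive.users example comes from the review comment.

```rust
/// One row of a toy catalog, standing in for the columns of `pg_class` and
/// `pg_namespace` that matter here. All names and OIDs are made up.
#[derive(Debug, Clone, PartialEq)]
pub struct CatalogEntry {
    pub oid: u32,
    pub schema: &'static str,
    pub name: &'static str,
}

/// Name-only lookup, analogous to `join pg_class c on c.relname = pt.tablename`.
/// Every relation with that name matches, whatever its schema.
pub fn oids_by_name(catalog: &[CatalogEntry], table: &str) -> Vec<u32> {
    catalog
        .iter()
        .filter(|e| e.name == table)
        .map(|e| e.oid)
        .collect()
}

/// Schema-qualified lookup: the namespace must match as well, so at most one
/// relation can be returned for a given (schema, table) pair.
pub fn oids_by_schema_and_name(catalog: &[CatalogEntry], schema: &str, table: &str) -> Vec<u32> {
    catalog
        .iter()
        .filter(|e| e.schema == schema && e.name == table)
        .map(|e| e.oid)
        .collect()
}

/// Hypothetical catalog with two `users` tables in different schemas.
pub fn sample_catalog() -> Vec<CatalogEntry> {
    vec![
        CatalogEntry { oid: 16384, schema: "sales", name: "users" },
        CatalogEntry { oid: 16390, schema: "archive", name: "users" },
        CatalogEntry { oid: 16400, schema: "sales", name: "orders" },
    ]
}
```

With this sample data, looking up `users` by name alone yields both OIDs, while the schema-qualified lookup yields exactly one. In SQL terms, that corresponds to also joining pg_namespace on `n.oid = c.relnamespace and n.nspname = pt.schemaname`, as the earlier revision of the query did.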
What kind of change does this PR introduce?
Adds support for partitioned tables. It does this through a very convoluted SQL query that I guess-and-checked until it worked. The query uses a bunch of heuristics to confirm that if a table is a child table of a partitioned table, it's allowed to treat their PKs as its own.
What is the current behavior?
Fixes #296
What is the new behavior?
Partitioned tables are treated as a single table, and replicate as one unit.
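One way to picture "replicate as one unit" is to map every partition back to its topmost parent and group by that root. The sketch below is only an illustration and not the query used in this PR: it assumes the child-to-parent relationship has already been loaded into a map (the shape of pg_inherits, inhrelid to inhparent, with a single parent per partition), and `partition_root` and `group_by_root` are hypothetical helper names.

```rust
use std::collections::{BTreeMap, HashMap};

/// Walks a child -> parent map up to the topmost ancestor.
/// A table with no entry in the map is its own root.
pub fn partition_root(parents: &HashMap<u32, u32>, oid: u32) -> u32 {
    let mut current = oid;
    // An acyclic chain has at most `parents.len()` hops, so bounding the walk
    // guarantees termination even if the map is malformed and contains a cycle.
    for _ in 0..=parents.len() {
        match parents.get(&current) {
            Some(&parent) => current = parent,
            None => return current,
        }
    }
    current
}

/// Groups table OIDs by their partition root, so that each group can be
/// handled as a single logical table.
pub fn group_by_root(parents: &HashMap<u32, u32>, oids: &[u32]) -> BTreeMap<u32, Vec<u32>> {
    let mut groups: BTreeMap<u32, Vec<u32>> = BTreeMap::new();
    for &oid in oids {
        groups
            .entry(partition_root(parents, oid))
            .or_default()
            .push(oid);
    }
    groups
}
```

For example, with partitions 101 and 102 under root 100, and 103 as a sub-partition of 102, both 101 and 103 resolve to 100, while an unrelated table 200 stays in its own group.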
Additional context
Note that this is stacked on top of my other MR because I found this bug after fixing that bug. I will also note that I happened to find a race condition in this PR where a replicate worker tries to push events while a sync worker is waiting for a schema. I solved this by simply blocking all workers until all schemas are fixed, but this is a suboptimal solution, to say the least.
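The "block all workers until all schemas are fixed" workaround can be sketched as a simple gate. This is an assumption-laden illustration rather than the code in this PR: `SchemaGate` and its methods are hypothetical names, and it uses standard-library threads and a condition variable, whereas the real workers may well be async tasks.

```rust
use std::sync::{Condvar, Mutex};

/// Gate that stays closed until every expected table schema has been stored.
pub struct SchemaGate {
    /// Number of schemas still outstanding.
    pending: Mutex<usize>,
    /// Signalled once `pending` reaches zero.
    all_ready: Condvar,
}

impl SchemaGate {
    /// Creates a gate that waits for `pending` schemas.
    pub fn new(pending: usize) -> Self {
        SchemaGate {
            pending: Mutex::new(pending),
            all_ready: Condvar::new(),
        }
    }

    /// Called by a sync worker once it has stored one table schema.
    pub fn schema_ready(&self) {
        let mut pending = self.pending.lock().unwrap();
        *pending = pending.saturating_sub(1);
        if *pending == 0 {
            // Wake every worker blocked in `wait_until_ready`.
            self.all_ready.notify_all();
        }
    }

    /// Called by a worker before it pushes events; blocks until all schemas exist.
    pub fn wait_until_ready(&self) {
        let mut pending = self.pending.lock().unwrap();
        // Loop to guard against spurious wakeups.
        while *pending > 0 {
            pending = self.all_ready.wait(pending).unwrap();
        }
    }

    /// Non-blocking check of whether the gate has opened.
    pub fn is_open(&self) -> bool {
        *self.pending.lock().unwrap() == 0
    }
}
```

The trade-off matches the one described above: a single slow schema fetch stalls every worker, including those whose own table is already ready. A per-table gate would avoid that at the cost of more bookkeeping.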