From Postgres CDC to BigQuery with materialized views

I'm designing a Change Data Capture (CDC) pipeline from our Postgres database to BigQuery.

I came across what seems like a powerful combination of two features: streaming CDC records into an append-only BigQuery table, and a BigQuery materialized view on top of it that keeps only the latest version of each row.

My intuition tells me that by combining these two features, I should be able to get always up-to-date, incrementally synced tables with the least amount of engineering.

However, the lack of documentation from Google (or documentation that suggests different approaches) makes me skeptical that the above approach would work.

So yes, the idea would be to have a materialized view such as the following,

given a table composed of the columns (id, version, json_values, soft_delete):

CREATE MATERIALIZED VIEW myTable_latest AS  -- view name is a placeholder; dataset qualifiers omitted
SELECT *
FROM myTable
WHERE CONCAT(id, version) IN (
  SELECT CONCAT(id, MAX(version))
  FROM myTable
  GROUP BY id
);
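
For context, a minimal sketch of the append-only change-log table that the CDC pipeline would feed is shown below; the column types, the JSON payload, and the sample row are assumptions for illustration only, not part of the actual schema:

-- Append-only change log written by the CDC pipeline
-- (types and the sample row are assumed; dataset qualifier omitted)
CREATE TABLE myTable (
  id STRING,          -- primary key of the source Postgres row
  version INT64,      -- increases with every captured change of that row
  json_values JSON,   -- full row payload captured from Postgres
  soft_delete BOOL    -- TRUE when the row was deleted in Postgres
);

-- Every change is appended as a new row; nothing is updated in place,
-- so the materialized view above is what collapses the history down to
-- the latest version per id.
INSERT INTO myTable (id, version, json_values, soft_delete)
VALUES ('42', 7, JSON '{"name": "example"}', FALSE);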

Would this be extremely inefficient/not work?
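
To make the intended semantics explicit: the WHERE clause above is meant to select, for each id, only the row with the highest version. As a plain query (leaving aside whether a materialized view definition would accept it), the same selection can be written with a window function; this is only an illustrative equivalent, not part of the proposed setup:

-- Equivalent "latest version per id" selection, written with a window function
SELECT * EXCEPT (rn)
FROM (
  SELECT
    *,
    ROW_NUMBER() OVER (PARTITION BY id ORDER BY version DESC) AS rn
  FROM myTable
)
WHERE rn = 1;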

System size:

  • about 1000 tables,
  • with 1M to 500M rows each.
  • 10-50M updates/day total across all tables

Expected cost:

  • 5-15k/month

