Google Cloud Pub/Sub to BigQuery template topic subscription
When creating a new Dataflow job from the Pub/Sub to BigQuery template, it is possible to specify the Pub/Sub topic, but there appears to be no way to provide an existing Pub/Sub subscription; instead, the Dataflow template creates a new subscription each time it runs.
As far as I understand the Pub/Sub model, the only way to ensure we keep reading data from the same position in the topic is to reuse the same subscription, and there seems to be no such option here.
What happens when a user wants to re-deploy such a Dataflow template? Will we lose all the data published between deployments?
Solution 1:[1]
As an update, there is now a template for this very use case: PubSub_Subscription_to_BigQuery, which reads from an existing subscription instead of creating a new one.
gcloud dataflow jobs run $jobname \
--project=$project \
--disable-public-ips \
--gcs-location gs://dataflow-templates-$location/latest/PubSub_Subscription_to_BigQuery \
--worker-machine-type n1-standard-1 \
--region $location \
--staging-location gs://$bucket/pss-to-bq \
--parameters inputSubscription=projects/$project/subscriptions/$subscription,outputTableSpec=$dataset.$table
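Since the subscription is created outside the Dataflow job, it persists across deployments and buffers messages while no pipeline is attached. A minimal sketch of the setup, assuming the same `$project`, `$topic`, and `$subscription` variables as above (the values themselves are placeholders):

```shell
# Create the subscription once, independently of any Dataflow job.
# Messages published while no pipeline is running are retained on the
# subscription, up to its message retention window (7 days by default),
# so a re-deployed job resumes from where the previous one left off.
gcloud pubsub subscriptions create $subscription \
  --project=$project \
  --topic=$topic \
  --ack-deadline=60
```

Passing this pre-existing subscription as `inputSubscription` (as in the command above) means a re-deployed job acknowledges messages against the same subscription, so no data published between deployments is lost, as long as the gap stays within the retention window.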
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | slve |
