Creating a template for DataFlow throws the error AttributeError: 'RuntimeValueProvider' object has no attribute 'tableId'
I have built a pipeline with Apache Beam that runs successfully with the Dataflow runner. I'm now trying to create a template from it, but when I pass a RuntimeValueProvider to the apache_beam.io.gcp.bigquery.WriteToBigQuery transform, the following error is thrown:
AttributeError: 'RuntimeValueProvider' object has no attribute 'tableId'.
The code (excerpt) looks as follows:
class ProcessOptions(PipelineOptions):
    @classmethod
    def _add_argparse_args(cls, parser):
        parser.add_value_provider_argument(
            '--output_gcs',
            dest='output_gcs',
            default='gs://nlp-text-classification/results/stackoverflow_template',
            type=str,
            required=False,
            help='Google Cloud Storage Path.')
        parser.add_value_provider_argument(
            '--output_bq_table',
            dest='output_bq_table',
            default='nlp-text-classification:stackoverflow.template_test',
            type=str,
            required=False,
            help='BigQuery table.')

process_options = options.view_as(ProcessOptions)
with beam.Pipeline(options=options) as p:
    [...]
    "Write Posts to BigQuery" >> beam.io.WriteToBigQuery(table=process_options.output_bq_table,
                                                         schema=table_schema)
    [...]
Is this a bug or am I doing something wrong?
Solution 1:[1]
Instead of 'Write To BigQuery' >> beam.io.Write(beam.io.BigQuerySink(..)), I used beam.io.WriteToBigQuery(..) and also passed --experiment=use_beam_bq_sink. That worked for me, and I no longer get the error AttributeError: 'RuntimeValueProvider' object has no attribute 'datasetId'.
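A likely reason this helps, as far as I understand it: a template parameter has no value while the template is being built, so any sink that reads fields such as tableId or datasetId at pipeline construction time will fail, whereas a sink that only calls .get() when the job runs will not. The sketch below is a toy model of that difference in plain Python. ToyRuntimeValue, parse_table_spec, eager_sink and deferred_sink are hypothetical names made up for illustration; they are not Beam classes or functions and do not show how Beam is implemented internally.

```python
# Toy model (NOT Beam's real classes): shows why a deferred template
# parameter fails when code treats it like an already-parsed table reference.


class ToyRuntimeValue:
    """Hypothetical stand-in for a template parameter resolved at run time."""

    def __init__(self, name):
        self.name = name
        self._value = None  # unknown while the template is being built

    def set_at_runtime(self, value):
        self._value = value

    def get(self):
        if self._value is None:
            raise RuntimeError(f"{self.name} is not available until run time")
        return self._value


def parse_table_spec(spec):
    """Split 'project:dataset.table' into its three parts."""
    project, _, rest = spec.partition(":")
    dataset, _, table = rest.partition(".")
    return {"projectId": project, "datasetId": dataset, "tableId": table}


def eager_sink(table):
    # Construction-time access: assumes `table` is already a parsed reference.
    return table.tableId


def deferred_sink(table):
    # Returns a function so that .get() is only called when the job runs.
    return lambda: parse_table_spec(table.get())["tableId"]


table_param = ToyRuntimeValue("output_bq_table")

# Template-build time: the eager sink touches an attribute that does not exist.
try:
    eager_sink(table_param)
    eager_error = None
except AttributeError as exc:
    eager_error = str(exc)

# Template-build time: the deferred sink only stores the parameter.
resolve = deferred_sink(table_param)

# Run time: the parameter now has a value, so it can be parsed.
table_param.set_at_runtime("nlp-text-classification:stackoverflow.template_test")
resolved_table = resolve()

print(eager_error)
print(resolved_table)
```

In this toy model the eager path raises the same kind of AttributeError as in the question, while the deferred path resolves the table name once a value has been supplied.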
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Sohinee Saha |
