Unable to create Dataflow job via Dataflow console: Found unexpected parameters: ['key' (perhaps you meant 'region')]
I have created a Dataflow job which accepts an input parameter called `key`. I created the template with the following command:

```sh
--template_location=gs://mm_dataflow_bucket/templates/marketstats_template \
--staging_location=gs://mm_dataflow_bucket/staging \
--setup ./setup.py \
--key <my key>
```
I then created the following metadata:

```json
{
  "description": "An example pipeline that collects market stats.",
  "name": "Market Stats",
  "parameters": [
    {
      "name": "key",
      "helpText": "IEX API",
      "label": "IEXAPI"
    }
  ]
}
```
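For the console to accept `key` at run time, the pipeline itself also has to declare `key` as a ValueProvider argument on a `PipelineOptions` subclass (this turns out to be the crux of Solution 1 below). A minimal sketch of what that declaration looks like; the class name `MarketStatsOptions` is a hypothetical stand-in:

```python
from apache_beam.options.pipeline_options import PipelineOptions


class MarketStatsOptions(PipelineOptions):
    # Hypothetical class name; what matters is extending PipelineOptions.
    @classmethod
    def _add_argparse_args(cls, parser):
        # add_value_provider_argument defers resolution of --key until
        # the template is executed, instead of when it is created.
        parser.add_value_provider_argument(
            '--key',
            help='IEX API key',
        )
```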
However, when I attempt to create the Dataflow job via the console, I get this error back:

```
(cf16dcf1ac642eb2): The workflow could not be created. Causes: (cf16dcf1ac64261b): Found unexpected parameters: ['key' (perhaps you meant 'region')]
```
This seems to have started this week; the templates I created in the past week all accept parameters and worked fine. Can anyone assist?

Kind regards
Solution 1:[1]
My custom args class was extending the wrong class: I was extending StandardOptions, whereas I should have been extending PipelineOptions.

Their argument-parsing code seems like a giant, confusing hack, and I am very angry about it after losing six hours of my life.
More details:
I had

```python
class ReviewProcessingOptions(StandardOptions):
    @classmethod
    def _add_argparse_args(cls, parser):
        parser.add_value_provider_argument(
            '--reports',
            dest='reports',
            help='report ids that are to be processed (comma separated)',
        )
    ...
```
where I should have had

```python
class ReviewProcessingOptions(PipelineOptions):
    @classmethod
    def _add_argparse_args(cls, parser):
        parser.add_value_provider_argument(
            '--reports',
            dest='reports',
            help='report ids that are to be processed (comma separated)',
        )
    ...
```
I was using the runner param in my start script to do some logic and needed access to it. Thinking "ah, StandardOptions is a child of PipelineOptions, I should be fine", I extended that class. But it does not work.
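If you need the runner (or any other standard flag) in your launch script, you do not have to subclass StandardOptions; you can borrow it through `view_as`. A minimal sketch, assuming the corrected `ReviewProcessingOptions` class above:

```python
from apache_beam.options.pipeline_options import (
    PipelineOptions,
    StandardOptions,
)

# Parse the command line once; view_as() exposes different "views"
# of the same underlying options object.
options = PipelineOptions()

# ReviewProcessingOptions is the PipelineOptions subclass defined above.
custom_opts = options.view_as(ReviewProcessingOptions)
runner = options.view_as(StandardOptions).runner

if runner == 'DataflowRunner':
    pass  # runner-specific logic in the start script
```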
The first thing to check is the template file that is uploaded to storage (it's a giant JSON file). It should have the variables defined. The variables only started showing up once I extended PipelineOptions.
Here is a snippet from mine:

```json
{"key": "reports", "namespace": "apache_beam.options.pipeline_options.GoogleCloudOptions", "type": "STRING", "value": "RuntimeValueProvider(option: reports, type: str, default_value: None)"}
```
Solution 2:[2]
I understand that you are using the `gcloud dataflow jobs run` command. All the flags allowed for that command are listed in the reference documentation: https://cloud.google.com/sdk/gcloud/reference/dataflow/jobs/run

There isn't any flag that resembles what you specified as "key". Can you please explain further what kind of parameters you want to pass in that flag?
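For reference, custom template parameters such as `key` are normally passed through the `--parameters` flag rather than as top-level flags. A minimal sketch of the invocation, with the job name being illustrative and the template path taken from the question:

```sh
gcloud dataflow jobs run marketstats-job \
  --gcs-location gs://mm_dataflow_bucket/templates/marketstats_template \
  --region us-central1 \
  --parameters key=<my key>
```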
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | mlablablab |
| Solution 2 | |
