Apache Beam for data validation, GCP
I need to validate incoming CSV data against metadata that is stored in a BigQuery table. I am running the Dataflow pipeline locally to do this, but I cannot work out how to implement the validation logic with Apache Beam transforms.
Note: I have seen some code that uses ParDo with a side input. Is this the right approach?
Validation required: Before loading the data into a BigQuery table, I have to check each record against the metadata. Records that pass validation should then be inserted into the BQ table.
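The check described above could be sketched as a plain Python function, independent of Beam. The metadata layout (a list of dicts with `name`, `type`, and `nullable` keys), the `CASTERS` table, and the function name `validate_row` are all assumptions made for illustration; the real metadata table may be shaped differently.

```python
from typing import Dict, List

# Hypothetical mapping from a metadata type name to a parser that raises
# ValueError when the raw CSV string does not fit that type.
CASTERS = {
    "STRING": str,
    "INTEGER": int,
    "FLOAT": float,
}


def validate_row(row: Dict[str, str], metadata: List[dict]) -> List[str]:
    """Return a list of error messages; an empty list means the row is valid."""
    errors = []
    for column in metadata:
        name = column["name"]
        value = row.get(name)
        # Treat a missing key and an empty CSV field the same way.
        if value is None or value == "":
            if not column.get("nullable", True):
                errors.append(f"{name}: missing required value")
            continue
        caster = CASTERS.get(column["type"])
        if caster is None:
            errors.append(f"{name}: unsupported type {column['type']}")
            continue
        try:
            caster(value)
        except ValueError:
            errors.append(f"{name}: {value!r} is not a valid {column['type']}")
    return errors
```

Keeping the rule in a plain function like this makes it easy to unit test without a pipeline; a DoFn can then call it once per element.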
Sample code:
"Process Data into TableRows" >> beam.ParDo(
    ProcessDF(), beam.pvalue.AsList(metadata_for_validation))
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
