How to load a large CSV file into BigQuery using a pandas DataFrame in Cloud Run?
I have CSV files in GCS that I want to load into BigQuery. I'm using pandas to ingest the files into BigQuery, but the files are large (10 GB), and I use Cloud Run to execute the job:
```python
import pandas as pd
from google.cloud import bigquery

client = bigquery.Client()

# Read the whole file into memory, then run the load job
df = pd.read_csv(uri, sep=delimiter, dtype=str)
load_job = client.load_table_from_dataframe(df, table)
```
I always get this error:

```
Memory limit of 512M exceeded with 519M used. Consider increasing the memory limit
```
How do I choose the right amount of memory for my Cloud Run service? Can I load the data into BigQuery in DataFrame chunks? Thanks.
Solution 1:[1]
The bad idea is to increase the Cloud Run memory: it isn't scalable.
The good idea is to use the BigQuery CSV import feature, which loads the file directly from GCS without it ever passing through your service's memory. If you have transforms to perform on your data, you can run a query just after the load to apply them in SQL, as in the sketch below.
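A minimal sketch of that approach with the `google-cloud-bigquery` client, assuming placeholder bucket, dataset, and table names that you would replace with your own:

```python
from google.cloud import bigquery

client = bigquery.Client()

# Placeholder names for illustration only.
uri = "gs://my-bucket/large-file.csv"
table_id = "my-project.my_dataset.raw_table"

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,   # skip the header row
    field_delimiter=",",   # match your file's delimiter
    autodetect=True,       # or pass an explicit schema
    write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,
)

# BigQuery reads the file straight from GCS; the 10 GB never
# goes through Cloud Run's memory.
load_job = client.load_table_from_uri(uri, table_id, job_config=job_config)
load_job.result()  # wait for the load to finish

# Optional: apply transforms in SQL right after the load.
query = """
CREATE OR REPLACE TABLE `my-project.my_dataset.clean_table` AS
SELECT *
FROM `my-project.my_dataset.raw_table`
"""
client.query(query).result()
```

With this pattern the Cloud Run service only orchestrates the load and query jobs, so the default 512 MB of memory is more than enough regardless of file size.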
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | guillaume blaquiere |
