How to load a large CSV file into BigQuery using a pandas DataFrame in Cloud Run?

I have CSV files in GCS that I want to load into BigQuery. I'm using pandas to ingest the files, but they are large (~10 GB), and I use Cloud Run to execute the job:

    import pandas as pd
    from google.cloud import bigquery

    client = bigquery.Client()
    df = pd.read_csv(uri, sep=delimiter, dtype=str)  # reads the whole file into memory

    # Run the load job
    load_job = client.load_table_from_dataframe(df, table)

The job always fails with:

    Memory limit of 512M exceeded with 519M used. Consider increasing the memory limit

How do I choose the right memory size for my Cloud Run service? And can I load the data into BigQuery in chunks of the DataFrame instead of all at once? Thanks.



Solution 1:[1]

The bad idea is to increase the Cloud Run memory: it doesn't scale, because pandas needs to hold the whole parsed file in RAM.

The good idea is to use BigQuery's native CSV import feature, which loads the file directly from GCS without going through pandas (or Cloud Run memory) at all. If you have transforms to perform on your data, you can run a SQL query just afterwards to apply them.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 guillaume blaquiere