Cannot upload large file to Google Cloud Storage

Uploading works fine for small files, but fails when I try to upload large ones. I'm using the Python client. The snippet is:

import os

from google.cloud import storage

filename = 'my_csv.csv'
storage_client = storage.Client()
bucket_name = os.environ["GOOGLE_STORAGE_BUCKET"]
bucket = storage_client.get_bucket(bucket_name)
blob = bucket.blob(filename)
blob.upload_from_filename(filename)  # file size is 500 MB

The only thing I get as a traceback is "Killed", and I'm dropped out of the Python interpreter.

Any suggestions are highly appreciated.

Edit: it works fine from my local machine. My application runs in Google Container Engine, and the problem occurs there when the upload runs in a Celery task.



Solution 1:[1]

upload_from_filename attempts to upload the entire file in a single request.

You can set Blob.chunk_size to spread the upload across many requests, each responsible for uploading one "chunk" of your file. The value must be a multiple of 262144 bytes (256 KiB).

For example:

my_blob.chunk_size = 1024 * 1024 * 10

Solution 2:[2]

I find the examples in the accepted answer a bit difficult to follow. (No doubt it is very professionally written.)

The following made it much easier for me. Sharing in case it helps others as well.

from google.cloud import storage


def upload_file_to_gcp_bucket(service_account_json,
                              bucket_name,
                              file_to_upload,
                              file_name_in_gcp):
    CHUNK_SIZE = 262144  # must be a multiple of 262144 (256 KiB)

    storage_client = storage.Client.from_service_account_json(service_account_json)

    # create a bucket object
    bucket = storage_client.get_bucket(bucket_name)
    blob = bucket.blob(file_name_in_gcp, chunk_size=CHUNK_SIZE)
    blob.upload_from_filename(file_to_upload)
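One practical note: 262144 bytes (256 KiB) is the smallest allowed chunk, so a 500 MB file would be sent in about 2000 requests; a larger multiple of 262144 (e.g. 10 MiB) trades memory for fewer round trips. A quick back-of-envelope check:

```python
# 500 MB file split into minimum-size 256 KiB chunks
file_size = 500 * 1024 * 1024  # 524288000 bytes
chunk = 262144                 # 256 KiB
print(file_size // chunk)      # → 2000
```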

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution 1: mepler
Solution 2: edn