Cannot upload large file to Google Cloud Storage
Uploading works fine for small files; it only fails when I try to upload large ones. I'm using the Python client. The snippet is:
import os

from google.cloud import storage

filename = "my_csv.csv"
storage_client = storage.Client()
bucket_name = os.environ["GOOGLE_STORAGE_BUCKET"]
bucket = storage_client.get_bucket(bucket_name)
blob = bucket.blob(filename)
blob.upload_from_filename(filename)  # file size is 500 MB
The only thing I get as a traceback is "Killed", and I'm dropped out of the Python interpreter.
Any suggestions are highly appreciated.
Edit: It works fine from my local machine. My application runs in Google Container Engine, so the problem occurs there, when the upload runs inside a celery task.
Solution 1:[1]
upload_from_filename attempts to upload the entire file in a single request.
You can set Blob.chunk_size to spread the upload across many requests, each responsible for uploading one "chunk" of your file.
For example:
my_blob.chunk_size = 1024 * 1024 * 10  # 10 MiB; must be a multiple of 262144 (256 KiB)
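Since chunk_size must be a multiple of 256 KiB (262144 bytes), it can be handy to derive a valid value from whatever size you want. A minimal sketch, where round_chunk_size is my own helper and not part of the google-cloud-storage API:

```python
# Blob.chunk_size must be a multiple of 256 KiB (262144 bytes).
# round_chunk_size (a hypothetical helper, not from the library) rounds a
# desired size up to the nearest valid value; you would then assign the
# result (my_blob.chunk_size = ...) before calling upload_from_filename().

CHUNK_UNIT = 262144  # 256 KiB, the required granularity for Blob.chunk_size

def round_chunk_size(desired_bytes: int) -> int:
    """Round desired_bytes up to the nearest multiple of 256 KiB (minimum one unit)."""
    units = max(1, -(-desired_bytes // CHUNK_UNIT))  # ceiling division
    return units * CHUNK_UNIT

print(round_chunk_size(10 * 1024 * 1024))  # 10 MiB is already a valid multiple
```

A smaller chunk size lowers peak memory per request (relevant when the container is being OOM-killed) at the cost of more round trips.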
Solution 2:[2]
I kind of find the examples in the accepted answer a bit difficult to follow. (No doubt it is very professionally written.)
The following made it much easier for me. Sharing in case it helps others as well.
from google.cloud import storage

def upload_file_to_gcp_bucket(service_account_json,
                              bucket_name,
                              file_to_upload,
                              file_name_in_gcp):
    CHUNK_SIZE = 262144  # this needs to be a multiple of 262144 (256 KiB)
    storage_client = storage.Client.from_service_account_json(service_account_json)
    # create a bucket object
    bucket = storage_client.get_bucket(bucket_name)
    blob = bucket.blob(file_name_in_gcp, chunk_size=CHUNK_SIZE)
    blob.upload_from_filename(file_to_upload)
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | mepler |
| Solution 2 | edn |
