How can I decompress a 2 GB+ file using Python on AWS Lambda?
I am trying to decompress a file stored on S3 from a Lambda function (Python). Everything works until the original file is over 2 GB; at that point, I only get "That compression method is not supported".
I have mainly tried ZipFile and ZipFile39, with no luck. I also tried a couple of other packages, but got similar results.
When I try to unzip the content using ZipFile:
import zipfile

# Using ZipFile: no problems here. (I have tried with and without
# compression, compresslevel, and allowZip64, and with different values;
# the output is the same.)
zip_content = zipfile.ZipFile(zip_content,
                              'r',
                              compression=8,  # There is no 9 here
                              compresslevel=9,
                              allowZip64=True)
for filename in zip_content.namelist():
    print(zip_content.getinfo(filename))
# Printing the zip_info I get: <ZipInfo filename='test_file.csv'
# compress_type=deflate64 external_attr=0x20 file_size=2505399449
# compress_size=853276056>
When I try to unzip the content using ZipFile39:
import zipfile39

# Using ZipFile39: no problems here. (I have tried with and without
# compression, compresslevel, and allowZip64, and with different values;
# the output is the same.)
# Interestingly, I cannot use 'ZIP_DEFLATED64': it says the attribute cannot
# be found on ZipFile39, even though it is there (using 9 also fails).
zip_content = zipfile39.ZipFile(zip_content,
                                'r',
                                compression=9,
                                compresslevel=9,
                                allowZip64=True)
for filename in zip_content.namelist():
    print(zip_content.getinfo(filename))
# Printing the zip_info I get: <ZipInfo filename='test_file.csv'
# compress_type=deflate64 external_attr=0x20 file_size=2505399449
# compress_size=853276056>
The exception occurs when trying to open an entry:
# Writing to S3 <-- this is where the exception always occurs:
# "That compression method is not supported"
# It is raised by the call to zip_content.open (I have tried with and
# without force_zip64).
zip_content.open(zip_info, force_zip64=True)
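The ZipInfo output above reports compress_type=deflate64 (ZIP method 9), which the standard-library zipfile module can name but cannot decode. A minimal diagnostic sketch that flags such entries before calling open might look like the following. The names STDLIB_METHODS, is_stdlib_readable, and unreadable_entries are hypothetical helpers introduced here for illustration; they are not part of zipfile or of the original code.

```python
import io
import zipfile

# Compression methods the standard-library zipfile module can decompress.
# Method 9 (Deflate64) is absent: the stdlib has no decoder for it.
STDLIB_METHODS = {
    zipfile.ZIP_STORED,    # 0
    zipfile.ZIP_DEFLATED,  # 8
    zipfile.ZIP_BZIP2,     # 12
    zipfile.ZIP_LZMA,      # 14
}
DEFLATE64 = 9  # ZIP method id for Deflate64; zipfile defines no constant for it


def is_stdlib_readable(info):
    """Return True if this entry's compression method is one zipfile can decode."""
    return info.compress_type in STDLIB_METHODS


def unreadable_entries(archive):
    """List (filename, compress_type) for entries zipfile cannot decompress."""
    return [
        (info.filename, info.compress_type)
        for info in archive.infolist()
        if not is_stdlib_readable(info)
    ]


# Demo 1: an ordinary deflate archive built in memory has no problem entries.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("small.csv", "a,b\n1,2\n")
with zipfile.ZipFile(buffer) as zf:
    print(unreadable_entries(zf))  # prints []

# Demo 2: a hand-built ZipInfo standing in for the Deflate64 entry above.
fake = zipfile.ZipInfo("test_file.csv")
fake.compress_type = DEFLATE64
print(is_stdlib_readable(fake))  # prints False
```

Running unreadable_entries on the real archive inside the Lambda would show whether the failing entry is the only one stored with method 9, which is consistent with the error being raised by open rather than by namelist or getinfo.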
I have seen other questions on the same topic, but I have not found an answer that works for me. So far I have tried zipfile-deflate64 (got a cyclical reference error), stream_inflate (did not decompress the file), and stream_unzip (did not decompress the file).
A few important notes:
- This happens only on AWS Lambda (locally, on a Windows laptop, it works without issues).
- It is not a memory or disk space problem on the Lambda; usage does not exceed 20% at this point.
Any ideas, help, or suggestions would be appreciated.
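Since the behavior differs between Lambda and the local machine, one thing worth comparing is what each environment actually imports. The sketch below is only an illustration: environment_report and CANDIDATE_MODULES are hypothetical names, and the module names listed are assumed to be the import names of the packages mentioned above.

```python
import importlib.util
import platform
import sys
import zipfile

# Assumed import names of the third-party packages mentioned in the question.
CANDIDATE_MODULES = ["zipfile39", "zipfile_deflate64", "stream_inflate", "stream_unzip"]


def environment_report():
    """Collect facts worth comparing between Lambda and the local machine."""
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        # If this path is outside the standard library, a file or layer named
        # zipfile may be shadowing the real module.
        "zipfile_path": zipfile.__file__,
        # find_spec only locates a top-level module; it does not import it.
        "installed": {
            name: importlib.util.find_spec(name) is not None
            for name in CANDIDATE_MODULES
        },
    }


for key, value in environment_report().items():
    print(f"{key}: {value}")
```

Running the same script locally and inside the Lambda handler, then diffing the two outputs, would show whether the two environments differ in Python version or in which of these packages are importable.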
Thanks
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
