How can I decompress a 2 GB+ file using Python on AWS Lambda?

I am trying to decompress a file located on S3 from a Lambda function (Python). Everything works until the original file is over 2 GB; at that point, I only get "That compression method is not supported".

I have mainly tried ZipFile and ZipFile39, with no luck. I also tried a couple of other packages, but got similar results.

When I try to unzip the content using ZipFile:

import zipfile

# Using ZipFile: no problems here. I have tried with and without
# compression, compresslevel, and allowZip64, and with different values,
# and the output is always the same.
zip_content = zipfile.ZipFile(zip_content, 
                              'r', 
                              compression=8, # There is no 9 here
                              compresslevel=9,
                              allowZip64=True)
for filename in zip_content.namelist():
    print(zip_content.getinfo(filename))
    # Printing the zip_info I get: <ZipInfo filename='test_file.csv' 
    # compress_type=deflate64 external_attr=0x20 file_size=2505399449 
    # compress_size=853276056>

When I try to unzip the content using ZipFile39:

import zipfile39

# Using ZipFile39: no problems here. I have tried with and without
# compression, compresslevel, and allowZip64, and with different values,
# and the output is always the same.
# Interestingly, I cannot use 'ZIP_DEFLATED64': it says the attribute
# cannot be found on ZipFile39, but it is there (using 9 also fails).
zip_content = zipfile39.ZipFile(zip_content, 
                                'r',
                                compression=9, 
                                compresslevel=9, 
                                allowZip64=True)
for filename in zip_content.namelist():
    print(zip_content.getinfo(filename))
    # Printing the zip_info I get: <ZipInfo filename='test_file.csv' 
    # compress_type=deflate64 external_attr=0x20 file_size=2505399449 
    # compress_size=853276056>

The exception occurs at this step:

# Writing to S3: this is where the exception always occurs:
# "That compression method is not supported"
# It is raised by the zip_content.open call (I have tried with and
# without force_zip64).
zip_content.open(zip_info, force_zip64=True)
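
For reference, here is a minimal, self-contained sketch of how the compression method of each entry could be checked before calling open. It assumes the standard-library zipfile module only decompresses stored, deflate, bzip2, and LZMA entries, and that deflate64 is method 9 (which matches the compress_type=deflate64 shown in the ZipInfo output above). The helper name unsupported_entries is hypothetical and not part of any library.

```python
import io
import zipfile

# Methods the standard-library zipfile module can decompress:
# stored (0), deflate (8), bzip2 (12), LZMA (14).
# Deflate64 is method 9 and is not in this set.
SUPPORTED_METHODS = {
    zipfile.ZIP_STORED,
    zipfile.ZIP_DEFLATED,
    zipfile.ZIP_BZIP2,
    zipfile.ZIP_LZMA,
}


def unsupported_entries(archive):
    """Return (filename, compress_type) pairs for entries that use a
    method outside SUPPORTED_METHODS. Hypothetical helper."""
    return [
        (info.filename, info.compress_type)
        for info in archive.infolist()
        if info.compress_type not in SUPPORTED_METHODS
    ]


if __name__ == "__main__":
    # Build a tiny in-memory archive so the sketch runs on its own;
    # the real archive would come from S3 instead.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("test_file.csv", "a,b\n1,2\n")
    buffer.seek(0)
    with zipfile.ZipFile(buffer) as zf:
        print(unsupported_entries(zf))  # prints []
```

Run against the real archive, I would expect this to list test_file.csv with compress_type 9, since that is what the ZipInfo output reports.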

I have seen other questions on the same topic, but so far I have not found an answer that works for me. I have tried zipfile-deflate64 (got a cyclical reference error), stream_inflate (did not decompress the file), and stream_unzip (did not decompress the file).

A few important notes:

  1. This happens only on AWS Lambda (locally, on a Windows laptop, it works without issues).
  2. It is not a memory or disk space problem on the Lambda; no more than 20% is used at this point.

Any ideas, help, or suggestions will be appreciated.

Thanks



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
