'gzip a file in Python
I want to gzip a file in Python. I am trying to use the subprocss.check_call(), but it keeps failing with the error 'OSError: [Errno 2] No such file or directory'. Is there a problem with what I am trying here? Is there a better way to gzip a file than using subprocess.check_call?
from subprocess import check_call
def gZipFile(fullFilePath)
check_call('gzip ' + fullFilePath)
Thanks!!
Solution 1:[1]
There is a module gzip. Usage:
Example of how to create a compressed GZIP file:
import gzip
content = b"Lots of content here"
f = gzip.open('/home/joe/file.txt.gz', 'wb')
f.write(content)
f.close()
Example of how to GZIP compress an existing file:
import gzip
f_in = open('/home/joe/file.txt')
f_out = gzip.open('/home/joe/file.txt.gz', 'wb')
f_out.writelines(f_in)
f_out.close()
f_in.close()
EDIT:
Jace Browning's answer using with in Python >= 2.7 is obviously more terse and readable, so my second snippet would (and should) look like:
import gzip
with open('/home/joe/file.txt', 'rb') as f_in, gzip.open('/home/joe/file.txt.gz', 'wb') as f_out:
f_out.writelines(f_in)
Solution 2:[2]
Read the original file in binary (rb) mode and then use gzip.open to create the gzip file that you can write to like a normal file using writelines:
import gzip
with open("path/to/file", 'rb') as orig_file:
with gzip.open("path/to/file.gz", 'wb') as zipped_file:
zipped_file.writelines(orig_file)
Even shorter, you can combine the with statements on one line:
with open('path/to/file', 'rb') as src, gzip.open('path/to/file.gz', 'wb') as dst:
dst.writelines(src)
Solution 3:[3]
From the docs for Python3
Gzip an existing file
import gzip
import shutil
with open('file.txt', 'rb') as f_in:
with gzip.open('file.txt.gz', 'wb') as f_out:
shutil.copyfileobj(f_in, f_out)
Or if you hate nested with statements
import gzip
import shutil
from contextlib import ExitStack
with ExitStack() as stack:
f_in = stack.enter_context(open('file.txt', 'rb'))
f_out = stack.enter_context(gzip.open('file.txt.gz', 'wb'))
shutil.copyfileobj(f_in, f_out)
Create a new gzip file:
import gzip
content = b"Lots of content here"
with gzip.open("file.txt.gz", "wb") as f:
f.write(content)
Note the fact that content is turned into bytes
Another method for if you aren't creating content as a string/byte literal like the above example would be
import gzip
# get content as a string from somewhere else in the code
with gzip.open("file.txt.gz", "wb") as f:
f.write(content.encode("utf-8"))
See this SO question for a discussion of other encoding methods.
Solution 4:[4]
Use the gzip module:
import gzip
import os
in_file = "somefile.data"
in_data = open(in_file, "rb").read()
out_gz = "foo.gz"
gzf = gzip.open(out_gz, "wb")
gzf.write(in_data)
gzf.close()
# If you want to delete the original file after the gzip is done:
os.unlink(in_file)
Your error: OSError: [Errno 2] No such file or directory' is telling you that the file fullFilePath does not exist. If you still need to go that route, please make sure that file exists on your system and you are using an absolute path not relative.
Solution 5:[5]
the documentation on this is actually insanely straightforward
Example of how to read a compressed file:
import gzip
f = gzip.open('file.txt.gz', 'rb')
file_content = f.read()
f.close()
Example of how to create a compressed GZIP file:
import gzip
content = "Lots of content here"
f = gzip.open('file.txt.gz', 'wb')
f.write(content)
f.close()
Example of how to GZIP compress an existing file:
import gzip
f_in = open('file.txt', 'rb')
f_out = gzip.open('file.txt.gz', 'wb')
f_out.writelines(f_in)
f_out.close()
f_in.close()
https://docs.python.org/2/library/gzip.html
That's the whole documentation . . .
Solution 6:[6]
import gzip
def gzip_file(src_path, dst_path):
with open(src_path, 'rb') as src, gzip.open(dst_path, 'wb') as dst:
for chunk in iter(lambda: src.read(4096), b""):
dst.write(chunk)
Solution 7:[7]
For windows subprocess can be used to run 7za utility: from https://www.7-zip.org/download.html download 7-Zip Extra: standalone console version, 7z DLL, Plugin for Far Manager compact takes all csv files inside gzip directory and compress each one to gzip format. Original files are deleted. 7z options can be found in https://sevenzip.osdn.jp/chm/cmdline/index.htm
import os
from pathlib import Path
import subprocess
def compact(cspath, tec, extn, prgm): # compress each extn file in tec dir to gzip format
xlspath = cspath / tec # tec location
for baself in xlspath.glob('*.' + str(extn)): # file iteration inside directory
source = str(baself)
target = str(baself) + '.gz'
try:
subprocess.call(prgm + " a -tgzip \"" + target + "\" \"" + source + "\" -mx=5")
os.remove(baself) # remove src xls file
except:
print("Error while deleting file : ", baself)
return
exe = "C:\\7za\\7za.exe" # 7za.exe (a = alone) is a standalone version of 7-Zip
csvpath = Path('C:/xml/baseline/') # working directory
compact(csvpath, 'gzip', 'csv', exe) # xpress each csv file in gzip dir to gzip format
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | Boris Verkhovskiy |
| Solution 3 | |
| Solution 4 | |
| Solution 5 | O.rka |
| Solution 6 | mechatroner |
| Solution 7 | GERMAN RODRIGUEZ |
