How can I avoid zlib "unexpected end of file" when gunzipping partial files?
I'm trying to read part of a gzipped file while decompressing it, so I can parse the header contents without reading unnecessary bytes. I had this working previously using fs.read(), passing options to read only the first 500 bytes, then using zlib.gunzip() to decompress the contents before parsing the header from the binary data.
This was working fine until node v5.0.0 patched a bug to ensure zlib throws an error on a truncated input (https://github.com/nodejs/node/pull/2595).
Now I'm getting the following error from zlib:

```
Error: unexpected end of file
```
How can I unzip this partial file, knowing that I'm truncating the input, without throwing an error? I was thinking it might be easier with streams, so I wrote the following.
```javascript
var readStream = fs.createReadStream(file.path, {start: 0, end: 500});
var gunzip = zlib.createGunzip();

readStream.pipe(gunzip)
  .on('data', function (chunk) {
    console.log(parseBinaryHeader(chunk));
    console.log('got %d bytes of data', chunk.length);
  })
  .on('error', function (err) {
    console.log(err);
  })
  .on('end', function () {
    console.log('end');
  });
```
My parseBinaryHeader() function is returning the correct header content, so I know it's unzipping, but it still throws an error when it hits the end of the input. I can add an error listener and do nothing with the error, but this doesn't seem ideal.
Any ideas?
Solution 1:[1]
Thanks for all the suggestions. I also opened an issue on the node repository and got some good feedback. Here's what ended up working for me.
- Set the chunk size to the full header size.
- Write the single chunk to the decompress stream and immediately pause the stream.
- Handle the decompressed chunk.
Example:

```javascript
var bytesRead = 500;
var lastChunk;

var decompressStream = zlib.createGunzip()
  .on('data', function (chunk) {
    parseHeader(chunk);
    decompressStream.pause();
  })
  .on('error', function (err) {
    handleGunzipError(err, file, lastChunk);
  });

fs.createReadStream(file.path, {start: 0, end: bytesRead, chunkSize: bytesRead + 1})
  .on('data', function (chunk) {
    lastChunk = chunk; // retain a reference so the error handler can see the chunk
    decompressStream.write(chunk);
  });
```
This has been working so far and also allows me to keep handling all other gunzip errors as the pause() prevents the decompress stream from throwing the "unexpected end of file" error.
Solution 2:[2]
I ran into this same issue when trying to end processing of a Node.js gzip stream early. I used "buffer-peek-stream" to inspect the header of the stream and confirm that it was in fact gzip, then unwrapped the first few megabytes of the stream to peek inside and determine the MIME type of the gzipped contents.
This necessitated two calls to zlib.createGunzip().
I found that even though I created what appeared to be two separate instances of the gunzip transform, destroying the second instance caused the first to throw this "unexpected end of file" error, even when the first instance was in a completely different context.
The fix in my case was to call .destroy() on the first instance to clean it up before creating the second.
Solution 3:[3]
I got this error when using node v10.13.0. I upgraded to v10.19.0 and it was fixed.
Solution 4:[4]
Constellates' answer works great, but only if the extractable chunk is smaller than zlib's processing chunk size (16 KB by default). For larger amounts, you need to combine the chunks, for example by concatenating them. Here is a TypeScript example using Promises:
```typescript
const gunzipped: Buffer = await new Promise((resolve, reject) => {
  const buffer_builder: Buffer[] = []
  const decompress_stream = zlib.createGunzip()
    .on('data', (chunk: Buffer) => {
      buffer_builder.push(chunk)
    })
    .on('close', () => {
      resolve(Buffer.concat(buffer_builder))
    })
    .on('error', (err) => {
      if (err.errno !== -5) // -5 is Z_BUF_ERROR: the expected EOF on truncated input
        reject(err)
    })
  decompress_stream.write(/* ... your gzipped input buffer */)
  decompress_stream.end()
})
```
Solution 5:[5]
I ran into this when piping a net.Socket into a decompression stream created with zlib.createInflate().
```javascript
const net = require('net')
const zlib = require('zlib')

net.connect(port, host)
  .pipe(zlib.createInflate())
  .on('data', chunk => console.log('chunk', chunk))
```
When the socket closed, the zlib stream would throw the unexpected end of file error because it was ended with partial state.
I fixed it by setting finishFlush: zlib.constants.Z_SYNC_FLUSH so that zlib flushes the partial data instead of expecting the stream to be terminated with a complete Z_FINISH block.
```javascript
const net = require('net')
const zlib = require('zlib')

net.connect(port, host)
  .pipe(zlib.createInflate({
    finishFlush: zlib.constants.Z_SYNC_FLUSH,
  }))
  .on('data', chunk => console.log('chunk', chunk))
```
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Constellates |
| Solution 2 | Travis Collins |
| Solution 3 | Kevin McNerney |
| Solution 4 | phil294 |
| Solution 5 | danneu |
