How do I recursively uncompress an archive of multiple compression types?
I have what is initially a zip file that contains an unknown number of nested compression layers. The files sometimes have the compression type in the file extension (myfile.gz, myfile.tar, myfile.zip) and sometimes don't (myfile). Unzipping manually also triggers rename/overwrite prompts, but I've been able to unzip successfully with a bash script:
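#!/bin/bash
# Recursively unzip nested .zip archives. $2 is an optional cleanup command.
# Assumes archive names contain no whitespace.
extract() {
    local dir=${1%.zip}
    unzip "$1" -d "$dir" && eval "$2" || return
    # Subshell keeps the cd from leaking into the caller's loop
    (
        cd "$dir" || exit
        for zip in $(find . -maxdepth 1 -iname '*.zip'); do
            extract "$zip" 'rm "$1"'
        done
    )
}
extract '1.zip'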
But I'm running into three issues: 1) not all file names have an extension like .zip, 2) the files use different compression types, so I'm thinking of running file myfile on each one and switching programmatically to the matching decompression tool, and 3) I need to count the number of times I had to decompress to get to the final file. Any ideas on how to solve this?
Solution 1:[1]
Remember that file types are determined not by the extension but by something called the magic number: a short sequence of bytes at the very beginning of the file that indicates what you're dealing with.
With that in mind, you may want to try the file command, specifically with the --mime-type switch.
In general, the file command inspects the file and uses the magic number to tell you what kind of file it is. The --mime-type switch prints the file's MIME type instead of a free-form description, which I believe will be easier to handle within a script.
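One way the MIME type could drive the choice of decompressor, and count the layers along the way, is sketched below. The function names peel_layer and unwrap are hypothetical, and the sketch assumes every layer wraps exactly one file (for zip and tar, unzip -p and tar -xO simply stream the contents to stdout, so multi-file archives would need a different approach). Both application/gzip and application/x-gzip are matched because different versions of file report different names.

```shell
#!/bin/bash
# Hypothetical helper: strip one compression layer from $1 into $2.
# Returns non-zero when the MIME type is not a recognised archive type.
# Assumption: each layer contains exactly one file.
peel_layer() {
    local src=$1 dest=$2 mime
    mime=$(file --brief --mime-type "$src") || return 2
    case $mime in
        application/gzip|application/x-gzip) gzip -dc < "$src" > "$dest" ;;
        application/x-bzip2)                 bzip2 -dc < "$src" > "$dest" ;;
        application/x-xz)                    xz -dc < "$src" > "$dest" ;;
        application/zip)                     unzip -p "$src" > "$dest" ;;
        application/x-tar)                   tar -xOf "$src" > "$dest" ;;
        *) return 1 ;;
    esac
}

# Hypothetical driver: peel layers from $1 until none is recognised,
# copy the final payload to $2, and print how many layers were removed.
unwrap() {
    local current=$1 out=$2 workdir layers=0
    workdir=$(mktemp -d) || return 1
    while peel_layer "$current" "$workdir/layer_$((layers + 1))"; do
        layers=$((layers + 1))
        current="$workdir/layer_$layers"
    done
    cp "$current" "$out"
    rm -rf "$workdir"
    echo "$layers"
}
```

Called as unwrap 1.zip final_file, this prints the layer count and leaves the innermost payload in final_file. The file names never matter here, since only the content is inspected.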
As an example:
- Create some empty files:
$ touch hi0 hi2
$ ls -s hi?
0 hi0 0 hi2
- file will confirm they are both empty files:
$ file hi?
hi0: empty
hi2: empty
$ file --mime-type hi?
hi0: inode/x-empty
hi2: inode/x-empty
- With the hexdump command we'll see again that they're empty:
$ hexdump hi0 hi2
$
- Compress them
$ gzip hi0 ; bzip2 hi2
$ ls -s hi*
4 hi0.gz 4 hi2.bz2
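- With the hexdump command you'll see their magic numbers. Note that hexdump prints 16-bit words in little-endian order by default, so 8b1f below is the byte pair 1f 8b (gzip), and 5a42 3968 is the bytes 42 5a 68 39, i.e. "BZh9" (bzip2):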
$ hexdump hi0.gz
0000000 8b1f 0808 2d60 6263 0300 6968 0030 0003
0000010 0000 0000 0000 0000
0000018
$ hexdump hi2.bz2
0000000 5a42 3968 7217 3845 9050 0000 0000
000000e
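If you would rather compare the leading bytes yourself, od -tx1 prints them one at a time in file order. The following is only a minimal sketch: magic_hex and sniff are hypothetical names, and only three signatures are listed (1f 8b for gzip, "BZh" for bzip2, "PK" 03 04 for a non-empty zip). The file command knows far more formats.

```shell
#!/bin/bash
# Print the first N bytes (default 4) of file $1 as lowercase hex, in file order.
magic_hex() {
    od -An -tx1 -N "${2:-4}" "$1" | tr -d ' \n'
}

# Classify a file by its leading bytes alone.
# Only the three signatures below are handled; everything else is "unknown".
sniff() {
    case $(magic_hex "$1" 4) in
        1f8b*)    echo gzip ;;
        425a68*)  echo bzip2 ;;
        504b0304) echo zip ;;
        *)        echo unknown ;;
    esac
}
```

Running sniff hi0.gz and sniff hi2.bz2 on the files created above should report gzip and bzip2 respectively.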
There you have it!
Regards
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Alfredo Campos Enríquez |
