Windows command to split a binary file
I would like to split a binary file into smaller chunks. Does anyone know a Windows command for that?
Because of Android's UNCOMPRESS_DATA_MAX constraint, I cannot overwrite the database with a file of 1 MB or larger, so if there is a better way to do this, I am OK with that too.
Solution 1:[1]
Method 1:
makecab can split a binary file into smaller encoded chunks in its own format, but those chunks cannot be treated as raw bytes the way pieces of a flat binary file can (e.g. joined via copy). The chunks can, however, be joined again by extrac32.
First make a ddf (text) file:
.Set CabinetNameTemplate=test_*.cab; <-- Enter chunk name format
.Set MaxDiskSize=900000; <-- Enter file split/chunk size
.Set ClusterSize=1000
.Set Cabinet=on;
.Set Compress=off;
.set CompressionType=LZX;
.set CompressionMemory=21
.Set DiskDirectoryTemplate=;
file.in
Then:
rem Optional: set compression on to save disk space
makecab /f ddf.txt
To get the original file back, ensure all chunks are in the same directory:
REM join by calling 1st file in the sequence
extrac32 test_1.cab file.out
MakeCAB introduces the concept of a folder to refer to a contiguous set of compressed bytes.
"MakeCAB takes all of the files in the product or application being compressed, lays the bytes down as one continuous byte stream, compresses the entire stream, chopping it up into folders as appropriate, and then fills up one or more cabinets with the folders."
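Method 2: For raw byte chunks, CMD can split files with the help of PowerShell. The snippet is written for an interactive prompt with delayed expansion enabled (cmd /v:on); in a batch file, double the percent signs on the loop variables (%%i, %%j):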
set size=1000000
set file=test.mp3
for %j in (%file%) do (
set /a chunks=%~zj/%size% >nul
for /l %i in (0,1,!chunks!) do (
set /a tail=%~zj-%i*%size% >nul
powershell gc %file% -Encoding byte -Tail !tail! ^| sc %file%_%i -Encoding byte
if %i lss !chunks! FSUTIL file seteof %file%_%i %size% >nul
)
)
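Since the question is open to other approaches, the same raw-byte split can be sketched in Python. This is a minimal sketch, not part of the original answer: the function name split_file is hypothetical, the default chunk size mirrors the size=1000000 setting above, and the output names follow the %file%_%i pattern of Method 2. Unlike the CMD loop, it does not write a zero-byte trailing chunk when the file size is an exact multiple of the chunk size.

```python
from pathlib import Path


def split_file(path, chunk_size=1_000_000):
    """Split `path` into raw byte chunks named <name>_0, <name>_1, ...

    Returns the list of chunk paths in order. `split_file` is an
    illustrative helper, not an existing Windows or library command.
    """
    src = Path(path)
    parts = []
    with src.open("rb") as f:
        index = 0
        while True:
            data = f.read(chunk_size)  # at most chunk_size bytes
            if not data:               # end of file reached
                break
            part = src.with_name(f"{src.name}_{index}")
            part.write_bytes(data)
            parts.append(part)
            index += 1
    return parts
```

Calling split_file("test.mp3") would write test.mp3_0, test.mp3_1 and so on next to the original file, leaving the original untouched.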
Method 3: via certutil & CMD:
set file="x.7z" &REM compressed to generate CRLF pairs
set max=70000000 &REM certutil has max file limit around 74MB
REM Findstr line limit 8k
REM Workaround: wrap in some archive to generate CRLF pairs
for %i in (%file%) do (
set /a num=%~zi/%max% >nul &REM No. of chunks
set /a last=%~zi%%max% >nul &REM size of last chunk
if !last!==0 set /a num=num-1 &REM Remove zero byte chunk
set size=%~zi
)
ren %file% %file%.0
for /l %i in (1 1 %num%) do (
set /a s1=%i*%max% >nul
set /a s2="(%i+1)*%max%" >nul
set /a prev=%i-1 >nul
echo Writing %file%.%i
type %file%.!prev! | (
(for /l %j in (1 1 %max%) do pause)>nul& findstr "^"> %file%.%i)
FSUTIL file seteof %file%.!prev! %max% >nul
)
if not %last%==0 FSUTIL file seteof %file%.%num% %last% >nul
echo Done.
Notes:
- Chunks can be joined by copy /b
- Filename extensions can be made neater by padding chunk numbers
- Can be looped to split entire directories
See example output below:
Directory of C:\Users\Stax\Desktop\Parking
03/05/2022 01:04 <DIR> .
03/05/2022 01:04 <DIR> ..
03/05/2022 01:04 407 Court Notice.pdf.000
03/05/2022 01:04 4,000 Court Notice.pdf.001
03/05/2022 01:04 4,000 Court Notice.pdf.002
03/05/2022 01:04 557 Parking fine.pdf.000
03/05/2022 01:04 4,000 Parking fine.pdf.001
03/05/2022 01:04 4,000 Parking fine.pdf.002
03/05/2022 01:04 4,000 Parking fine.pdf.003
03/05/2022 01:04 4,000 Parking fine.pdf.004
8 File(s) 24,964 bytes
Chunks produced by Methods 2 and 3 can then be joined with copy /b.
Tested on Win 10
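For reference, joining raw chunks is plain binary concatenation in a fixed order, which is what copy /b does. A minimal Python sketch of that step follows; the function name join_chunks is hypothetical, and the caller is assumed to pass the chunk paths already in the right order. Unpadded numeric suffixes do not sort correctly as text (_10 sorts before _2), which is one reason the padding mentioned in the notes helps.

```python
from pathlib import Path


def join_chunks(parts, dest):
    """Concatenate the files in `parts`, in the given order, into `dest`.

    `join_chunks` is an illustrative helper; ordering is the caller's
    responsibility.
    """
    with open(dest, "wb") as out:
        for part in parts:
            # Each chunk is read whole; chunks are assumed to be small.
            out.write(Path(part).read_bytes())
    return Path(dest)
```

For example, join_chunks(["test.mp3_0", "test.mp3_1"], "joined.mp3") would rebuild a file that was split into two chunks.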
Solution 2:[2]
You can also install GnuWin32 from http://gnuwin32.sourceforge.net, which provides the split command.
For my work, I need to extract some lines from a large Oracle export file, DataBase.bak.
It is a binary file that mixes text lines and binary data.
To extract all lines between two specific lines, I only need to enter the following commands:
split -l 4114807 database.bak from.
split -l 10357 from.a to.
copy to.a database.RANGE.bak
The first command extracts all lines from line 4114807 to the end of the file into the from.B file.
I found the FROM line number by loading the Database.Bak file in Notepad++.
Caution: the line number displayed in Notepad++ is not equal to the -l parameter used in the split command, because Notepad++ line numbers are generated by both LF and CR characters!
I use the second command to extract all lines up to a specific TO line.
I copy the first extracted file (the to.A file) to a new Database.RANGE.bak file, which contains the needed extract.
When the job is done, I delete all from.* and to.* files from the current directory.
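The same range extraction can be sketched without intermediate files. This is a minimal Python sketch under stated assumptions, not the author's method: the function name extract_lines and its parameters are hypothetical, and lines are delimited by LF bytes only, so a lone CR does not start a new line. The start line has to be chosen with the Notepad++ caution above in mind.

```python
def extract_lines(src, dest, first, count):
    """Copy `count` lines starting at 1-based line `first` from src to dest.

    Files are opened in binary mode, so lines end at LF bytes only and
    the bytes are copied unchanged. Returns the number of lines written.
    `extract_lines` is an illustrative helper.
    """
    written = 0
    with open(src, "rb") as fin, open(dest, "wb") as fout:
        for number, line in enumerate(fin, start=1):
            if number < first:
                continue          # still before the requested range
            if written >= count:
                break             # range complete, stop reading
            fout.write(line)
            written += 1
    return written
```

A call such as extract_lines("database.bak", "database.RANGE.bak", first, 10357) would copy 10357 lines beginning at the chosen start line, and there would be no from.* or to.* files to delete afterwards.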
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | schlebe |
