Python: how to read a binary file in chunks and specify the starting offset
I have the code:
def read_chunks(infile, chunk_size):
    while True:
        chunk = infile.read(chunk_size)
        if chunk:
            yield chunk
        else:
            return
This works when I need to read the file chunk by chunk. Sometimes, however, I need to read the file two bytes at a time but start each read at the next byte offset rather than at the next chunk. For example, given 00 01 02 03 04 and a chunk size of 2, I would need to read "00 01", "01 02", "02 03", "03 04". The function currently reads "00 01", "02 03", "04". Can this be implemented in the same function, perhaps as an extra argument, or should it be a separate function? What would that look like? I still need the function to work as it does now.
Solution 1:[1]
Using tell() and seek(n), you can move the file pointer to any position in the file.
tell(): returns the current position in the file stream
seek(n): sets the current position in the file stream to byte n, counted from the start of the file
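As a small illustration of the two calls, the sketch below uses an in-memory io.BytesIO object as a stand-in for a file opened in binary mode (an assumption made only so the example runs without a file on disk). It reads two bytes, steps the position back by one, and reads again, which gives the overlapping pair from the question.

```python
import io

# In-memory stand-in for a file opened with "rb" (assumption for the demo).
stream = io.BytesIO(bytes([0, 1, 2, 3, 4]))

first = stream.read(2)            # bytes 00 01
pos_after_read = stream.tell()    # position is now 2
stream.seek(pos_after_read - 1)   # step back one byte, to position 1
second = stream.read(2)           # bytes 01 02

print(first.hex(" "), "|", second.hex(" "))
```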
def read_chunks(infile, chunk_size, offset=0):
    # Each iteration must advance by at least one byte, or the loop never ends.
    if chunk_size + offset < 1:
        return
    while True:
        chunk = infile.read(chunk_size)
        if not chunk:
            return
        yield chunk
        if offset != 0:
            if not infile.read(1):  # EOF reached
                return
            # -1 undoes the read(1) used for the EOF check
            infile.seek(infile.tell() + offset - 1)
with open("x", "rb") as f:  # "rb" = read binary
    for chunk in read_chunks(f, 2, -1):
        print(chunk, end=" ")
Update:
- The file is now opened with "rb" (read binary) instead of "r"
- The EOF check was changed
Addition: offset parameter
- 0 (default value): reads chunks one after another
- positive n: skips n bytes between chunks
- negative n: the last |n| bytes of the previous chunk are read again at the start of the next chunk
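A different way to get the same overlapping reads is to keep the previous chunk in memory instead of calling seek(), which may matter for streams that cannot seek. The following is a minimal sketch of that idea, not the answer's method: the helper name read_windows and its step parameter are hypothetical. Here step is the distance in bytes between the starts of consecutive chunks, so it corresponds to chunk_size + offset in the function above.

```python
import io


def read_windows(infile, chunk_size, step=None):
    """Yield chunk_size-byte windows whose start positions are step bytes apart.

    Hypothetical helper, not part of the answer above.
    step=None means step == chunk_size (plain back-to-back chunks).
    """
    if step is None:
        step = chunk_size
    if chunk_size < 1 or step < 1:
        raise ValueError("chunk_size and step must both be at least 1")

    window = infile.read(chunk_size)
    while window:
        yield window
        if len(window) < chunk_size:
            return  # a short window means EOF was reached
        if step < chunk_size:
            # Overlapping case: drop the first step bytes, append new ones.
            fresh = infile.read(step)
            if not fresh:
                return
            window = window[step:] + fresh
        else:
            # Non-overlapping case: discard the gap, then read a full window.
            gap = step - chunk_size
            if gap and len(infile.read(gap)) < gap:
                return
            window = infile.read(chunk_size)


data = io.BytesIO(bytes([0, 1, 2, 3, 4]))  # in-memory stand-in for a binary file
overlapping = [w.hex(" ") for w in read_windows(data, 2, step=1)]
print(overlapping)  # ['00 01', '01 02', '02 03', '03 04']
```

Calling read_windows(f, 2) with no step should behave like the original read_chunks(f, 2), so one function can serve both use cases.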
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
