How to use the buffer_callback in pickle?
I'm trying to understand how to use the buffer_callback argument in pickle.dumps(). I've read the official python doc and I can follow the example given. But when I try to extend it a bit and do the following:
import pickle
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([5, 6, 7, 8])
buffers = []  # shared by both dumps calls
a_bytes = pickle.dumps(a, protocol=5, buffer_callback=buffers.append)
b_bytes = pickle.dumps(b, protocol=5, buffer_callback=buffers.append)
b_loaded = pickle.loads(b_bytes, buffers=buffers)
I thought b_loaded should be [5, 6, 7, 8] (and the same object as b), but instead I got [1, 2, 3, 4], which I can verify is actually a. Does this mean I need a separate buffers list for each dumps call, or am I missing something?
Solution 1:[1]
Yes, you need separate lists of buffers. The two sets of pickle/unpickle operations are independent. Combining their out-of-band data can only cause confusion. What are you trying to accomplish by doing so?
In this case, each call to pickle.dumps appends one out-of-band buffer to buffers, so after both calls the list holds a's buffer followed by b's. When pickle.loads runs, it knows it needs one out-of-band buffer, so it takes the first one from buffers. It has no way to know that restoring the second pickle means skipping one buffer and using the second. Why should it? The buffers in buffers are not labeled and are not otherwise distinguishable.
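To make the point about unlabeled buffers concrete, here is a minimal sketch that uses only the standard library. As an assumption for illustration, bytearrays wrapped in pickle.PickleBuffer stand in for the NumPy arrays, so it runs without NumPy; the payloads b"aaaa" and b"bbbb" are made up.

```python
import pickle

# Stand-ins for the two arrays: PickleBuffer wraps any buffer-providing
# object, so a bytearray is enough to produce out-of-band data.
a = pickle.PickleBuffer(bytearray(b"aaaa"))
b = pickle.PickleBuffer(bytearray(b"bbbb"))

buffers = []
a_bytes = pickle.dumps(a, protocol=5, buffer_callback=buffers.append)
b_bytes = pickle.dumps(b, protocol=5, buffer_callback=buffers.append)

# Both calls appended to the same list, in call order.
print(len(buffers))

# With the payload moved out of band, the two pickle streams carry no data
# of their own; each only says "take the next buffer".
print(a_bytes == b_bytes)

# Loading b's pickle therefore takes the first buffer in the list,
# which is the one produced while pickling a.
wrong = pickle.loads(b_bytes, buffers=buffers)
print(bytes(memoryview(wrong)))
```

Here the last line prints a's payload, not b's, which matches the behavior described in the question.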
You could do this:
b_loaded = pickle.loads(b_bytes, buffers=buffers[1:])
but that's kinda silly. The two pickle/unpickle operations are independent. Just use two lists of buffers.
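One way to keep each pickle paired with its own buffers is sketched below. The helper name dump_with_buffers is hypothetical, not part of pickle, and bytearrays wrapped in pickle.PickleBuffer are again assumed as stand-ins for the NumPy arrays so the sketch runs on the standard library alone.

```python
import pickle


def dump_with_buffers(obj):
    """Hypothetical helper: pickle obj and return the stream with its own buffer list."""
    buffers = []  # a fresh list per call, never shared
    data = pickle.dumps(obj, protocol=5, buffer_callback=buffers.append)
    return data, buffers


# Stand-ins for the arrays a and b from the question.
a = pickle.PickleBuffer(bytearray(b"\x01\x02\x03\x04"))
b = pickle.PickleBuffer(bytearray(b"\x05\x06\x07\x08"))

a_bytes, a_buffers = dump_with_buffers(a)
b_bytes, b_buffers = dump_with_buffers(b)

# Each loads call receives only the buffers produced by the matching dumps call.
a_loaded = pickle.loads(a_bytes, buffers=a_buffers)
b_loaded = pickle.loads(b_bytes, buffers=b_buffers)

print(list(memoryview(a_loaded)))  # [1, 2, 3, 4]
print(list(memoryview(b_loaded)))  # [5, 6, 7, 8]
```

Returning the stream and its buffers as a pair makes it hard to hand one pickle the buffers of another, which is the mix-up in the question.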
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
