Reinterpreting NumPy arrays as a different dtype

Say I have a large NumPy array of dtype int32

import numpy as np
N = 1000  # (large) number of elements
a = np.random.randint(0, 100, N, dtype=np.int32)

but now I want the data to be uint32. I could do

b = a.astype(np.uint32)

or even

b = a.astype(np.uint32, copy=False)

but in both cases b is a copy of a, whereas I simply want to reinterpret the data in a as uint32, so as not to duplicate the memory. Similarly, using np.asarray() does not help.
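The copying behaviour described above can be checked directly. The following is a minimal sketch, assuming np.shares_memory is an acceptable test for "same underlying buffer"; the small array and the names c and d are only illustrative and not part of the original question.

```python
import numpy as np

a = np.arange(5, dtype=np.int32)

# Converting to a different dtype allocates a new buffer,
# even when copy=False is requested.
b = a.astype(np.uint32)
c = a.astype(np.uint32, copy=False)

# copy=False only avoids the copy when nothing has to change,
# for example when the requested dtype equals the current one.
d = a.astype(np.int32, copy=False)

print(np.shares_memory(a, b))  # False
print(np.shares_memory(a, c))  # False
print(d is a)                  # True
```

So copy=False is a request to skip the copy when no conversion is needed, not a way to reinterpret the bytes.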

What does work is

a.dtype = np.uint32

which simply changes the dtype without altering the data at all. Here's a striking example:

import numpy as np
a = np.array([-1, 0, 1, 2], dtype=np.int32)
print(a)
a.dtype = np.uint32
print(a)  # shows "overflow", which is what I want
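The "overflow" in this example is just the two's-complement bit pattern of each int32 read as an unsigned number, i.e. the value modulo 2**32. A small sketch of that arithmetic follows; it uses a.view (which also appears in question 3 below) rather than assigning to a.dtype, and the names u and expected are illustrative.

```python
import numpy as np

a = np.array([-1, 0, 1, 2], dtype=np.int32)
u = a.view(np.uint32)  # same bytes, read as unsigned 32-bit integers

# Reading a two's-complement int32 as uint32 is the value modulo 2**32.
expected = [int(x) % 2**32 for x in a]

print(expected)                    # [4294967295, 0, 1, 2]
print(u.tolist())                  # [4294967295, 0, 1, 2]
print(a.tobytes() == u.tobytes())  # True: the raw bytes are identical
```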

My questions are about the solution of simply overwriting the dtype of the array:

  1. Is this legitimate? Can you point me to where this feature is documented?
  2. Does it in fact leave the data of the array untouched, i.e. no duplication of the data?
  3. What if I want two arrays a and b sharing the same data, but viewing it as different dtypes? I've found the following to work, but again I'm concerned about whether this is really OK to do:
    import numpy as np
    a = np.array([0, 1, 2, 3], dtype=np.int32)
    b = a.view(np.uint32)
    print(a)  # [0  1  2  3]
    print(b)  # [0  1  2  3]
    a[0] = -1
    print(a)  # [-1  1  2  3]
    print(b)  # [4294967295  1  2  3]
    
    Though this seems to work, I find it weird that the underlying data of the two arrays does not seem to be located at the same place in memory:
    print(a.data)
    print(b.data)
    
    Actually, it seems that the above gives different results each time it is run, so I don't understand what's going on there at all.
  4. This can be extended to other dtypes, the most extreme of which is probably mixing 32 and 64 bit floats:
    import numpy as np
    a = np.array([0, 1, 2, np.pi], dtype=np.float32)
    b = a.view(np.float64)
    print(a)  # [0.  1.  2.  3.1415927]
    print(b)  # [0.0078125  50.12387848]
    b[0] = 8
    print(a)  # [0.  2.5  2.  3.1415927]
    print(b)  # [8.  50.12387848]
    
    Again, is this condoned, if the obtained behaviour is really what I'm after?
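One way to check questions 2 and 3 empirically is to compare the buffer addresses instead of printing a.data. The sketch below assumes that __array_interface__["data"][0] (the address of the first element) and np.shares_memory are suitable checks; buffer_address is a hypothetical helper introduced here, and f and g are illustrative names for the float case in question 4.

```python
import numpy as np

def buffer_address(arr):
    """Address of the first element of arr (hypothetical helper)."""
    return arr.__array_interface__["data"][0]

a = np.array([0, 1, 2, 3], dtype=np.int32)
b = a.view(np.uint32)

print(np.shares_memory(a, b))                  # True
print(b.base is a)                             # True: b is a view onto a
print(buffer_address(a) == buffer_address(b))  # True

# a.data builds a new memoryview object on every access, so printing it
# shows the address of that temporary object, not of the array's buffer.
print(a.data is a.data)                        # False

# Viewing float32 data as float64 keeps the bytes and halves the length.
f = np.array([0, 1, 2, np.pi], dtype=np.float32)
g = f.view(np.float64)
print(g.shape)               # (2,)
print(f.nbytes == g.nbytes)  # True
print(np.shares_memory(f, g))  # True
```

This would explain why print(a.data) and print(b.data) differ from run to run: what gets printed is the address of a short-lived memoryview object rather than the location of the shared data.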


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
