Advanced slicing when passed list instead of tuple in numpy
In the docs, it says (emphasis mine):
Advanced indexing is triggered when the selection object, obj, is a non-tuple sequence object, an ndarray (of data type integer or bool), or a tuple with at least one sequence object or ndarray (of data type integer or bool). There are two types of advanced indexing: integer and Boolean.
<snip>
Also recognize that
x[[1,2,3]] will trigger advanced indexing, whereas x[[1,2,slice(None)]] will trigger basic slicing.
I know why x[(1, 2, slice(None))] triggers basic slicing. But why does x[[1,2,slice(None)]] trigger basic slicing, when [1,2,slice(None)] meets the condition of being a non-tuple sequence?
On a related note, why does the following occur?
>>> a = np.eye(4)
>>> a[(1, 2)] # basic indexing, as expected
0.0
>>> a[(1, np.array(2))] # basic indexing, as expected
0.0
>>> a[[1, 2]] # advanced indexing, as expected
array([[ 0., 1., 0., 0.],
[ 0., 0., 1., 0.]])
>>> a[[1, np.array(2)]] # basic indexing!!??
0.0
Solution 1:[1]
There's an exception to that rule. The Advanced Indexing documentation section doesn't mention it, but up above, near the start of the Basic Slicing and Indexing section, you'll see the following text:
In order to remain backward compatible with a common usage in Numeric, basic slicing is also initiated if the selection object is any non-ndarray sequence (such as a list) containing slice objects, the Ellipsis object, or the newaxis object, but not for integer arrays or other embedded sequences.
a[[1, np.array(2)]] doesn't quite trigger basic indexing. It triggers an undocumented part of the backward compatibility logic, as described in a comment in the source code:
/*
* Sequences < NPY_MAXDIMS with any slice objects
* or newaxis, Ellipsis or other arrays or sequences
* embedded, are considered equivalent to an indexing
* tuple. (`a[[[1,2], [3,4]]] == a[[1,2], [3,4]]`)
*/
The np.array(2) inside the list causes the list to be treated as if it were a tuple, but the result, a[(1, np.array(2))], is still an advanced indexing operation. It ends up applying the 1 and the 2 to separate axes, unlike a[[1, 2]], and the result ends up looking identical to a[1, 2], but if you try it with a 3D a, it produces a copy instead of a view.
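The copy-versus-view difference can be checked directly. This is a minimal sketch, assuming a small 3D array (the name `a3` and its shape are illustrative and not from the original post). It passes the index as an explicit tuple, so it does not depend on how a given NumPy version treats the bare list form:

```python
import numpy as np

a3 = np.zeros((3, 4, 5))  # illustrative 3D array (assumption)

basic = a3[1, 2]              # plain integers: basic indexing
mixed = a3[(1, np.array(2))]  # 0-d integer array inside the index tuple

# Both forms select the same values along the last axis...
same_values = np.array_equal(basic, mixed)

# ...but only the basic result shares memory with a3.
basic_is_view = np.shares_memory(a3, basic)
mixed_is_view = np.shares_memory(a3, mixed)

print(same_values, basic_is_view, mixed_is_view)
```

If the description above holds, the basic result is a view and the mixed one is a copy, so writing into `mixed` should leave `a3` untouched.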
Solution 2:[2]
With a dummy class I can determine how the interpreter translates [...] into calls to __getitem__.
In [1073]: class Foo():
      ...:     def __getitem__(self, idx):
      ...:         print(idx)
In [1080]: Foo()[1,2,slice(None)]
(1, 2, slice(None, None, None))
In [1081]: Foo()[(1,2,slice(None))]
(1, 2, slice(None, None, None))
In [1082]: Foo()[[1,2,slice(None)]]
[1, 2, slice(None, None, None)]
So wrapping multiple terms with () makes no difference - it gets a tuple in both cases. And a list is passed as a list.
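The same experiment as a self-contained script. `IndexProbe` is a hypothetical stand-in for the dummy class; it returns the index object instead of printing it, so the three spellings can be compared:

```python
class IndexProbe:
    """Hypothetical helper: hands back whatever Python passes to __getitem__."""

    def __getitem__(self, idx):
        return idx


probe = IndexProbe()

bare = probe[1, 2, :]                 # commas inside [] build a tuple
wrapped = probe[(1, 2, slice(None))]  # explicit tuple: the same object arrives
as_list = probe[[1, 2, slice(None)]]  # a list is passed through as a list

print(type(bare).__name__, type(wrapped).__name__, type(as_list).__name__)
```

Python itself never converts the list; any tuple-like treatment of it has to happen inside the object's own `__getitem__`.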
So the distinction between tuple and list (or not) must be coded in the numpy source code, which is compiled, so I can't readily study it.
With a 1d array
indexing with a list produces the advanced indexing - picking specific values:
In [1085]: arr[[1,2,3]]
Out[1085]: array([ 0.73703368, 0. , 0. ])
but replacing one of those values with a tuple, or a slice:
In [1086]: arr[[1,2,(2,3)]]
IndexError: too many indices for array
In [1088]: arr[[1,2,slice(None)]]
IndexError: too many indices for array
and the list is treated as a tuple - it tries matching values with dimensions.
So at the top level a list and a tuple are treated the same, if the list can't be interpreted as an advanced indexing list.
Notice also a difference with single-item lists:
In [1089]: arr[[1]]
Out[1089]: array([ 0.73703368])
In [1090]: arr[(1,)]
Out[1090]: 0.73703367969998546
In [1091]: arr[1]
Out[1091]: 0.73703367969998546
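A small sketch of the single-item difference, assuming an illustrative 1-D array (the values differ from the session above):

```python
import numpy as np

arr = np.arange(10.0)   # illustrative 1-D array (assumption)

from_list = arr[[1]]    # list: advanced indexing, the result keeps one dimension
from_tuple = arr[(1,)]  # one-element tuple: equivalent to arr[1]
from_int = arr[1]

print(from_list.shape, np.ndim(from_tuple), np.ndim(from_int))
```

The list form returns a length-1 array that is a copy, while the tuple and the bare integer both return a scalar.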
Some functions like np.apply_along_axis and np.apply_over_axes generate an index as a list or array, and then apply it. They work with a list or array because it is mutable. Some then wrap it in a tuple before using it as an index; others don't bother. That difference sort of bothered me, but these test cases indicate that the tuple wrapper is often optional.
In [1092]: idx=[1,2,slice(None)]
In [1093]: np.ones((2,3,4))[idx]
Out[1093]: array([ 1., 1., 1., 1.])
In [1094]: np.ones((2,3,4))[tuple(idx)]
Out[1094]: array([ 1., 1., 1., 1.])
Looks like the tuple wrapper is still needed if I build the index as an object array:
In [1096]: np.ones((2,3,4))[np.array(idx)]
...
IndexError: arrays used as indices must be of integer (or boolean) type
In [1097]: np.ones((2,3,4))[tuple(np.array(idx))]
Out[1097]: array([ 1., 1., 1., 1.])
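Because the behaviour of the bare list form depends on the NumPy version, the explicit `tuple(...)` wrapper is the portable spelling. A minimal sketch, assuming the same `(2, 3, 4)` shape as above; the variable names are illustrative:

```python
import numpy as np

x = np.ones((2, 3, 4))

idx = [0, 0, 0]        # build the index as a list because it is mutable
idx[0] = 1
idx[1] = 2
idx[2] = slice(None)   # select the whole last axis

row = x[tuple(idx)]    # explicit tuple: one entry per axis

obj = np.array(idx, dtype=object)  # the same index held in an object array
row_from_obj = x[tuple(obj)]       # tuple() unpacks it back to int, int, slice

try:
    x[obj]             # object array used directly as the index
    direct_ok = True
except IndexError:
    direct_ok = False

print(row.shape, row_from_obj.shape, direct_ok)
```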
===================
Comment from the function @Eric linked
/*
* Sequences < NPY_MAXDIMS with any slice objects
* or newaxis, Ellipsis or other arrays or sequences
* embedded, are considered equivalent to an indexing
* tuple. (`a[[[1,2], [3,4]]] == a[[1,2], [3,4]]`)
*/
===================
This function wraps object arrays and lists in tuple for indexing:
def apply_along_axis(func1d, axis, arr, *args, **kwargs):
....
ind = [0]*(nd-1)
i = zeros(nd, 'O')
....
res = func1d(arr[tuple(i.tolist())], *args, **kwargs)
outarr[tuple(ind)] = res
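The pattern in that excerpt (mutate a list, then wrap it in a tuple at the moment of indexing) can be sketched on its own. `sum_along_axis` below is a hypothetical helper written for illustration, not NumPy's implementation, and it assumes a non-negative `axis`:

```python
import numpy as np


def sum_along_axis(arr, axis):
    """Hypothetical helper: sum each 1-D slice along `axis`, one at a time."""
    out_shape = arr.shape[:axis] + arr.shape[axis + 1:]
    out = np.empty(out_shape, dtype=arr.dtype)
    for ind in np.ndindex(*out_shape):
        idx = list(ind)                # mutable copy of the position
        idx.insert(axis, slice(None))  # take everything along `axis`
        out[ind] = arr[tuple(idx)].sum()  # tuple() makes it one index per axis
    return out


arr = np.arange(24).reshape(2, 3, 4)
result = sum_along_axis(arr, 2)
print(result.shape)
```

Keeping the index as a list while it is being edited and converting only at the subscript keeps the code independent of how a list index is interpreted.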
update
Now this list indexing produces a FutureWarning:
In [113]: arr.shape
Out[113]: (2, 3, 4)
In [114]: arr[[1, 2, slice(None)]]
<ipython-input-114-f30c20184e42>:1: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.
arr[[1, 2, slice(None)]]
Out[114]: array([20, 21, 22, 23])
Changing the list to a tuple produces the same thing, without the warning:
In [115]: arr[(1, 2, slice(None))]
Out[115]: array([20, 21, 22, 23])
This is the same thing as:
In [116]: arr[1, 2, :]
Out[116]: array([20, 21, 22, 23])
Indexing with commas creates a tuple which is passed to the __getitem__ method.
The warning says that in the future it will try to turn the list into an array instead of a tuple:
In [117]: arr[np.array([1, 2, slice(None)])]
Traceback (most recent call last):
Input In [117] in <module>
arr[np.array([1, 2, slice(None)])]
IndexError: arrays used as indices must be of integer (or boolean) type
But with the slice object this raises an error. In that sense the arr[tuple([....])] interpretation is the only thing that makes sense. But it's a legacy case, left over from the earlier Numeric package.
Fortunately it's unlikely that a novice programmer will try this. They may try arr[[1,2,:]], but that will give a syntax error. : is only allowed in indexing brackets, not in list brackets (or tuple () either).
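The syntax point can be checked without NumPy, since `compile` only parses the text and never looks up `arr`. `parses` is a hypothetical helper introduced for this sketch:

```python
def parses(expr):
    """Hypothetical helper: True if `expr` is a syntactically valid expression."""
    try:
        compile(expr, "<example>", "eval")
        return True
    except SyntaxError:
        return False


in_subscript = parses("arr[1, 2, :]")            # colon directly inside []
in_list = parses("arr[[1, 2, :]]")               # colon inside a list literal
with_slice = parses("arr[[1, 2, slice(None)]]")  # slice() is an ordinary call

print(in_subscript, in_list, with_slice)
```

Only the list-literal colon is rejected, which is why the list form can only arise when someone writes `slice(...)` explicitly or builds the index programmatically.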
This current round of comments was triggered by a different case that produces the FutureWarning:
In [123]: arr[[[0, 1], [1, 0]]]
<ipython-input-123-4fa43c8569dd>:1: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.
arr[[[0, 1], [1, 0]]]
Out[123]:
array([[ 4, 5, 6, 7],
[12, 13, 14, 15]])
Here the nested list is interpreted as a tuple of lists, or even:
In [124]: arr[[0, 1], [1, 0]]
Out[124]:
array([[ 4, 5, 6, 7],
[12, 13, 14, 15]])
In [126]: arr[np.array([[0, 1], [1, 0]])].shape
Out[126]: (2, 2, 3, 4)
Same warning, but it isn't quite as obvious why the legacy code chose to take the tuple interpretation. I don't see it documented.
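The two readings of the nested list can be compared without relying on the legacy behaviour by spelling each one out explicitly. A minimal sketch, assuming `arr` is `np.arange(24).reshape(2, 3, 4)`, which is consistent with the values shown above:

```python
import numpy as np

arr = np.arange(24).reshape(2, 3, 4)  # assumed contents, matches the outputs above
nested = [[0, 1], [1, 0]]

as_tuple = arr[tuple(nested)]     # same as arr[[0, 1], [1, 0]]: one list per axis
as_array = arr[np.array(nested)]  # a single 2x2 integer array indexing axis 0

print(as_tuple.shape, as_array.shape)
```

The tuple reading pairs the lists element-wise across the first two axes, while the array reading picks whole `(3, 4)` blocks from axis 0, hence the very different shapes.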
Solution 3:[3]
So this is my conclusion:
- The [1,2] apparently is a 1d-list. And in this case, the advanced indexing is triggered. So a[[1,2]] has the same result as a[[1,2],].
- The [1, np.array(2)] is (treated as) a 2d-list, even though np.array(2) is zero dimension. So a[[1, np.array(2)]] has the same result as a[tuple([1, np.array(2)])] and thus a[1, 2], which gives the result 0.0.
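Both equivalences can be checked with the 4x4 identity matrix from the question. The sketch writes the tuple explicitly for the second case, since the bare list form is version-dependent:

```python
import numpy as np

a = np.eye(4)

rows = a[[1, 2]]         # flat list of integers: advanced indexing on axis 0
rows_tuple = a[[1, 2],]  # the same list wrapped in a one-element tuple

elem = a[tuple([1, np.array(2)])]  # one index per axis, like a[1, 2]

print(rows.shape, float(elem))
```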
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
