Advanced slicing when passed list instead of tuple in numpy

In the docs, it says (emphasis mine):

Advanced indexing is triggered when the selection object, obj, is a non-tuple sequence object, an ndarray (of data type integer or bool), or a tuple with at least one sequence object or ndarray (of data type integer or bool). There are two types of advanced indexing: integer and Boolean.

<snip>

Also recognize that x[[1,2,3]] will trigger advanced indexing, whereas x[[1,2,slice(None)]] will trigger basic slicing.

I know why x[(1, 2, slice(None))] triggers basic slicing. But why does x[[1,2,slice(None)]] trigger basic slicing, when [1,2,slice(None)] meets the condition of being a non-tuple sequence?


On a related note, why does the following occur?

>>> a = np.eye(4)
>>> a[(1, 2)]  # basic indexing, as expected
0.0
>>> a[(1, np.array(2))] # basic indexing, as expected
0.0

>>> a[[1, 2]]  # advanced indexing, as expected
array([[ 0.,  1.,  0.,  0.],
       [ 0.,  0.,  1.,  0.]])
>>> a[[1, np.array(2)]]  # basic indexing!!??
0.0


Solution 1:[1]

There's an exception to that rule. The Advanced Indexing documentation section doesn't mention it, but up above, near the start of the Basic Slicing and Indexing section, you'll see the following text:

In order to remain backward compatible with a common usage in Numeric, basic slicing is also initiated if the selection object is any non-ndarray sequence (such as a list) containing slice objects, the Ellipsis object, or the newaxis object, but not for integer arrays or other embedded sequences.


a[[1, np.array(2)]] doesn't quite trigger basic indexing. It triggers an undocumented part of the backward compatibility logic, as described in a comment in the source code:

    /*
     * Sequences < NPY_MAXDIMS with any slice objects
     * or newaxis, Ellipsis or other arrays or sequences
     * embedded, are considered equivalent to an indexing
     * tuple. (`a[[[1,2], [3,4]]] == a[[1,2], [3,4]]`)
     */

The np.array(2) inside the list causes the list to be treated as if it were a tuple, but the result, a[(1, np.array(2))], is still an advanced indexing operation. It applies the 1 and the 2 to separate axes, unlike a[[1, 2]], so the result looks identical to a[1, 2]. If you try it with a 3D a, however, it produces a copy instead of a view.
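
The copy-versus-view difference can be checked directly. The following is a minimal sketch, assuming a 3D array of shape (2, 3, 4) chosen only for illustration. It uses the explicit tuple form of the index rather than the list form, since the list form is the legacy case under discussion.

```python
import numpy as np

a = np.zeros((2, 3, 4))

view = a[1, 2]             # plain integers: basic indexing, returns a view
copy = a[1, np.array(2)]   # 0-d array in the index tuple: returns a copy

view_shares = np.shares_memory(a, view)
copy_shares = np.shares_memory(a, copy)

copy[:] = 7.0              # writing to the copy leaves `a` untouched
untouched = bool((a == 0).all())

view[:] = 5.0              # writing to the view changes `a`
changed = bool((a[1, 2] == 5.0).all())
```

Both results have shape (4,); only the one built from plain integers shares memory with a.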

Solution 2:[2]

With a dummy class I can determine how the interpreter translates [...] into calls to __getitem__.

In [1073]: class Foo():
      ...:     def __getitem__(self, idx):
      ...:         print(idx)
In [1080]: Foo()[1,2,slice(None)]
(1, 2, slice(None, None, None))
In [1081]: Foo()[(1,2,slice(None))]
(1, 2, slice(None, None, None))
In [1082]: Foo()[[1,2,slice(None)]]
[1, 2, slice(None, None, None)]

So wrapping multiple terms with () makes no difference - it gets a tuple in both cases. And a list is passed as a list.
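
The same experiment can be written as a small script. This is only a sketch: ShowIndex is a hypothetical helper name, and it returns the index object instead of printing it so the results can be compared.

```python
class ShowIndex:
    """Hypothetical helper: hands back whatever the brackets pass in."""

    def __getitem__(self, idx):
        return idx


show = ShowIndex()

bare = show[1, 2, slice(None)]        # commas alone build a tuple
parens = show[(1, 2, slice(None))]    # explicit parentheses: same tuple
with_colon = show[1, 2, :]            # ':' becomes slice(None, None, None)
as_list = show[[1, 2, slice(None)]]   # a list arrives unchanged, as a list
```

Python itself never converts the list to a tuple, so any list-versus-tuple distinction has to be made inside numpy's own __getitem__.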

So the distinction between tuple and list (or not) must be coded in the numpy source code, which is compiled, so I can't readily study it.

With a 1d array, indexing with a list produces advanced indexing, picking specific values:

In [1085]: arr[[1,2,3]]
Out[1085]: array([ 0.73703368,  0.        ,  0.        ])

But replacing one of those values with a tuple or a slice gives:

In [1086]: arr[[1,2,(2,3)]]
IndexError: too many indices for array

In [1088]: arr[[1,2,slice(None)]] 
IndexError: too many indices for array

In these cases the list is treated as a tuple: numpy tries to match each value with a dimension.

So at the top level a list and a tuple are treated the same if the list can't be interpreted as an advanced indexing list.

Notice also the difference with single-item lists:

In [1089]: arr[[1]]
Out[1089]: array([ 0.73703368])
In [1090]: arr[(1,)]
Out[1090]: 0.73703367969998546
In [1091]: arr[1]
Out[1091]: 0.73703367969998546

Some functions, such as np.apply_along_axis and np.apply_over_axes, generate an index as a list or array and then apply it. They build it as a list or array because it is mutable. Some then wrap it in a tuple before using it as an index; others didn't bother. That difference sort of bothered me, but these test cases indicate that the tuple wrapper is often optional.

In [1092]: idx=[1,2,slice(None)]
In [1093]: np.ones((2,3,4))[idx]
Out[1093]: array([ 1.,  1.,  1.,  1.])
In [1094]: np.ones((2,3,4))[tuple(idx)]
Out[1094]: array([ 1.,  1.,  1.,  1.])

Looks like the tuple wrapper is still needed if I build the index as an object array:

In [1096]: np.ones((2,3,4))[np.array(idx)]
...
IndexError: arrays used as indices must be of integer (or boolean) type
In [1097]: np.ones((2,3,4))[tuple(np.array(idx))]
Out[1097]: array([ 1.,  1.,  1.,  1.])
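
A minimal sketch of the build-then-wrap pattern, assuming an np.arange array of shape (2, 3, 4) so the selected values are easy to recognise. The index is assembled as a mutable list (or object array) and converted with tuple() only at the point of use.

```python
import numpy as np

arr = np.arange(24).reshape(2, 3, 4)

idx = [0, 0, slice(None)]   # built as a list because lists are mutable
idx[0] = 1                  # e.g. updated inside a loop
idx[1] = 2

row = arr[tuple(idx)]       # same as arr[1, 2, :]

# An object array cannot be used as an index directly...
obj_idx = np.array(idx, dtype=object)
try:
    arr[obj_idx]
    direct_error = None
except IndexError as exc:
    direct_error = exc

# ...but wrapping it in a tuple works.
row_from_obj = arr[tuple(obj_idx)]
```

Always applying tuple() avoids relying on the legacy list handling at all.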

===================

Comment from the function @Eric linked

    /*
     * Sequences < NPY_MAXDIMS with any slice objects
     * or newaxis, Ellipsis or other arrays or sequences
     * embedded, are considered equivalent to an indexing
     * tuple. (`a[[[1,2], [3,4]]] == a[[1,2], [3,4]]`)
     */

===================

This function wraps object arrays and lists in tuple for indexing:

def apply_along_axis(func1d, axis, arr, *args, **kwargs):
     ....
     ind = [0]*(nd-1)
     i = zeros(nd, 'O')
     ....
     res = func1d(arr[tuple(i.tolist())], *args, **kwargs)
     outarr[tuple(ind)] = res

update

Now this list indexing produces a FutureWarning:

In [113]: arr.shape
Out[113]: (2, 3, 4)
In [114]: arr[[1, 2, slice(None)]]
<ipython-input-114-f30c20184e42>:1: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.
  arr[[1, 2, slice(None)]]
Out[114]: array([20, 21, 22, 23])

Changing the list to a tuple produces the same thing, without the warning:

In [115]: arr[(1, 2, slice(None))]
Out[115]: array([20, 21, 22, 23])

This is the same thing as:

In [116]: arr[1, 2, :]
Out[116]: array([20, 21, 22, 23])

Indexing with commas creates a tuple, which is passed to the __getitem__ method.

The warning says that in the future numpy will try to turn the list into an array instead of a tuple:

In [117]: arr[np.array([1, 2, slice(None)])]
Traceback (most recent call last):
  Input In [117] in <module>
    arr[np.array([1, 2, slice(None)])]
IndexError: arrays used as indices must be of integer (or boolean) type

With the slice object this raises an error, so the arr[tuple([....])] interpretation is the only one that makes sense. But it's a legacy case, left over from the earlier Numeric package.

Fortunately it's unlikely that a novice programmer will try this. They may try arr[[1,2,:]], but that will give a syntax error. : is only allowed in indexing brackets, not in list brackets or tuple parentheses.
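
The syntax restriction can be shown without crashing a script by compiling the expression from a string. The sketch below also uses np.s_, numpy's index-expression helper, as one way to write an index containing : outside of an actual indexing operation; the array shape is an assumption for illustration.

```python
import numpy as np

# A bare colon inside list brackets is rejected by the Python parser.
try:
    compile("arr[[1, 2, :]]", "<example>", "eval")
    colon_in_list_ok = True
except SyntaxError:
    colon_in_list_ok = False

# np.s_ accepts normal subscript syntax and returns the index tuple.
idx = np.s_[1, 2, :]

arr = np.arange(24).reshape(2, 3, 4)
row = arr[idx]              # same as arr[1, 2, :]
```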

This current round of comments was triggered by a different case that produces the FutureWarning:

In [123]: arr[[[0, 1], [1, 0]]]
<ipython-input-123-4fa43c8569dd>:1: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.
  arr[[[0, 1], [1, 0]]]
Out[123]: 
array([[ 4,  5,  6,  7],
       [12, 13, 14, 15]])

Here the nested list is interpreted as a tuple of lists, which is the same as:

In [124]: arr[[0, 1], [1, 0]]
Out[124]: 
array([[ 4,  5,  6,  7],
       [12, 13, 14, 15]])
In [126]: arr[np.array([[0, 1], [1, 0]])].shape
Out[126]: (2, 2, 3, 4)

Same warning, but it isn't quite as obvious why the legacy code chose to take the tuple interpretation. I don't see it documented.
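
To avoid depending on which interpretation a given numpy version picks, the two readings named in the warning can be spelled out explicitly. A short sketch, assuming the same (2, 3, 4) np.arange array as in the transcript:

```python
import numpy as np

arr = np.arange(24).reshape(2, 3, 4)
seq = [[0, 1], [1, 0]]

# Tuple reading: one list per axis, i.e. arr[[0, 1], [1, 0]].
as_tuple = arr[tuple(seq)]

# Array reading: a single 2x2 integer array indexing the first axis.
as_array = arr[np.array(seq)]
```

The tuple reading pairs the lists element-wise across the first two axes and gives shape (2, 4); the array reading picks whole (3, 4) blocks and gives shape (2, 2, 3, 4).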

Solution 3:[3]

So this is my conclusion:

  1. [1,2] is a 1d list, so advanced indexing is triggered. a[[1,2]] therefore has the same result as a[[1,2],].
  2. [1, np.array(2)] is treated as a 2d list, even though np.array(2) is zero-dimensional. So a[[1, np.array(2)]] has the same result as a[tuple([1, np.array(2)])], and thus as a[1, 2], which gives 0.0.
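
Both points can be checked with the 4x4 identity matrix from the question. This sketch deliberately uses only the explicit tuple form for the second point, because the bare list form a[[1, np.array(2)]] is the deprecated case whose meaning the FutureWarning says will change.

```python
import numpy as np

a = np.eye(4)

# Point 1: a flat list of integers selects rows 1 and 2.
rows_list = a[[1, 2]]
rows_tuple = a[[1, 2],]     # the same list wrapped in a one-element tuple

# Point 2: as a tuple, each entry indexes its own axis, like a[1, 2].
element = a[tuple([1, np.array(2)])]
```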

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2
Solution 3