Full NumPy style indexing #119

Ivorforce · 2024-10-08T09:57:25Z

NumPy only really has one kind of indexing (which I previously misunderstood), though I'm pretty sure they have some kind of optimization going on for simple indexes. This is how it works:

First, the index (single object) is interpreted as a tuple:

If the index is a tuple, it is used as a list of index elements (also the case for returns of nonzero)
Otherwise, it is used as a single index element (even if it is a list, only tuples count as multiple indexers)
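To make the tuple rule concrete, here is a small NumPy check (the array `a` and the variable names are just for illustration):

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

# A tuple is unpacked into one index element per dimension.
as_tuple = a[(1, 2)]       # same as a[1, 2]

# A list is a single index element: an integer array applied to dimension 0.
as_list = a[[1, 2]]        # selects rows 1 and 2

# A non-tuple index behaves like a tuple with one element.
as_single = a[1]
as_wrapped = a[(1,)]

print(as_tuple)            # a single element
print(as_list.shape)       # (2, 4)
```

So the same two numbers mean "row 1, column 2" as a tuple and "rows 1 and 2" as a list.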

Then, the index elements are processed one after another:

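def process_index_element(array, index_element, from_dimension):
    # Applies one index element at from_dimension (rules listed below).
    # Returns the resulting array and the dimension the next element applies to.
    pass

def apply_index(array, index_elements):
    forward_dimension = 0

    indexer_iter = iter(index_elements)
    for index_element in indexer_iter:
        if index_element is Ellipsis:
            # Everything after the ellipsis is applied from the back.
            backwards_dimension = array.ndim
            for index_element in reversed(list(indexer_iter)):
                if index_element is Ellipsis:
                    raise IndexError("an index can only have a single ellipsis")
                array, backwards_dimension = process_index_element(array, index_element, backwards_dimension)

            return array

        array, forward_dimension = process_index_element(array, index_element, forward_dimension)

    return array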

Elements act as such:

: leaves the dimension untouched
None inserts a new size 1 dimension
ranges change the size of the current dimension
- (deep) lists of ranges produce as many new dimensions as the index list is deep (i.e. inserted shape is index_list.shape)
single integers consume a dimension (indexing into it at from_dimension)
- (deep) lists of integers consume a dimension, but produce as many new dimensions as the index list is deep (i.e. inserted shape is index_list.shape)
single booleans add a dimension, size 1 if true and size 0 if false
- (deep) lists of booleans add a dimension, but consume as many dimensions as the index list is deep. Each dimension of the index list must have the same size as the array dimension at from_dimension + dimension
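The rules above can be checked against NumPy by looking at result shapes. This is only a quick sketch on a made-up `(3, 4, 5)` array; the dictionary keys are my own labels. Lists of ranges are left out, since as far as I know NumPy itself does not accept them.

```python
import numpy as np

a = np.zeros((3, 4, 5))

deep_ints = np.array([[0, 1], [2, 0]])       # index shape (2, 2)
mask = np.zeros((3, 4), dtype=bool)          # covers dimensions 0 and 1
mask[0, 1] = mask[2, 3] = True               # two selected positions

shapes = {
    ":": a[:].shape,                   # dimension untouched
    "None": a[None].shape,             # new size 1 dimension
    "range": a[1:3].shape,             # dimension resized
    "integer": a[1].shape,             # dimension consumed
    "deep integers": a[deep_ints].shape,   # 1 consumed, index shape inserted
    "True": a[True].shape,             # new size 1 dimension
    "False": a[False].shape,           # new size 0 dimension
    "deep booleans": a[mask].shape,    # 2 consumed, 1 added
}

for name, shape in shapes.items():
    print(name, shape)
```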

As seen above, an ellipsis reverses the traversal process (and if dimensions are consumed, the current position automatically moves leftwards to match).
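For reference, this is how the ellipsis behaves in NumPy itself (example array and shapes chosen arbitrarily):

```python
import numpy as np

a = np.zeros((2, 3, 4, 5))

# Elements before the ellipsis bind to the leading dimensions,
# elements after it bind to the trailing dimensions.
front_and_back = a[0, ..., 1].shape       # consumes dimensions 0 and 3
back_only = a[..., 1:3, 0].shape          # resizes dimension 2, consumes dimension 3
new_axis_at_end = a[..., None].shape      # inserts a size 1 dimension at the end

# A second ellipsis is rejected.
try:
    a[..., 0, ...]
    second_ellipsis_failed = False
except IndexError:
    second_ellipsis_failed = True

print(front_and_back, back_only, new_axis_at_end, second_ellipsis_failed)
```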

All in all, since indexes rarely have more than 2 or 3 elements, this iterative process of applying one index element at a time is probably (!) fast enough for most use-cases. An optimization could be applied by first checking if one of the xtensor accelerated indexers can be created from the index:

empty: no-slice (i.e. use original array)
non-array only: produce and use xstrided_slice_vector
single boolean array: use as xt::masked_view
only depth 1 integer arrays: use as xt::index_view
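A rough sketch of what that pre-check could look like, written in Python for brevity. `classify_index` and its return labels are hypothetical names, not existing API, and I'm assuming "only depth 1 integer arrays" means every index element is one. Single booleans are sent to the general path here, which is also an assumption.

```python
import numpy as np

def classify_index(index):
    """Pick a fast path for an index, or "general" if none applies (hypothetical helper)."""
    elements = index if isinstance(index, tuple) else (index,)
    if len(elements) == 0:
        return "no-slice"             # use the original array

    # Assumption: single booleans are not covered by any fast path.
    if any(isinstance(e, (bool, np.bool_)) for e in elements):
        return "general"

    arrays = [np.asarray(e) for e in elements if isinstance(e, (list, np.ndarray))]
    if not arrays:
        return "strided"              # integers, ranges, None, ellipsis only
    if len(arrays) != len(elements):
        return "general"              # arrays mixed with non-array elements
    if len(arrays) == 1 and arrays[0].dtype == bool:
        return "mask"                 # single boolean array
    if all(a.ndim == 1 and np.issubdtype(a.dtype, np.integer) for a in arrays):
        return "index"                # only depth 1 integer arrays
    return "general"                  # fall back to the iterative process

print(classify_index((slice(None), 1, None)))   # strided
print(classify_index(([0, 1], [1, 2])))         # index
```

Anything that lands in "general" would go through the element-by-element traversal described earlier.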


Ivorforce · 2024-10-08T09:59:52Z

The main problem will be updates: While non-array slices produce views, any array slices produce copies. For the first version, we can simply disallow non-accelerated list accesses (since they're rare), but we'll have to figure out that problem at some point.
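The view/copy split is easy to see in NumPy (small example, names are illustrative):

```python
import numpy as np

a = np.arange(6)

view = a[1:4]             # range index: a view onto a's memory
copy = a[[1, 2, 3]]       # array index: a fresh copy

view[0] = 100             # writes through to a
copy[1] = -1              # does not touch a

print(np.shares_memory(a, view))   # True
print(np.shares_memory(a, copy))   # False

# Direct assignment through an array index still updates a in NumPy,
# because it is handled as an assignment rather than as "copy, then write".
a[[4, 5]] = 0
print(a)
```

That last case is the part we would have to solve separately: reading through an array index gives a copy, so updates need their own code path.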

Ivorforce · 2024-10-08T10:01:08Z

Also, we have a problem because unlike NumPy, we cannot use tuples to identify nonzero returns. If we want these to work as naturally as in NumPy, we may have to do some black magic somewhere :(
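To illustrate why the tuple type matters here, this is the NumPy behaviour on a small made-up array:

```python
import numpy as np

a = np.array([[0, 7, 0],
              [5, 0, 9],
              [0, 0, 0]])

hits = np.nonzero(a)          # tuple of two integer arrays: (rows, columns)
selected = a[hits]            # one element per nonzero position

# Without the tuple type, the same data reads as a single deep integer index
# applied to dimension 0, which selects whole rows instead.
as_array = np.array(hits)     # shape (2, 3)
rows = a[as_array]

print(selected)               # [7 5 9]
print(rows.shape)             # (2, 3, 3)
```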

Ivorforce added the feature New feature or request label Oct 8, 2024

Ivorforce mentioned this issue Nov 7, 2024

Drops and Keeps for slices #16

Closed
