Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Full NumPy style indexing #119

Open
Ivorforce opened this issue Oct 8, 2024 · 2 comments
Open

Full NumPy style indexing #119

Ivorforce opened this issue Oct 8, 2024 · 2 comments
Labels
feature New feature or request

Comments

@Ivorforce
Copy link
Owner

NumPy only really has one kind of indexing (which I previously misunderstood), though I'm pretty sure they have some kind of optimization going on for simple indexes. This is how it works:

First, the index (single object) is interpreted as a tuple:

  • If the index is a tuple, it is used as a list of index elements (also the case for returns of nonzero)
  • Otherwise, it is used as a single index element (even if it is a list, only tuples count as multiple indexers)

Then, the indexers can be

def process_index_element(array, index_element, from_dimension):
    pass

forward_dimension = 0

indexer_iter = index_elements.iter
for indexer in indexer_iter:
    if index_elements[forward_dimension] == ellipsis:
        backwards_dimension = array.ndim
        for indexer in indexer_iter:
            if index_elements[forward_dimension] == ellipsis:
                 raise
            array = process_index_element(array, index_element, backwards_dimension)

        return array

    array = process_index_element(array, index_element, forward_dimension)

return array

Elements act as such:

  • : leaves the dimension untouched
  • None inserts a new size 1 dimension
  • ranges change the size of the current dimension
    • (deep) lists of ranges produce as many new dimensions as the array is deep (i.e. inserted shape is index_list.shape)
  • single integers consume a dimension (indexing into it at from_dimension)
    • (deep) lists of integers consume a dimension, but produce as many new dimensions as the array is deep. (i.e. inserted shape is index_list.shape)
  • single booleans add a dimension, size 1 if true and size 0 if false
    • (deep) lists of booleans add a dimension, but consume as many new dimensions as the array is deep. Each dimension in the index element must be the same shape as the dimension at from_dimension + dimension

As seen above, an ellipsis reverses the traversal process (and if dimensions are consumed, the current position automatically moves leftwards to match).

All in all, since indexes are rarely more than 2 or 3, this iterative process of single-index is probably (!) fast enough for most use-cases. An optimization could be applied by first checking if one of the xtensor accelerated indexers can be created from the array:

  • empty: no-slice (i.e. use original array)
  • non-array only: produce and use xstrided_slice_vector
  • single boolean array: use as xt::masked_view
  • only depth 1 integer arrays: use as xt::index_view
@Ivorforce Ivorforce added the feature New feature or request label Oct 8, 2024
@Ivorforce
Copy link
Owner Author

The main problem will be updates: While non-array slices produce views, any array slices produce copies. For the first version, we can simply disallow non-accelerated list accesses (since they're rare), but we'll have to figure out that problem at some point.

@Ivorforce
Copy link
Owner Author

Also, we have a problem because unlike NumPy, we cannot use tuples to identify nonzero returns. If we want these to work as naturally as in NumPy, we may have to do some black magic somewhere :(

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New feature or request
Projects
None yet
Development

No branches or pull requests

1 participant