Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement CSR hierarchy #19

Open
JohanMabille opened this issue Apr 8, 2020 · 7 comments
Open

Implement CSR hierarchy #19

JohanMabille opened this issue Apr 8, 2020 · 7 comments

Comments

@JohanMabille
Copy link
Member

The CSR hierarchy should implement sparse tensors (N-D) with CSR format. This should follow the same pattern as xmap_container classes:

  • xcsr_container is a CRTP base class that gathers common implementation and API of inheriting classes
  • xcsr_array inherits from xcsr_container and implements the case where the number of dimensions is dynamic
  • xcsr_tensor inherits from xcsr_container and implements the case where the number of dimensions is static
@tdegeus
Copy link
Member

tdegeus commented Apr 9, 2020

I'm ready to take this on. However, I need some explanation of the current code structure

@gouarin
Copy link
Member

gouarin commented Apr 9, 2020

@tdegeus as @JohanMabille said the xcsr_container implements all the common methods which means the size, dimension, ... and the reshape, resize and access operator.

xcsr_array and xcsr_tensor define the storage and how to get access. You have to define xtensor_inner_types to describe the types used for both of them.

So I think it's a copy-paste of xmap_container with the access operator end the `update_entries modified.

@JohanMabille adds the xsparse_reference to be able to return 0 if the index doesn't exist in the matrix. We can modify this value and if we assign this value to the same index, then we create an entry. In this way, you don't have to create a zero-entry each time you try to access an index.

@gouarin
Copy link
Member

gouarin commented Apr 10, 2020

@tdegeus I don't know if you had time to make progress in the implementation of CSR format. I wrote a data structure to be able to manage the COO and CSR (and CSF for ND) at the same time. You can see the code here

https://gist.github.com/gouarin/f5f3952ff494b326e77e1af01e3219c3

Finally, it will be easier to split the implementation for COO (where pos vector doesn't exist) and for CSR/CSF (when pos vector exists) and to implement find_element, inseert_element and remove_element for each of them.

It's just an idea. Feel free to comment or add your own implementation.

@tdegeus
Copy link
Member

tdegeus commented Apr 11, 2020

@gouarin Nice. Can you explain your implementation a bit? In particular, what are pos and coord?

@tdegeus
Copy link
Member

tdegeus commented Apr 11, 2020

A different point of discussion is that I'm a bit worried about just created an n-d array. I think there should maybe be a dedicated 2-d matrix as it would be easier to interface with solvers if the storage is classically in three contiguous arrays.

Also because many of the storage strategies are there to do efficient linear algebra, much more than saving on storage.

In more broader sense I am having difficulties understanding the current implementation, it would help if you could explain a bit, before I start 'reinventing the wheel'. E.g. what should be the relation between a specific implementation (e.g. CSR) and xmap_container?

@gouarin
Copy link
Member

gouarin commented Apr 11, 2020

The pos and coord arrays have exactly the same rule given in this article mentioned by @JohanMabille in this issue #21. And the examples given in the main function are those that you find in the article for the 2d and 3d cases.

If I resume, the COO format only needs the size of the coordinate arrays for each dimension and loop over this coord array to find if the entry exists. For the CSR or CSF format you need an extra array called pos which gives you for an element i what are the segment in coord corresponding to this element. It's generally called row_ptr for CSR and the array col is the coord array. But this data structure is easily recursive and can manage N-D tensors.

These two implementations should work with third-party software since you have the right information. For example, for the CSR: pos[1] = row_ptr and coord[1]=col.

The only relation between CSR and xmap_conatiner is the way to manage the shape and the strides to update the index entries when you make a reshape. But they don't have the same storage to store the index in each dimension and the non zero values.

Hope that I help you to better understand my implementation.

@JohanMabille
Copy link
Member Author

I think there should maybe be a dedicated 2-d matrix as it would be easier to interface with solvers if the storage is classically in three contiguous arrays.

If the solvers work with the storage itself rather than with APIs of the storage, then 2-D matrix should indeed be a specialization of the N-D case so it's easy to interface it with existing solvers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants