Incorrect result when packing unpacking a recarray with padding bytes #287

Open
fyrestone opened this issue Oct 16, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@fyrestone

Round-tripping a recarray through pack_tensor / unpack_tensor should return the same dtype, but the unpacked array comes back with an unexpected extra field.

import blosc2
import numpy as np

print(blosc2.__version__)
print(np.__version__)

dtype = {
    "names": ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l"],
    "formats": [
        "<u8",
        "<i8",
        "<i8",
        "<u8",
        "<i4",
        "<u4",
        "<u4",
        "<i2",
        "i1",
        "i1",
        "i1",
        "<u8",
    ],
    "offsets": [0, 8, 16, 24, 32, 36, 40, 44, 46, 47, 48, 56],
    "itemsize": 64,
    "aligned": True,
}
arr = np.recarray(100, dtype=dtype)
print(type(arr), arr.dtype)
arr2 = blosc2.unpack_tensor(blosc2.pack_tensor(arr))
print(type(arr2), arr2.dtype)

Output

3.0.0b4
1.26.4
<class 'numpy.recarray'> (numpy.record, {'names': ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l'], 'formats': ['<u8', '<i8', '<i8', '<u8', '<i4', '<u4', '<u4', '<i2', 'i1', 'i1', 'i1', '<u8'], 'offsets': [0, 8, 16, 24, 32, 36, 40, 44, 46, 47, 48, 56], 'itemsize': 64, 'aligned': True})
<class 'numpy.ndarray'> [('a', '<u8'), ('b', '<i8'), ('c', '<i8'), ('d', '<u8'), ('e', '<i4'), ('f', '<u4'), ('g', '<u4'), ('h', '<i2'), ('i', 'i1'), ('j', 'i1'), ('k', 'i1'), ('f11', 'V7'), ('l', '<u8')]

Note that an additional column, f11, was added. What is it?
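One way to probe where a name like f11 could come from, using NumPy only. This is a sketch under an assumption: the thread does not say how blosc2 serializes the dtype, so the round-trip through dtype.descr below is a hypothesis, not blosc2's confirmed behavior. It only shows that NumPy itself produces an auto-named field when a padded dtype is rebuilt from its descr list.

```python
import numpy as np

# Same padded layout as in the report above.
padded = np.dtype({
    "names": ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l"],
    "formats": ["<u8", "<i8", "<i8", "<u8", "<i4", "<u4", "<u4", "<i2",
                "i1", "i1", "i1", "<u8"],
    "offsets": [0, 8, 16, 24, 32, 36, 40, 44, 46, 47, 48, 56],
    "itemsize": 64,
    "aligned": True,
})

# descr describes padding holes as unnamed void entries.
descr = padded.descr
print(descr[11])  # the entry for the 7-byte hole; its name is ''

# ASSUMPTION: rebuilding from descr stands in for whatever the
# pack/unpack path does. NumPy gives an empty name the default
# name "f<index>", so the hole becomes a real field.
rebuilt = np.dtype(descr)
print(rebuilt.names[11])   # 'f11'
print(rebuilt == padded)   # False: the layouts match, the fields do not
```

NumPy's documentation warns that descr is meant for the array interface and that passing it back to np.dtype does not always reconstruct the original dtype, which is consistent with what is seen here.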

@FrancescAlted added the bug label on Oct 16, 2024
@FrancescAlted
Member

Yes, I can reproduce this. If you can find the root of the issue, shout!

@fyrestone
Author

I think this is a padding bug. If I remove the offsets, itemsize, and aligned params, the output is correct:

import blosc2
import numpy as np

print(blosc2.__version__)
print(np.__version__)

dtype = {
    "names": ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l"],
    "formats": [
        "<u8",
        "<i8",
        "<i8",
        "<u8",
        "<i4",
        "<u4",
        "<u4",
        "<i2",
        "i1",
        "i1",
        "i1",
        "<u8",
    ],
    # "offsets": [0, 8, 16, 24, 32, 36, 40, 44, 46, 47, 48, 56],
    # "itemsize": 64,
    # "aligned": True,
}
arr = np.recarray(100, dtype=dtype)
print(type(arr), arr.dtype)
arr2 = blosc2.unpack_tensor(blosc2.pack_tensor(arr))
print(type(arr2), arr2.dtype)
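A NumPy-only sketch of why dropping those three params changes the picture. Without explicit offsets, NumPy packs the fields back to back, so there is no hole to describe. The descr round-trip used here is an assumption for illustration, since the thread does not confirm how blosc2 stores the dtype.

```python
import numpy as np

# Same names and formats as above, but with no offsets, itemsize,
# or aligned keys, so NumPy packs the fields with no gaps.
packed = np.dtype({
    "names": ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l"],
    "formats": ["<u8", "<i8", "<i8", "<u8", "<i4", "<u4", "<u4", "<i2",
                "i1", "i1", "i1", "<u8"],
})

print(packed.itemsize)  # 57 bytes instead of the padded 64

# No unnamed padding entries appear in descr for a packed layout.
unnamed = [entry for entry in packed.descr if entry[0] == ""]
print(unnamed)  # []

# ASSUMPTION: a descr-based round-trip, used here only for illustration.
print(np.dtype(packed.descr) == packed)  # True
```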

@fyrestone
Author

The f11 field is the padding hole (shown as ? in the layout below). I don't know why it becomes a column.

0        8        16       24       32       40       48       56       64
|--------|--------|--------|--------|--------|--------|--------|--------|
|aaaaaaaa|bbbbbbbb|cccccccc|dddddddd|eeeeffff|gggghhij|k???????|llllllll|
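The layout above can be checked programmatically. The helper below is a hypothetical name (find_holes is not part of NumPy or blosc2). It walks the fields of a structured dtype in offset order and reports every byte range that no field covers.

```python
import numpy as np

def find_holes(dt):
    """Return (offset, length) for each byte range not covered by a field."""
    holes = []
    pos = 0  # first byte not yet covered by any field
    # dt.fields maps name -> (dtype, offset[, title]); visit by offset.
    for name in sorted(dt.names, key=lambda n: dt.fields[n][1]):
        ftype, offset = dt.fields[name][:2]
        if offset > pos:
            holes.append((pos, offset - pos))
        pos = max(pos, offset + ftype.itemsize)
    if dt.itemsize > pos:  # trailing padding, if any
        holes.append((pos, dt.itemsize - pos))
    return holes

padded = np.dtype({
    "names": ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l"],
    "formats": ["<u8", "<i8", "<i8", "<u8", "<i4", "<u4", "<u4", "<i2",
                "i1", "i1", "i1", "<u8"],
    "offsets": [0, 8, 16, 24, 32, 36, 40, 44, 46, 47, 48, 56],
    "itemsize": 64,
    "aligned": True,
})

print(find_holes(padded))  # [(49, 7)]
```

The single hole starts at byte 49, right after k, and is 7 bytes long, which matches both the ??????? run in the diagram and the V7 type of the f11 field.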

@fyrestone changed the title from "Incorrect result when packing unpacking a recarray" to "Incorrect result when packing unpacking a recarray with padding bytes" on Oct 17, 2024