Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Doesn't work on 32-bit and big-endian platforms #31

Open
df7cb opened this issue Mar 19, 2024 · 1 comment
Open

Doesn't work on 32-bit and big-endian platforms #31

df7cb opened this issue Mar 19, 2024 · 1 comment

Comments

@df7cb
Copy link
Contributor

df7cb commented Mar 19, 2024

Hi, I'm working on packaging pg_roaringbitmap for apt.postgresql.org and Debian. Naturally, that includes building it for a variety of different machine architectures.

pg_roaringbitmap works on little-endian 64-bit platforms only. If that's intended, that's fine, but maybe we can do better and support the rest as well.

32-bit x86

On 32-bit x86, the problem is actually quite small:

**** regression.diffs ****
--- /<<PKGBUILDDIR>>/expected/roaringbitmap.out	2022-06-28 18:19:36.000000000 +0000
+++ /<<PKGBUILDDIR>>/results/roaringbitmap.out	2024-03-19 14:05:37.720980272 +0000
@@ -815,13 +815,13 @@
 select rb_is_empty('{1}');
  rb_is_empty 
 -------------
- f
+ t
 (1 row)
 
 select rb_is_empty('{1,10,100}');
  rb_is_empty 
 -------------
- f
+ t
 (1 row)
 
 select rb_cardinality(NULL);

The other regression tests pass. (I tried poking around in the source, but only figured that return rb->size == 0; in roaring_buffer_is_empty yields the wrong result, not where the problem is.)

On 32-bit, there, is a warning that is easy to fix:

gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-format-truncation -Wno-stringop-truncation -g -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -g -O2 -Werror=implicit-function-declaration -ffile-prefix-map=/<<PKGBUILDDIR>>=. -fstack-protector-strong -Wformat -Werror=format-security -fPIC -std=c99 -Wno-error=maybe-uninitialized -Wno-declaration-after-statement -I. -I./ -I/usr/include/postgresql/10/server -I/usr/include/postgresql/internal  -Wdate-time -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -I/usr/include/libxml2  -I/usr/include/mit-krb5  -c -o roaringbitmap.o roaringbitmap.c
In file included from roaringbitmap.c:1:
roaringbitmap.h: In function ‘pg_aligned_malloc’:
roaringbitmap.h:57:20: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
   57 |     p = (void *)((((uint64)porg + alignment) / alignment) * alignment);
      |                    ^
roaringbitmap.h:57:9: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
   57 |     p = (void *)((((uint64)porg + alignment) / alignment) * alignment);
      |         ^
roaringbitmap.h:58:47: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
   58 |     *((unsigned char *)p-1) = (unsigned char)((uint64)p - (uint64)porg);
      |                                               ^
roaringbitmap.h:58:59: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
   58 |     *((unsigned char *)p-1) = (unsigned char)((uint64)p - (uint64)porg);
      |                                                           ^
roaringbitmap.h: In function ‘pg_aligned_free’:
roaringbitmap.h:66:21: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
   66 |     porg = (void *)((uint64)memblock - *((unsigned char *)memblock-1));
      |                     ^
roaringbitmap.h:66:12: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
   66 |     porg = (void *)((uint64)memblock - *((unsigned char *)memblock-1));
      |            ^
roaringbitmap.h:68:25: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
   68 |         porg = (void *)((uint64)porg - 256);
      |                         ^
roaringbitmap.h:68:16: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
   68 |         porg = (void *)((uint64)porg - 256);
      |                ^

This can be fixed by using size_t instead of uint64. (I'll open a PR for that in a minute.)

big-endian s390x

On big-endian architectures, the problems are much worse:

**** regression.diffs ****
diff -U3 /<<PKGBUILDDIR>>/expected/roaringbitmap.out /<<PKGBUILDDIR>>/results/roaringbitmap.out
--- /<<PKGBUILDDIR>>/expected/roaringbitmap.out	2022-06-28 18:19:36.000000000 +0000
+++ /<<PKGBUILDDIR>>/results/roaringbitmap.out	2024-03-19 14:05:10.319049606 +0000
@@ -57,13 +57,13 @@
 select  '{}'::roaringbitmap;
    roaringbitmap    
 --------------------
- \x3a30000000000000
+ \x0000303a00000000
 (1 row)
 
 select  '{ -1 ,  2  , 555555 ,  -4  }'::roaringbitmap;
                                    roaringbitmap                                    
 ------------------------------------------------------------------------------------
- \x3a300000030000000000000008000000ffff01002000000022000000240000000200237afcffffff
+ \x0000303a000000030000000000080000ffff000100000020000000220000002400027a23fffcffff
 (1 row)
 
 set roaringbitmap.output_format='array';
@@ -74,17 +74,13 @@
 (1 row)
 
 select '\x3a30000000000000'::roaringbitmap;
- roaringbitmap 
----------------
- {}
-(1 row)
-
+ERROR:  bitmap format is error
+LINE 1: select '\x3a30000000000000'::roaringbitmap;
+               ^
 select '\x3a300000030000000000000008000000ffff01002000000022000000240000000200237afcffffff'::roaringbitmap;
-  roaringbitmap   
-------------------
- {2,555555,-4,-1}
-(1 row)
-
+ERROR:  bitmap format is error
+LINE 1: select '\x3a300000030000000000000008000000ffff01002000000022...
+               ^
 -- Exception
 select  ''::roaringbitmap;
 ERROR:  malformed bitmap literal
@@ -142,19 +138,19 @@
 select '{}'::roaringbitmap::bytea;
        bytea        
 --------------------
- \x3a30000000000000
+ \x0000303a00000000
 (1 row)
 
 select '{1}'::roaringbitmap::bytea;
                  bytea                  
 ----------------------------------------
- \x3a3000000100000000000000100000000100
+ \x0000303a0000000100000000000000100001
 (1 row)
 
 select '{1,9999}'::roaringbitmap::bytea;
                    bytea                    
 --------------------------------------------
- \x3a30000001000000000001001000000001000f27
+ \x0000303a0000000100000001000000100001270f
 (1 row)
 
 select '{}'::roaringbitmap::bytea::roaringbitmap;
@@ -1582,1092 +1578,7 @@
 (1 row)
 
 select rb_fill('{}',0,2);
- rb_fill 
----------
- {0,1}
-(1 row)

... and then the server crashes.

(Again, one possible fix here is to just document that big-endian would not be supported.)

Thanks for considering.

@lemire
Copy link

lemire commented Mar 19, 2024

CRoaring supports 32-bit and even big endian systems (see https://github.com/RoaringBitmap/CRoaring/blob/master/.github/workflows/s390x.yml).

See our documentation:

Screenshot 2024-03-19 at 11 32 59 AM

If you need big-endian support, we invite you to contribute to the following issue regarding serialization: RoaringBitmap/CRoaring#423

It is a simple matter of flipping the bytes when reading and writing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants