Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OFD locking broken on 32-bit Linux? #15

Open
bgamari opened this issue Mar 22, 2020 · 12 comments · May be fixed by #32
Open

OFD locking broken on 32-bit Linux? #15

bgamari opened this issue Mar 22, 2020 · 12 comments · May be fixed by #32

Comments

@bgamari
Copy link
Contributor

bgamari commented Mar 22, 2020

I suspect that OFD locking is broken on 32-bit Linux in both lukko and base.

Specifically, we end up calling fcntl64 yet use the 32-bit struct flock. glibc suffered a similar bug some years ago. The following patch works around the issue although I'm still investigating the portability implications:

diff --git a/src-ofd/Lukko/OFD.hsc b/src-ofd/Lukko/OFD.hsc
index 8d04dae..024cb28 100644
--- a/src-ofd/Lukko/OFD.hsc
+++ b/src-ofd/Lukko/OFD.hsc
@@ -127,21 +127,21 @@ data FLock  = FLock { l_type   :: CShort
                     }
 
 instance Storable FLock where
-    sizeOf _ = #{size struct flock}
+    sizeOf _ = #{size struct flock64}
     alignment _ = #{alignmentcompat struct flock}
     poke ptr x = do
         fillBytes ptr 0 (sizeOf x)
-        #{poke struct flock, l_type}   ptr (l_type x)
-        #{poke struct flock, l_whence} ptr (l_whence x)
-        #{poke struct flock, l_start}  ptr (l_start x)
-        #{poke struct flock, l_len}    ptr (l_len x)
-        #{poke struct flock, l_pid}    ptr (l_pid x)
+        #{poke struct flock64, l_type}   ptr (l_type x)
+        #{poke struct flock64, l_whence} ptr (l_whence x)
+        #{poke struct flock64, l_start}  ptr (l_start x)
+        #{poke struct flock64, l_len}    ptr (l_len x)
+        #{poke struct flock64, l_pid}    ptr (l_pid x)
     peek ptr = do
-        x1 <- #{peek struct flock, l_type}   ptr
-        x2 <- #{peek struct flock, l_whence} ptr
-        x3 <- #{peek struct flock, l_start}  ptr
-        x4 <- #{peek struct flock, l_len}    ptr
-        x5 <- #{peek struct flock, l_pid}    ptr
+        x1 <- #{peek struct flock64, l_type}   ptr
+        x2 <- #{peek struct flock64, l_whence} ptr
+        x3 <- #{peek struct flock64, l_start}  ptr
+        x4 <- #{peek struct flock64, l_len}    ptr
+        x5 <- #{peek struct flock64, l_pid}    ptr
         return (FLock x1 x2 x3 x4 x5)
 
 lockImpl :: Maybe Handle -> FD -> String -> LockMode -> Bool -> IO Bool
@phadej
Copy link
Collaborator

phadej commented Mar 22, 2020

Is there a way to test this, e.g. some Docker image? Or is it unreliable, if you don't really run 32bit kernel?

@phadej
Copy link
Collaborator

phadej commented Mar 22, 2020

FYI, GHC also has the same in libraries/base/GHC/IO/Handle/Lock/LinuxOFD.hsc

instance Storable FLock where
    sizeOf _ = #{size struct flock}
    alignment _ = #{alignment struct flock}
    poke ptr x = do
        fillBytes ptr 0 (sizeOf x)
        #{poke struct flock, l_type}   ptr (l_type x)
        #{poke struct flock, l_whence} ptr (l_whence x)
        #{poke struct flock, l_start}  ptr (l_start x)
        #{poke struct flock, l_len}    ptr (l_len x)
        #{poke struct flock, l_pid}    ptr (l_pid x)

EDIT: I'm tempted to wait until GHC has the patch (or shows it is not required). I think it still has way to run by not using cabal-install which uses lukko. ATM there aren't any released cabal-install which uses it. (Except versions in HVR PPA).

@bgamari
Copy link
Contributor Author

bgamari commented Mar 22, 2020

Actually, there are no portability implications by definition; OFD locking is Linux specific and fcntl64 has been long supported.

@bgamari
Copy link
Contributor Author

bgamari commented Mar 22, 2020

FYI, GHC also has the same in libraries/base/GHC/IO/Handle/Lock/LinuxOFD.hsc

Yes, I know. Hence the mention of base in the ticket description.

Is there a way to test this, e.g. some Docker image? Or is it unreliable, if you don't really run 32bit kernel?

Yes, see the GHC ticket.

@phadej
Copy link
Collaborator

phadej commented Mar 22, 2020

What that image is based on. It's manifest says file:594fa35cf803361e69d817fc867b6a4069c064ffd20ed50caf42ad9bb11ca999 in /

I assume it's some Ubuntu variant, but I'm not running it on my machine without knowing for sure.

@phadej
Copy link
Collaborator

phadej commented Mar 22, 2020

The glibc bug seems to be fixed in 2.28 but on Ubuntu-18.04 it is

% dpkg -l |grep libc-bin
ii  libc-bin                                   2.27-3ubuntu1 ...

Can someone try on Ubuntu Eoan, it uses 2.30 https://packages.ubuntu.com/eoan/libc-bin, if it's so, then we can just use the glibc patch indeed.


To clarify: the fix for older glibc's shouldn't break newer ones.

Look at this diff: https://sourceware.org/git/?p=glibc.git;a=blobdiff;f=sysdeps/unix/sysv/linux/fcntl.c;h=cbab7b401a3875babb545681ba9128d850883f67;hp=e3992dc9d4d1e20ef8aa97e81ec1bd13ac170900;hb=06ab719d30b01da401150068054d3b8ea93dd12f;hpb=124e025864bb39732c71fc60c1443d5680881a0a

If we change Storage FLock as proposed, AFAICS we will break stuff too.

@phadej
Copy link
Collaborator

phadej commented Mar 22, 2020

As the workaround for now: compile cabal with lukko -ofd-locking on 32-bit linuxes with old glibc.

@bgamari
Copy link
Contributor Author

bgamari commented Mar 22, 2020

My thought here is that we should probably just use fcntl64 and struct flock64 explicitly.

@newhoggy
Copy link

newhoggy commented Mar 24, 2020

If you're interested, I have i386 build support for CircleCI projects in our haskell-build CircleCI orb.

leahneukirchen added a commit to leahneukirchen/void-packages that referenced this issue Jul 28, 2021
leahneukirchen added a commit to leahneukirchen/void-packages that referenced this issue Jul 28, 2021
leahneukirchen added a commit to leahneukirchen/void-packages that referenced this issue Jul 28, 2021
leahneukirchen added a commit to leahneukirchen/void-packages that referenced this issue Jul 28, 2021
leahneukirchen added a commit to void-linux/void-packages that referenced this issue Jul 28, 2021
@leftaroundabout
Copy link

leftaroundabout commented Nov 5, 2021

Problem persists on Debian 11, libc-bin version 2.13-13+deb11u2 i386.

Still needs -ofd-locking to avoid the fdLock invalid argument issue.

If no proper fix to this is on the horizon, how about default-setting -ofd-locking on i386? What are the drawbacks?

alyssais added a commit to alyssais/lukko that referenced this issue Dec 29, 2022
Previously, OFD locking was broken on 32-bit, because it would somehow
end up using the 64-bit fcntl() function, but pass it 32-bit off_t
values in struct flock.  To fix this, tell libc to use always use
64-bit file offsets.

This is the same fix that was applied for a similar issue in GHC[1].

[1]: https://gitlab.haskell.org/ghc/ghc/-/commit/9853fc5e3556e733b56976b0a2fce9e82130a9ef

Fixes: haskellari#15
@alyssais alyssais linked a pull request Dec 29, 2022 that will close this issue
@phadej
Copy link
Collaborator

phadej commented Jun 5, 2024

@leftaroundabout how did you test it. The #38 doesn't fail.

I cannot reproduce the failure myself, so I cannot verify the fix either.

@leftaroundabout
Copy link

@phadej I don't even remember what this issue was about or why it mattered to me, sorry.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants