Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix a space leak in list traversal #27

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

treeowl
Copy link

@treeowl treeowl commented Dec 26, 2022

Previously, we used zip to define itraverse for lists. This led to two problems:

  1. Because the zip fused with the index generator, it could not fuse with the argument.

  2. I ran into situations where the zip didn't fuse with the index generator, so my code ended up actually building and saving [0..] as a CAF. That's a nasty space leak, as well as slow.

Writing itraverse for lists using foldr directly seems to clear up these issues. Unboxing the counter manually should prevent Int boxes from being allocated when the passed function doesn't need them.

itraverseListStarting :: forall f a b. Applicative f => Int -> (Int -> a -> f b) -> [a] -> f [b]
itraverseListStarting (I# i0) f = \xs -> foldr go stop xs i0
where
go x r !i = liftA2 (:) (f (I# i) x) (r (i +# 1#))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be possible to do without relying on more unsafe (Safe Haskell wise) modules. I won't import GHC.Exts.

Copy link
Collaborator

@phadej phadej Dec 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And apparently this is one additional instance where -ffull-lazyness is a pessimisation.

I won't accept this kind of patch without a reference to a GHC issue, which IMO this is. The [0..] in itraverse implementation shouldn't be memoized and GHC should have a means to express that, or it should not use -ffull-lazyness by default.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@phadej Please propose an alternative to GHC.Exts that I can use.

Copy link
Collaborator

@phadej phadej Dec 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we unbox the counter by hand? GHC /can/ do that itself, but for some
reason it only happens with @-O2@, and we use the standard @-O1@.

Report a GHC issue. We should know why we write code the way we do. It might be a GHC bug, or it might be an inherent limitation (i.e. GHC cannot do better). We should know the reason.

Copy link
Collaborator

@phadej phadej Dec 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, we used `zip` to define `itraverse` for lists. This led to
two problems:

1. Because the zip fused with the index generator, it could *not* fuse
   with the argument.

2. I ran into situations where the zip *didn't* fuse with the index
   generator, so my code ended up actually building *and saving* `[0..]`
   as a CAF. That's a nasty space leak, as well as slow.

Writing `itraverse` for lists using `foldr` directly seems to clear up
these issues. Unboxing the counter manually should prevent `Int` boxes
from being allocated when the passed function doesn't need them.
Copy link
Collaborator

@phadej phadej left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To summarize:

  • I'm not a fan of needing to unbox Int manually. I'd like to understand what makes -O2 do it for us.

  • FunctorWithIndex and FoldableWithIndex instances are not using zip [0..] pattern, and it makes sense to be uniform. The imap and ifoldr are not written using foldr, so either:

    • They should be rewritten with foldr as well
    • Or itraverse can be written using manual recursion too (and probably GHC will be smart enough to unbox Int then). This is "worse" because list fusion won't fire, but I actually don't care, as list fusion is unreliably anyway. (And e.g. RULES dance around GHC.Base.map is too complicated).

    So I lean towards the latter option: writing itraverse for lists with explicit recursion.

    The itraverseListStarting helper is good, as it's useful in NonEmpty instance as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants