-
-
Notifications
You must be signed in to change notification settings - Fork 185
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
store extra information in iterator needed to find base of allocation #190
base: master
Are you sure you want to change the base?
Conversation
When the user intends to use an explicit, relocating garbage collector in conjunction with the champ type, it becomes necessary to be able to adjust the pointers inside a champ_iterator to point to the new address of the relocated collection node that is currently on the iterator's path. In order to do this, it is necessary to be able to compute the base pointer of the allocation for the node containing the buffer of node pointers into which that particular path entry is pointing. However, this is not currently deducible from the information stored inside the iterator itself. Here, at the cost of an additional max_depth<B> bytes (which amounts to 16 bytes by default including padding) per iterator created, we maintain that information in the form of an integer offset for each nonzero entry in the path. The garbage collector can then take each node** in the path and subtract the corresponding offset from it using pointer arithmetic in order to obtain the address corresponding to the start of the children buffer, from which the base of the allocation is a known constant offset. In order to save space, we assume that each node can have at most 256 children. This seems to me to be a safe assumption, because the B value that is a template parameter appears to trigger a static assertion if it is ever set greater than 6. By the same logic, we also allocate an additional 4 bytes in order to perform the same set of steps with the pointers to the values. By default, on 64-bit machines, these four bytes live inside what was previously padding between `depth_` and `path_`, thus they do not actually affect the overall size of the iterator. This may not be true on 32-bit machines, however.
From what I can tell, this doesn't actually affect the coverage of the code at all; the only reason it shows the coverage going down is that there are some basic blocks that weren't covered before that have gotten a little bigger now. Are there any specific tests you'd like to see me write here? |
Codecov Report
@@ Coverage Diff @@
## master #190 +/- ##
==========================================
+ Coverage 91.27% 91.30% +0.02%
==========================================
Files 104 105 +1
Lines 9574 9614 +40
==========================================
+ Hits 8739 8778 +39
- Misses 835 836 +1
Continue to review full report at Codecov.
|
Hi @dwightguth! This is actually very interesting work. I find it quite exciting to see the library used for the runtime of such a language, and it was one of the main use-cases highlighted in the original paper. One thing of concern about this PR though is that it adds some overhead to iterators that is not needed in most common scenarios. Ideally, I would like to provide an API between the data-structure and the But maybe there is a smarter way, were the data-structures interacts directly with the copying collector somehow via a custom API exposed through the memory policy. For that I would need to sit down and think about it further, and also maybe learn more about the implementation of your copying GC. But maybe you also have ideas here... |
Well, if your main concern is eliminating overhead when the user doesn't care about this additional information, that's going to be difficult to completely eliminate unless everything to do with deciding which behavior to use happens at compile time, like you said. Originally I considered putting the changes to the data structure layout under a preprocessor definition in config.hpp, but decided after I got everything implemented that a couple extra memory reads and writes per iterator access, plus 16 additional bytes in a previously 136-byte type weren't, in the end, that significant. If you feel it's justified, I could just put that code back in place. There is some precedent for using preprocessor defines for this already, for example with It might be tricky to get it to a point where the code doesn't have any duplication in this approach, though. I should think you might still have some overhead in the form of a function call to a function that does nothing, but at least that call could be removed by the optimizer if you're building in release mode. I'm willing to give either a try, so just let me know which approach you favor and I'll try and get started on it on my next workday. |
@arximboldi did you have any preference for or against either of the solutions I proposed here? |
Hi @dwightguth! My apologies for the very delayed response. I think my preferred approach is to solve it via templates. This may be a bit more work on top of your current solution, but I think it makes things a bit more clearer. I would try as much as possible to avoid code duplication: please do not just specialize the whole
I hope this helps and best of luck with the integration! |
This commit refactors the previous implementation of this PR so that both the memory and compute overhead of tracking relocation information in the iterator is only performed if a specific memory_policy is enabled. We create a new relocation_policy policy and track two different implementations: no_relocation_policy, which is the default in all cases, and gc_relocation_policy, which should be enabled if the user is using the library with a relocating garbage collector. Because so much information about how to track relocation information is type-specific, the policy acts merely as a marker, and each type that needs to track relocation information is responsible for defining a template that does nothing and has no members if relocation is disabled, and otherwise stores the necessary information and maintains it if that information is required. We then instrument the class that needs to track data by inserting calls to methods on the relocation information template, which will be responsible for updating the information as necessary.
Looks like your CI is no longer working. We had the same problem with nix a couple days back. It seems to be an incompatibility between the latest version of the nix GitHub Actions and the 2.4 release of Nix that happened a couple days ago. Here is the temporary workaround for the issue that we went with in our repositories: runtimeverification/llvm-backend@6105287 I am working on the fix you proposed; still testing it. But I wanted to give you this information since it seems to be the reason all the CI checks are failing. I did run |
I've tested this code now with both |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! This is really going in the right direction. Thanks for all the effort into integrating your changes upstream.
Besides what I have commented inline, however, the only thing left bothering me is that I do not really understand how your code is actually using this. There is no client in the codebase for all this new code at all, which makes it quite esoteric and also prone to be broken again in the future. The code is not even part of the public interface of the iterator. Also it leaves me wondering if there is a simpler way to do some of this maybe.
So! My suggestion would be:
- Write some unit tests so there is actually some client in the code-base that uses this, there is a better understanding of what this does, and lower chances of` breaking this in the future.
- Can you point me to links in your code-base that uses this?
I have btw also pushed the fix you suggested for the Nix issues, thank you very much for the workaround! |
This is the code that makes use of this functionality: https://github.com/kframework/llvm-backend/blob/immer_update/runtime/collect/migrate_collection.cpp Essentially, it needs to be able to take each object on the heap containing a collection or iterator, and fix up all the pointers stored in those objects to point to the relocated versions of those objects. There are a couple places where we end up relying on some hacky behavior like const_cast or casting a class to a struct with the same layout, but it functions, which is my primary goal. I intend to rely on friend classes for the unit tests. |
After giving careful thought, I decided that the relocation information of a class probably ought to be public since the garbage collector is expected to need to be able to read this information.
This test is slightly fragile in that it probably will not work if the depth of the tree is uneven across different nodes. However, in order to make it completely robust irrespective of the platform-specific details of the hash function and the number of elements in the set being tested, it would be necessary to re-implement a lot of the functionality for walking the tree in the test, which seems counterproductive to me as we would end up just testing the business logic against itself rather than against a flat set of expectations. So intead, I ended up just implementing a partial check for the relocation information which tests the major behavior of the step_right and step_down functions in the iterator, rather than testing the entire set exhaustively. This allows the expected results of the test to be specified declaratively, which makes the test more appropriate in terms of what functionality it is actually testing.
Looks like this is still failing CI due to the SSH key issue, but it is now ready for review again. |
Sorry, I forgot to merge #193 which fixes the ssh key issue. |
@arximboldi This is still failing due to the missing SSH key even though #193 is now merged in. Are you sure you correctly solved the issue? |
Hmmm, ok, I guess my fix was wrong :) |
Also: I did take a look at your latest changes and at the code that uses these changes. I see how your migration traversal really relies on the internals of the library. I am wondering, if maybe the traversal itself should somehow be made part of the library itself. What do you think? Basically have a special funcion |
I mean, there's room for maybe putting some of the traversal stuff in our code into the library, I agree. That doesn't really affect this particular change though because the code that uses this data actually is not performing a traversal of the data structure. all it's doing is updating every pointer within a single iterator object. I'm not opposed, but I'd rather not confuse different concepts by putting something like that into this PR. With that being said, we could maybe try to provide a similar interface for fixing up pointers within a single object that abstracts away some of the internal implementation details. The reason I'm hesitant though is because it might not actually end up extracting away that much information at all unless we encode details of the garbage collector into the library code, which would essentially make the function useless for anyone except me. Right now the code does kind of rely on knowledge of the internals of the library, it's true, but that doesn't really bother me much because we pin ourselves to a specific version and don't update it very frequently at all. |
Yes, I understand your rationale. The thing is also the new |
Well, it's certainly the case that the relocation info struct is specifically designed for use by a relocating garbage collector, but any garbage collector which moves objects will need this information in some form in order to be able to relocate iterators. So there's a certain amount of generality there. My concern with making an explicit function for fixing up pointers is more that while any garbage collector that needs to move objects will need that information, how it uses that information to perform its task is somewhat dependent on other details of the gc implementation. For example, the garbage collector I wrote needs to be able to find the base of the allocation from any pointer to the root of a champ, but how it does this is partially dependent on the implementation of the champ and partially dependent on the implementation of the struct containing the champ that actually forms the allocation. It's possible this could still be worked around in order to come up with a fully generic mechanism for providing this information, but it's definitely going to prove finicky. Letting the garbage collector choose how to use the information it needs seems like a simpler solution to me. But if you want me to try to dive into trying to encapsulate the information a bit more, I can give it a try and let you know how it goes. I just don't know for sure that it's going to prove possible. |
Sorry if this wasn't clear, but my suggestion was to add a traversal function to the library, that visits every pointer allocated in the data-structure, passing it as a reference so the callback can modify it. The actual relocation would still be part of your library. The idea would be to basically expose the same information (where all the nodes allocated by the data structure are, and how to mutate them for relocation) but in a simple interface that basically is just one single function, and does not require the caller to know where to go and find the actual pointers inside the data structure (which is what your client code does). Does this make sense? |
I mean, that really doesn't work, because a garbage collection algorithm depends on visiting pointers in a certain order that is probably not known to immer. |
I see. |
@arximboldi I am returning to this after our company's winter break. What about the following interface to iterator instead of what I have in this pr?
This function would essentially inline the logic currently found here, in order to implement the following logic:
This would be an operation on an iterator that would be used to fix up the pointers inside a single iterator during a garbage collection cycle, without needing to have the garbage collector understand the internal details of the iterator structure or how that structure chooses to store information about the root of allocations. Would this satisfy you? |
@arximboldi have you had a chance to take a look at the proposed encapsulated design I made? I am now focused on a different project, but it would be nice to not have to use a fork of your project. If you're not interested in merging this change though, we can live with a fork; it would be nice to get confirmation if that is the case though so that I can take this off my list of open tasks. |
Hi @dwightguth! I'm very sorry for the delayed reply. Reviewing this work requires a fair amount of brain power and is also a bit of an edge-case, so it always ends up getting to the bottom of my todo list. But I really appreciate your insistence in getting this through, as I really agree with you that it would be best if your use-case did not require a fork of the library! I have a couple of considerations:
Once again thank you for your contribution and sorry for the delayed reviews! I'll try stay in the loop to get this |
To be clear, ideally your code would not use You mentioned before that the algorithm needs to traverse the pointers in certain order that is not known to Immer, but the pointers themselves are allocated by Immer, how can your algorithm have special information about the order of the pointers? If the information is something as simple as "deepth first" or "breath first", that is something that can be specified at an API level, isn't it? |
Thanks for the response! No worries at all about the fact that this is taking some time, I'm just happy we're making progress. I'm going to try to address your comments one at a time, and then get started on creating a PR with the proposed (modified) design, because I think we're close to converging.
As far as the question of a relocate method for the collection itself... You may be right that this is possible. I would have to give some more thought to how this could be achieved, and ideally that would be a separate pull request. But I'll do some thinking about it once I have finalized this part of the design and created an updated PR. That being said, the particular question you are asking about with respect to the order that the gc processes objects... The gc algorithm our own personal gc is based on is called Cheney's algorithm, and it has the property that it can operate in linear time based on the amount of still-live memory, because it relocates objects in a particular order: namely, it first relocates all garbage collection roots, adding them to a queue, then it loops through the queue updating the pointers inside each object and adding any objects referenced by that object to the end of the queue if it hasn't seen them yet. This loop is part of the core gc algorithm. As such, while it may be possible to relocate an entire collection at once, hiding the internal details, simply by doing a breadth first traversal of the trie, this would certainly be a disruption to the logic of the algorithm, especially because it would need to add objects that are /not/ allocated by immer to the queue as it proceeds (namely, the collection elements). Our algorithm makes some compromises here in terms of relocating the entire collection as a group, but this is definitely non-standard and I wouldn't necessarily expect other relocating garbage collectors to do so. So it may be possible? I'm willing to give it a try and see what it does to the code base, after we got this PR in. But I'm not entirely sure how reusable the result would actually end up being. It may be that if we want a truly reusable solution, it would have to be over the classes in the implementation rather than the interface. Let me know what your thoughts are here. Regardless, I will try and push an update to this PR soon. |
What I see in your code is that you are already implementing a special case traversal for Immer data-structures. What I am trying to figure out is if the fundamentals of this traversal could be abstracted out into a general mechanism in Immer, with a bit of modern C++ in there. I feel like a similar interface to what you has suggested for Anyway, happy to review your next batch of changes! |
Yes, if we focus on specifically replicating the type of traversal used in our codebase, I imagine a function signature very like the one I mentioned above could be effectively used to centralize the details of traversing the data structure into the immer library itself. If that's something you're interested in, I'm happy to try to sketch up a PR for it when I have time after this change gets merged. I guess what I'm saying though is that such a solution may not be particularly useful for any other relocating gc algorithms that come along and try to use this library. But I can definitely try to create some code later if you still think it's worth it. |
Ok @dwightguth, After all this discussion, I think I'm happy to attempt a merge of this and the other open PRs. I think it would be good to make your code independent of implementation details by providing traversal or relocation algorithms as discussed within the library. Good not only for this, but also for the longevity of your code (implementation details may change in the future and make your code incompatible with upgrades of the library...). So, if you want: I will merge this PR, and follow up with a small PR from me where I cleanup a couple of things (rename some of the new data structures, add some comments so people can reach you for maintenance of the relocation info, etc.). In exchange, you will make over the next few weeks a new PR where you sketch a generic |
I would like to at least try to sketch an update to this PR to encapsulate the relocation information in the iterator before getting this merged. I just don't know exactly when I will have time to return to it. |
Awesome. Thanks! |
No rush from my side. |
When the user intends to use an explicit, relocating garbage collector
in conjunction with the champ type, it becomes necessary to be able to
adjust the pointers inside a champ_iterator to point to the new address
of the relocated collection node that is currently on the iterator's
path. In order to do this, it is necessary to be able to compute the
base pointer of the allocation for the node containing the buffer of
node pointers into which that particular path entry is pointing.
However, this is not currently deducible from the information stored
inside the iterator itself.
Here, at the cost of an additional
max_depth<B>
bytes (which amounts to16 bytes by default including padding) per iterator created, we maintain
that information in the form of an integer offset for each nonzero entry
in the path. The garbage collector can then take each node** in the path
and subtract the corresponding offset from it using pointer arithmetic
in order to obtain the address corresponding to the start of the
children buffer, from which the base of the allocation is a known
constant offset.
In order to save space, we assume that each node can have at most 256
children. This seems to me to be a safe assumption, because the B value
that is a template parameter appears to trigger a static assertion if it
is ever set greater than 6.
By the same logic, we also allocate an additional 4 bytes in order to
perform the same set of steps with the pointers to the values. By
default, on 64-bit machines, these four bytes live inside what was
previously padding between
depth_
andpath_
, thus they do notactually affect the overall size of the iterator. This may not be true
on 32-bit machines, however.