-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JIT: implied Boolean range assertions for local prop #109481
JIT: implied Boolean range assertions for local prop #109481
Conversation
If morph sees an assignment of 0/1 to a local, have it generate a [0..1] range assertion in addition to the constant assertion. This helps morph propagate more Boolean ranges at merges, which allows morph to elide more casts. Also, bump up the local assertion table size since morph is now generating more assertions.
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
Split off from #109469. Seems beneficial on its own. Let's see how costly it is.. |
Surprisingly costly... let's try smaller BVs. |
Might need to do some actual data modelling to get a better idea on how to rightsize these bit vectors. The old formula I created used We have metrics tracking (roughly) how many assertions we drop because of table size limits (rough because once the table is full, repeated attempts to add say |
Here's some data from ASP.NET, with the assertion table size set to 4096. About 82000 methods. The current table sizing loses about 29000 assertions (with the change in this PR, which create more assertions than usual). There is a decent linear trend of roughly 1.17 assertions per tracked local, but that is too conservative for many small methods. The number of tracked locals is fairly predictably about 0.76 of the total number of locals so either measure can be used to build a sizing heuristic. Adding sizing and a trend line to the original chart for the new heuristic (orange below), it does a good job of not missing too many assertions (about 3.5K get dropped). And (not shown) it is almost always picking bigger vectors that current main (bigger or same for 79.5K, smaller 2.5K methods) and perhaps not surprisingly about 32 bits longer on average. Unfortunately pin is broken on my box so I can't dig in with inst count and see if it's the extra BV width or the extra assertion construction cost or the extra morph actions that is the TP culprit. Zooming in to the smaller methods (less than 200 tracked locals) On 64 bit hosts the cost of the BV gets rounded up to the nearest mutiple of 64, so those 96 bit vectors are actually 128 bits. So assuming we're looking at BV cost as the predominant TP impact, we can claw some back for the small cases (which are overwhelmingly the frequent cases) by limiting the BV size to 64. If we do this for cases with less than 24 live locals (see extreme detail below) we only lose assertions in a handful of methods: |
I wonder how it would affect things if we used a dynamically expansible (up to some maximum) bit vector implementation instead... Say use the lowest bit to indicate whether this is a pointer to a (length, words) tuple or just inline 63 (or 31) bits. |
Phoenix had a fairly sophisticated sparse BV setup, maybe we can just steal it. I need to profile this to see if the BV size is the culprit -- will try one of my other boxes. |
Looks like BV cost is not the dominant factor
Instead it's the cost of making the new assertions and then later on filtering through them. I think we can improve the filtering part a little. Will do that separately as it should be no diff without the other changes here. |
Local TP of this PR plus #110091 vs baseline (win x64 libraries tests no TC) shows we come out ahead:
though that perhaps had outsized benefits from #110091 |
New diffs Worst-case TP is about 0.3 which was paid for by #110091 SPMI failures are unrelated, likely something to do with the recent ISA additions.... opened #110106
@EgorBo PTAL |
} | ||
} | ||
} | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AndyAyersMS I am a bit confused - if a local already have lcl = 1
(O1K_LCLVAR OAK_EQUAL O2K_CONST_INT) assertion, why does it need a separate OAK_SUBRANGE
one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AP merge strategy is very simple, it just intersects bit vectors. So the merge of X=1
from one path and X=[0,1]
from another is the empty set, since those are different assertions, instead of the more reasonable X=[0,1]
.
A tricker case might be X=0
on one path and X=1
on another.
Adding merge smarts to AP likely would slow things down quite a bit (though we could try it as an alternative). But by adding a partially redundant X=[0,1]
whenever we have X=0
or X=1
things just work out naturally, and the number of extra assertions isn't too bad.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see, makes sense then
Thought this might fix #110019, but the problem there is that we lose the fact that the call to Seems like we could perhaps annotate such (managed) calls as really returning bools, and have AP generate a suitable subtype assertion... |
Yeah I've tried that once: I added a |
Isn't that unsafe for arbitrary calls since non 0/1 bools are legal in managed code? |
If the jit is generating the code it will guarantee the result is 0/1. For interop it depends on whether it's going via an IL stub or not. |
/ba-g SPMI failures |
I don't think we do any sort of normalization like this. E.g. something like |
We can at least do that for some [Intrinsic] including the one in question - SequenceEqual |
It also might be enough for us to assert the result is [0..255]. |
If morph sees an assignment of 0/1 to a local, have it generate a [0..1] range assertion in addition to the constant assertion.
This helps morph propagate more Boolean ranges at merges, which allows morph to elide more casts.
Also, bump up the local assertion table size since morph is now generating more assertions.