-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ARM64-SVE: Ensure MOVPRFX is next to SVE instruction in imm tables #106125
Conversation
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
@dotnet/arm64-contrib @kunalspathak |
@@ -736,12 +783,10 @@ void CodeGen::genHWIntrinsic(GenTreeHWIntrinsic* node) | |||
|
|||
default: | |||
assert(targetReg != embMaskOp2Reg); | |||
GetEmitter()->emitIns_R_R_R(INS_sve_movprfx, emitSize, targetReg, maskReg, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this was emitted the predicated movprfx
but now we will emit unpredicated one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unfortunately we do not have superpmi-diff for SVE tests, but should be easy to do that to check the disasm diffs.
- python src\coreclr\scripts\superpmi.py collect corerun hardwareintrinsic.dll
- superpmi.py asmdiff abc.mch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this was emitted the predicated
movprfx
but now we will emit unpredicated one.
Fixed
should be easy to do that to check the disasm diffs.
I'll do this. Sadly there will be lots of diffs because the movprfx has moved, but I'll take a look
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that should be only in the immediate taking instructions. the goal would be to make sure we did not regress anything else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ran all the Sve tests. Only diffs I can see are the placing of movprfx and additional table lookup calculations.
} | ||
else if (targetReg != embMaskOp1Reg) | ||
{ | ||
// embMaskOp1Reg is same as `falseReg`, but not same as `targetReg`. Move the | ||
// `embMaskOp1Reg` i.e. `falseReg` in `targetReg`, using "unpredicated movprfx", so the | ||
// subsequent `insEmbMask` operation can be merged on top of it. | ||
GetEmitter()->emitIns_R_R(INS_sve_movprfx, EA_SCALABLE, targetReg, falseReg); | ||
emitInsMovPrfxHelper(targetReg, maskReg, falseReg, embMaskOp2Reg); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here. this was earlier generating unpredicated version, but now is generated predicated one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
emitInsMovPrfxHelper()
generates an unpredicated movprfx:
GetEmitter()->emitIns_R_R(INS_sve_movprfx, EA_SCALABLE, reg1, reg3);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However.... looks like HEAD is wrong:
else
{
// If the instruction just has "predicated" version, then move the "embMaskOp1Reg"
// into targetReg. Next, do the predicated operation on the targetReg and last,
// use "sel" to select the active lanes based on mask, and set inactive lanes
// to falseReg.
assert(targetReg != embMaskOp2Reg);
assert(HWIntrinsicInfo::IsEmbeddedMaskedOperation(intrinEmbMask.id));
GetEmitter()->emitIns_R_R(INS_sve_movprfx, EA_SCALABLE, targetReg, embMaskOp1Reg);
emitInsHelper(targetReg, maskReg, embMaskOp2Reg);
}
That is generating a unpredicated movprfx followed by a predicated instruction.
(This was spotted by my newest changes in #106184)
Change-Id: I26358b9d0e19fff508d29ce0cd2a70a9d0539b88
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
looks like sve leg is failing to build the tests:
|
Change-Id: I159a0b72cc53a6b2f0b75418036a8ad5a1a9146d
I was seeing this on another PR with the same HEAD. Merged to latest and it appears to have vanished. |
That failure is dotnet/dnceng#3304. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
As per the Arm Arm, MOVPRFX instructions must be followed by an SVE instruction of specific type,
otherwise behaviour is undefined.
For immediate lookup tables MOVPRFX instructions were being placed before the table instead of
next to the SVE instruction. Instead, the movprfx needs moving inside each case next to the
relevant SVE instruction.
System.Runtime.Intrinsics.Arm.Sve:ShiftRightArithmeticForDivide before:
System.Runtime.Intrinsics.Arm.Sve:ShiftRightArithmeticForDivide after: