Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UNREACHABLE Impossible reg-to-reg copy #92

Open
dylanmckay opened this issue Feb 20, 2018 · 59 comments
Open

UNREACHABLE Impossible reg-to-reg copy #92

dylanmckay opened this issue Feb 20, 2018 · 59 comments
Assignees
Labels
A-libcore Affects compiling the core library A-llvm Affects the LLVM AVR backend has-local-patch A patch exists but has not been applied to upstream LLVM has-reduced-testcase A small LLVM IR file exists that demonstrates the problem

Comments

@dylanmckay
Copy link
Member

dylanmckay commented Feb 20, 2018

This bug occurs in my LLVM 6.0 branch at #91 when I compile stock libcore.

Impossible reg-to-reg copy
UNREACHABLE executed at /home/dylan/projects/llvm-project/llvm/lib/Target/AVR/AVRInstrInfo.cpp:75!
#0 0x00007fd02d6f8cc6 llvm::sys::PrintStackTrace(llvm::raw_ostream&) /home/dylan/projects/llvm-project/llvm/lib/Support/Unix/Signals.inc:398:11
#1 0x00007fd02d6f8ec9 PrintStackTraceSignalHandler(void*) /home/dylan/projects/llvm-project/llvm/lib/Support/Unix/Signals.inc:462:1
#2 0x00007fd02d6f7260 llvm::sys::RunSignalHandlers() /home/dylan/projects/llvm-project/llvm/lib/Support/Signals.cpp:0:5
#3 0x00007fd02d6f92ba SignalHandler(int) /home/dylan/projects/llvm-project/llvm/lib/Support/Unix/Signals.inc:0:3
#4 0x00007fd02c5d6dd0 __restore_rt (/usr/lib/libpthread.so.0+0x11dd0)
#5 0x00007fd02b958860 __GI_raise (/usr/lib/libc.so.6+0x34860)
#6 0x00007fd02b959ec9 __GI_abort (/usr/lib/libc.so.6+0x35ec9)
#7 0x00007fd02d5f71f0 llvm::install_out_of_memory_new_handler() /home/dylan/projects/llvm-project/llvm/lib/Support/ErrorHandling.cpp:193:0
#8 0x00007fd032637599 llvm::AVRInstrInfo::copyPhysReg(llvm::MachineBasicBlock&, llvm::MachineInstrBundleIterator<llvm::MachineInstr, false>, llvm::DebugLoc const&, unsigned int, unsigned int, bool) const /home/dylan/projects/llvm-project/llvm/lib/Target/AVR/AVRInstrInfo.cpp:0:7
#9 0x00007fd03004dd3b (anonymous namespace)::ExpandPostRA::LowerCopy(llvm::MachineInstr*) /home/dylan/projects/llvm-project/llvm/lib/CodeGen/ExpandPostRAPseudos.cpp:166:7
#10 0x00007fd03004d189 (anonymous namespace)::ExpandPostRA::runOnMachineFunction(llvm::MachineFunction&) /home/dylan/projects/llvm-project/llvm/lib/CodeGen/ExpandPostRAPseudos.cpp:212:23
#11 0x00007fd030230e61 llvm::MachineFunctionPass::runOnFunction(llvm::Function&) /home/dylan/projects/llvm-project/llvm/lib/CodeGen/MachineFunctionPass.cpp:62:8
#12 0x00007fd02f76fdff llvm::FPPassManager::runOnFunction(llvm::Function&) /home/dylan/projects/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1520:27
#13 0x00007fd02f770152 llvm::FPPassManager::runOnModule(llvm::Module&) /home/dylan/projects/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1541:16
#14 0x00007fd02f7709b0 (anonymous namespace)::MPPassManager::runOnModule(llvm::Module&) /home/dylan/projects/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1597:27
#15 0x00007fd02f770436 llvm::legacy::PassManagerImpl::run(llvm::Module&) /home/dylan/projects/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1700:16
#16 0x00007fd02f770f21 llvm::legacy::PassManager::run(llvm::Module&) /home/dylan/projects/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1731:3
#17 0x00005579e90cf34a compileModule(char**, llvm::LLVMContext&) /home/dylan/projects/llvm-project/llvm/tools/llc/llc.cpp:577:41
#18 0x00005579e90cd800 main /home/dylan/projects/llvm-project/llvm/tools/llc/llc.cpp:347:13
#19 0x00007fd02b944f4a __libc_start_main (/usr/lib/libc.so.6+0x20f4a)
#20 0x00005579e90cceea _start (./bin/llc+0x22eea)
Stack dump:
0.	Program arguments: ./bin/llc -march=avr -mcpu=atmega328p bugpoint-reduced-simplified.ll -o /dev/null 
1.	Running pass 'Function Pass Manager' on module 'bugpoint-reduced-simplified.ll'.
2. Running pass 'Post-RA pseudo instruction expansion pass' on function '@_ZN3lib3num7flt2dec8strategy5grisu22max_pow10_no_more_than17h83b3c65a945b5158E'

Note that this is coming from @_ZN3lib3num7flt2dec8strategy5grisu22max_pow10_no_more_than17h83b3c65a945b5158E - the Grisu float-string algorithm.

I have minimised down to a testcase, and also collected llc debug outputs

target triple = "avr-unknown-unknown"
target datalayout = "e-p:16:8-i8:8-i16:8-i32:8-i64:8-f32:8-f64:8-n8-a:8"

define { i8, i32 } @chapstick() {
start:
  ret { i8, i32 } zeroinitializer
}

You can see the build artifacts here.

@dylanmckay dylanmckay added the A-libcore Affects compiling the core library label Feb 20, 2018
@shepmaster
Copy link
Member

shepmaster commented Feb 20, 2018

lib::num::flt2dec::strategy

In our own libcore, we've disabled a lot of this code because it's simply too big. This may not be an extra blocker from moving to LLVM 6.

dylanmckay added a commit to dylanmckay/libcore that referenced this issue Feb 22, 2018
…odules

These do not compile because of avr-rust/rust-legacy-fork#92.

The error message looks like

```
Impossible reg-to-reg copy
UNREACHABLE executed at /home/dylan/projects/llvm-project/llvm/lib/Target/AVR/AVRInstrInfo.cpp:75!
Stack dump:
0.	Program arguments: ./bin/llc -march=avr -mcpu=atmega328p bugpoint-reduced-simplified.ll -o /dev/null
1.	Running pass 'Function Pass Manager' on module 'bugpoint-reduced-simplified.ll'.
2. Running pass 'Post-RA pseudo instruction expansion pass' on function '@_ZN3lib3num7flt2dec8strategy5grisu22max_pow10_no_more_than17h83b3c65a945b5158E'
```

avr-rust/rust-legacy-fork#92
@brainlag
Copy link

brainlag commented Mar 12, 2018

Do I understand this code correct. It tries to return a struct with the fields initialized to zero?

And if this is the machine code:

bb.0.start:
  %2:ld8 = LDIRdK 0
  %3:dldregs = LDIWRdK 0
  $r21r20 = COPY %2
  $r23r22 = COPY %3
  $r24 = COPY %3
  RET implicit $r21r20, implicit $r23r22, implicit $r24

Then the for some reason the copy instructions for 16bit registers and 8bit registers get mixed up.
This doesn't sound to hard to fix.

@shepmaster
Copy link
Member

It tries to return a struct with the fields initialized to zero?

That seems like it to me!

This doesn't sound to hard to fix.

Awesome!

@brainlag
Copy link

brainlag commented Mar 13, 2018

~~~The patch is pretty messy because my C++ is in a really rusty state and I'm unable to reverse the const vector of the source registers.~~~

~~~But here we go [gist](https://gist.github.com/brainlag/309d47cd78aedc081773d7e14a65468d):~~~


```diff
diff --git a/lib/Target/AVR/AVRISelLowering.cpp b/lib/Target/AVR/AVRISelLowering.cpp
index 7ac8a136e6b..4f7d3d6a815 100644
--- a/lib/Target/AVR/AVRISelLowering.cpp
+++ b/lib/Target/AVR/AVRISelLowering.cpp
@@ -1371,16 +1371,12 @@ AVRTargetLowering::LowerReturn(SDValue Chain, CallingConv::ID CallConv,
   MachineFunction &MF = DAG.getMachineFunction();
   unsigned e = RVLocs.size();
 
-  // Reverse splitted return values to get the "big endian" format required
-  // to agree with the calling convention ABI.
-  if (e > 1) {
-    std::reverse(RVLocs.begin(), RVLocs.end());
-  }
-
   SDValue Flag;
   SmallVector<SDValue, 4> RetOps(1, Chain);
   // Copy the result values into the output registers.
-  for (unsigned i = 0; i != e; ++i) {
+  // Reverse loop to get the "big endian" format required
+  // to agree with the calling convention ABI.
+  for (int i = (e - 1); i < 0; --i) {
     CCValAssign &VA = RVLocs[i];
     assert(VA.isRegLoc() && "Can only return in registers!");
 
diff --git a/test/CodeGen/AVR/rust-avr-bug-92.ll b/test/CodeGen/AVR/rust-avr-bug-92.ll
new file mode 100644
index 00000000000..f93bf7a152d
--- /dev/null
+++ b/test/CodeGen/AVR/rust-avr-bug-92.ll
@@ -0,0 +1,6 @@
+; RUN: llc < %s -march=avr | FileCheck %s
+
+define { i8, i32 } @chapstick() {
+start:
+  ret { i8, i32 } zeroinitializer
+}
\ No newline at end of file
```

@shepmaster
Copy link
Member

Can you narrow down what exactly the problem is? Is it just that the condition (if (e > 1)) has been removed? Is it that the call to std::reverse is "broken"?

I wonder if it would still work if the only change was removing if (e > 1) {.

@brainlag
Copy link

Damn, I should have run the tests before. It breaks a lot of stuff. It made so much sense.

@brainlag
Copy link

I think I know whats going on. The AVR couldn't handle returning structs like this at all (see #57).
It only expects to ever return an iXX and only when XX is >=32 it needs to reverse the registers because of the calling convention. Now for this bug where we have { i8, i32 } and registers {[0] = i8, [1] = i16, [2] = i16} it only should reverse registers 1 and 2 but not 0.

@brainlag
Copy link

brainlag commented Mar 15, 2018

I don't know if this produces correct output but it prevents the crash. This patch just avoids to return a struct through registers. Which only works for a few/small structs anyway.

https://gist.github.com/brainlag/4d9217eefcac1f06df682415975d97c3

diff --git a/lib/Target/AVR/AVRISelLowering.cpp b/lib/Target/AVR/AVRISelLowering.cpp
index 7ac8a136e6b..e0bb2f317d1 100644
--- a/lib/Target/AVR/AVRISelLowering.cpp
+++ b/lib/Target/AVR/AVRISelLowering.cpp
@@ -1346,7 +1346,7 @@ AVRTargetLowering::CanLowerReturn(CallingConv::ID CallConv,
   CCState CCInfo(CallConv, isVarArg, MF, RVLocs, Context);
 
   auto CCFunction = CCAssignFnForReturn(CallConv);
-  return CCInfo.CheckReturn(Outs, CCFunction);
+  return !MF.getFunction().getReturnType()->isStructTy() && CCInfo.CheckReturn(Outs, CCFunction);
 }
 
 SDValue
diff --git a/test/CodeGen/AVR/rust-avr-bug-92.ll b/test/CodeGen/AVR/rust-avr-bug-92.ll
new file mode 100644
index 00000000000..e76ce630c67
--- /dev/null
+++ b/test/CodeGen/AVR/rust-avr-bug-92.ll
@@ -0,0 +1,7 @@
+; RUN: llc < %s -march=avr | FileCheck %s
+
+; CHECK-LABEL: main
+define { i8, i32 } @main() {
+start:
+  ret { i8, i32 } zeroinitializer
+}
\ No newline at end of file

@dylanmckay
Copy link
Member Author

I think I know whats going on. The AVR couldn't handle returning structs like this at all (see #57).
It only expects to ever return an iXX and only when XX is >=32 it needs to reverse the registers because of the calling convention. Now for this bug where we have { i8, i32 } and registers {[0] = i8, [1] = i16, [2] = i16} it only should reverse registers 1 and 2 but not 0.

This makes a lot of sense. Sounds like we need to adjust this logic and write some more calling convention tests.

I don't know if this produces correct output but it prevents the crash.
This patch just avoids to return a struct through registers. Which only works for
a few/small structs anyway.

I suspect that this patch will break the calling convention for all functions which return structs, as they will now always be passed on the stack. FWIW, code compiled with this patch will work with code compiled under the same patch, but it would not work when compiling against objects from an older version of LLVM or AVR-GCC generated objects.

I am fairly busy the next handful of days, but after that this bug is at the top of my priority list. If anybody wants to look at this in the interim (please do!), feel free to ask questions on this issue, or ping me on Gitter.

@dylanmckay
Copy link
Member Author

In the past to fix calling convention issues like this, I've taken a C file and compared it side-by-side with LLVM/GCC compiled code.

I've found that this is the best way to make sure our calling convention implementation is accurate. It looks like we are missing struct-return tests, which would be good to add.

@brainlag
Copy link

I think I know whats going on. The AVR couldn't handle returning structs like this at all (see #57).
It only expects to ever return an iXX and only when XX is >=32 it needs to reverse the registers because of the calling convention. Now for this bug where we have { i8, i32 } and registers {[0] = i8, [1] = i16, [2] = i16} it only should reverse registers 1 and 2 but not 0.

This makes a lot of sense. Sounds like we need to adjust this logic and write some more calling convention tests.

I have code for this, but it felt like a ugly hack to me and after testing with a couple of different structs I found it really hard to find structs which don't use the stack. If a struct doesn't fit within 3 (or maybe for 4?) registered it used the stack even without this patch.

I update with examples later.

@brainlag
Copy link

brainlag commented Mar 20, 2018

I tried it again. If the struct is <= 64 bit you can pass it without the stack.

Examples
; RUN: llc < %s -march=avr | FileCheck %s

; CHECK-LABEL: retstruct1
define { i8, i32 } @retstruct1() {
start:
  ret { i8, i32 } zeroinitializer
}

; CHECK-LABEL: retstruct2
define { i16, i32 } @retstruct2() {
start:
  ret { i16, i32 } zeroinitializer
}

; CHECK-LABEL: retstruct3
define { i32, i32 } @retstruct3() {
start:
  ret { i32, i32 } zeroinitializer
}

; CHECK-LABEL: retstruct4
define { i8, i64 } @retstruct4() {
start:
  ret { i8, i64 } zeroinitializer
}

; CHECK-LABEL: retstruct5
define { i8, i32, i16 } @retstruct5() {
start:
  ret { i8, i32, i16 } zeroinitializer
}

; CHECK-LABEL: retstruct6
define { i8, i32, i32 } @retstruct6() {
start:
  ret { i8, i32, i32 } zeroinitializer
}
The resulting asm code
	.text
	.file	"rust-avr-bug-92.1.ll"
	.globl	retstruct1              ; -- Begin function retstruct1
	.p2align	1
	.type	retstruct1,@function
retstruct1:                             ; @retstruct1
; %bb.0:                                ; %start
	ldi	r22, 0
	ldi	r23, 0
	ldi	r24, 0
	mov	r20, r22
	mov	r21, r23
	ret
.Lfunc_end0:
	.size	retstruct1, .Lfunc_end0-retstruct1
                                        ; -- End function
	.globl	retstruct2              ; -- Begin function retstruct2
	.p2align	1
	.type	retstruct2,@function
retstruct2:                             ; @retstruct2
; %bb.0:                                ; %start
	ldi	r24, 0
	ldi	r25, 0
	mov	r22, r24
	mov	r23, r25
	mov	r20, r24
	mov	r21, r25
	ret
.Lfunc_end1:
	.size	retstruct2, .Lfunc_end1-retstruct2
                                        ; -- End function
	.globl	retstruct3              ; -- Begin function retstruct3
	.p2align	1
	.type	retstruct3,@function
retstruct3:                             ; @retstruct3
; %bb.0:                                ; %start
	ldi	r24, 0
	ldi	r25, 0
	mov	r22, r24
	mov	r23, r25
	mov	r20, r24
	mov	r21, r25
	mov	r18, r24
	mov	r19, r25
	ret
.Lfunc_end2:
	.size	retstruct3, .Lfunc_end2-retstruct3
                                        ; -- End function
	.globl	retstruct4              ; -- Begin function retstruct4
	.p2align	1
	.type	retstruct4,@function
retstruct4:                             ; @retstruct4
; %bb.0:                                ; %start
	mov	r30, r24
	mov	r31, r25
	ldi	r24, 0
	ldi	r25, 0
	std	Z+7, r24
	std	Z+8, r25
	std	Z+5, r24
	std	Z+6, r25
	std	Z+3, r24
	std	Z+4, r25
	ldi	r18, 0
	st	Z+, r18
	st	Z, r24
	std	Z+1, r25
	ret
.Lfunc_end3:
	.size	retstruct4, .Lfunc_end3-retstruct4
                                        ; -- End function
	.globl	retstruct5              ; -- Begin function retstruct5
	.p2align	1
	.type	retstruct5,@function
retstruct5:                             ; @retstruct5
; %bb.0:                                ; %start
	ldi	r22, 0
	ldi	r23, 0
	ldi	r24, 0
	mov	r20, r22
	mov	r21, r23
	mov	r18, r22
	mov	r19, r23
	ret
.Lfunc_end4:
	.size	retstruct5, .Lfunc_end4-retstruct5
                                        ; -- End function
	.globl	retstruct6              ; -- Begin function retstruct6
	.p2align	1
	.type	retstruct6,@function
retstruct6:                             ; @retstruct6
; %bb.0:                                ; %start
	mov	r30, r24
	mov	r31, r25
	ldi	r24, 0
	ldi	r25, 0
	std	Z+7, r24
	std	Z+8, r25
	std	Z+5, r24
	std	Z+6, r25
	std	Z+3, r24
	std	Z+4, r25
	ldi	r18, 0
	st	Z+, r18
	st	Z, r24
	std	Z+1, r25
	ret
.Lfunc_end5:
	.size	retstruct6, .Lfunc_end5-retstruct6
                                        ; -- End function

	; Declaring this symbol tells the CRT that it should
	;copy all variables from program memory to RAM on startup
	.globl	__do_copy_data
	; Declaring this symbol tells the CRT that it should
	;clear the zeroed data section on startup
	.globl	__do_clear_bss
The patch which fixes the register ordering
diff --git a/lib/Target/AVR/AVRISelLowering.cpp b/lib/Target/AVR/AVRISelLowering.cpp
index e0bb2f317d1..4dc7efaa51c 100644
--- a/lib/Target/AVR/AVRISelLowering.cpp
+++ b/lib/Target/AVR/AVRISelLowering.cpp
@@ -1374,7 +1374,24 @@ AVRTargetLowering::LowerReturn(SDValue Chain, CallingConv::ID CallConv,
   // Reverse splitted return values to get the "big endian" format required
   // to agree with the calling convention ABI.
   if (e > 1) {
-    std::reverse(RVLocs.begin(), RVLocs.end());
+    // some hackery because SelectionDAGBuilder does not split up arguments properly
+    Type *retType = MF.getFunction().getReturnType();
+    if (retType->isStructTy()) {
+      if (retType->getNumContainedTypes() > 1 && retType->getNumContainedTypes() > e) {
+        for (unsigned i = 0, pos = 0; i < retType->getNumContainedTypes(); ++i) {
+          Type *field = retType->getContainedType(i);
+          if(field->isIntegerTy() && field->getIntegerBitWidth() > 16) {
+            int Size = field->getIntegerBitWidth() / 16;
+            std::reverse(RVLocs.begin()+ pos, RVLocs.end() + pos + Size);
+            pos += Size;
+          } else {
+            pos++;
+          }
+        }
+      }
+    } else {
+      std::reverse(RVLocs.begin(), RVLocs.end());
+    }
   }
 
   SDValue Flag;

@dylanmckay
Copy link
Member Author

Great work!

I will review the patch in the next few days.

@jhwgh1968
Copy link

jhwgh1968 commented Apr 28, 2018

Since @dylanmckay has been busy, and I've been trying to get engaged in this projct, I applied the patch and tried to compile libcore.

After working around #95 by lowering the optimization level, I got a new crash:

rustc: /home/admin/Software/avr-rust/src/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:8736: void llvm::SelectionDAGISel::LowerArguments(const llvm::Function&): Assertion `InVals.size() == Ins.size() && "LowerFormalArguments didn't emit the correct number of values!"' failed.
(gdb) bt
#0  0x00007ffff71f8860 in raise () from /usr/lib/libc.so.6
#1  0x00007ffff71f9ec9 in abort () from /usr/lib/libc.so.6
#2  0x00007ffff71f10bc in __assert_fail_base () from /usr/lib/libc.so.6
#3  0x00007ffff71f1133 in __assert_fail () from /usr/lib/libc.so.6
#4  0x00007fffeca26c14 in llvm::SelectionDAGISel::LowerArguments (
    this=this@entry=0x7fffb61d9000, F=...)
    at /home/admin/Software/avr-rust/src/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:8735
#5  0x00007fffeca77d6e in llvm::SelectionDAGISel::SelectAllBasicBlocks (
    this=this@entry=0x7fffb61d9000, Fn=...)
    at /home/admin/Software/avr-rust/src/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:1402
#6  0x00007fffeca78d72 in llvm::SelectionDAGISel::runOnMachineFunction (
    this=<optimized out>, mf=...)
    at /home/admin/Software/avr-rust/src/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:466
#7  0x00007fffecce12c5 in llvm::MachineFunctionPass::runOnFunction (
    this=0x7fffb61d9000, F=...)
    at /home/admin/Software/avr-rust/src/llvm/lib/CodeGen/MachineFunctionPass.cpp:62
(gdb) l 8735
8730          DAG.getRoot(), F.getCallingConv(), F.isVarArg(), Ins, dl, DAG, InVals);
8731
8732      // Verify that the target's LowerFormalArguments behaved as expected.
8733      assert(NewRoot.getNode() && NewRoot.getValueType() == MVT::Other &&
8734             "LowerFormalArguments didn't return a valid chain!");
8735      assert(InVals.size() == Ins.size() &&
8736             "LowerFormalArguments didn't emit the correct number of values!");
8737      DEBUG({
8738          for (unsigned i = 0, e = Ins.size(); i != e; ++i) {
8739            assert(InVals[i].getNode() &&

I poked around a bit, but as ignorant as I am about LLVM, it wasn't very insightful.

EDIT: I guess one thing is worth adding. If I counted fields correctly, the length of InVals is 8, while the length of Ins is 16.

@shepmaster
Copy link
Member

as ignorant as I am about LLVM

Welcome to the club! 😸

@jhwgh1968
Copy link

Welcome to the club! 😸

Thanks! 😄

In that case, does @brainlag have any more ideas?

@brainlag
Copy link

Are you sure picked up the patch for #57?

@jhwgh1968
Copy link

I just checked, @brainlag, and the patch for #57 is the very last commit I have. I am using @shepmaster's rebase here.

@brainlag
Copy link

brainlag commented May 1, 2018

Ok, you should pin the problem down by making a reduced testcase. Then create a new bug with the LLVM IR code that triggers the bug.

@jhwgh1968
Copy link

Whoops! It seems that I was using the wrong version of llc. Nevermind.

Thanks for the link about reduced testcases, though. I always wondered how those got generated. 😄

@shepmaster
Copy link
Member

@brainlag to double-check: should both this patch and this patch be applied, or does the second one replace the first?

dylanmckay added a commit to dylanmckay/llvm that referenced this issue Nov 5, 2018
…uired by the ABI

When returning structs, the fields need to be reversed. Without this,
the ABI of the generated executables is not compatible with avr-gcc,
nor will LLVM allow it to be compiled.

Here is the assertion message when complex structs are used:

    Impossible reg-to-reg copy
    UNREACHABLE executed at llvm/lib/Target/AVR/AVRInstrInfo.cpp!

This was first noticed in avr-rust/rust-legacy-fork#92.

Description by Peter

    The AVR couldn't handle returning structs like this at all (see avr-rust/rust-legacy-fork#57).
    It only expects to ever return an iXX and only when XX is >=32 it needs to reverse
    the registers because of the calling convention. Now for this bug where we have
    { i8, i32 } and registers {[0] = i8, [1] = i16, [2] = i16} it only should reverse
    registers 1 and 2 but not 0.

Patch by Peter Nimmervoll (@brainlag) and Tim Neumann (@TimNN)
@dylanmckay
Copy link
Member Author

I went to upstream this, but I noticed there are a few test failures (call.ll, div.ll, rem.ll).

I made a few changes (added the new AVR doc to the docs index, fixed the failed tests so that they match what LLVM generates with the patch) and pushed it up onto dylanmckay/llvm@upstream-fix-avr-rust-92.

It looks like the patch changes the ABI, not just for structs, but also for indirect calls to function pointers returning sizes >16-bits.

The patch also changes the ABI of the libgcc/compiler-rt calls for division and remainder. You can see the changes in the below patch in the update to div.ll and rem.ll. As far as I can tell, this will either completely break div and mod or completely fix if it is the case it is already broken. It looks like the tests for sdiv32 et al have been around since late 2016, so it's likely they're correct.

commit 3c474f31c49643952326eae05fdfa2be4a7bff88
Author: Dylan McKay <[email protected]>
Date:   Mon Nov 5 19:00:11 2018 +1300

    [AVR] Reverse split return values to into the "big endian" format required by the ABI
    
    When returning structs, the fields need to be reversed. Without this,
    the ABI of the generated executables is not compatible with avr-gcc,
    nor will LLVM allow it to be compiled.
    
    Here is the assertion message when complex structs are used:
    
        Impossible reg-to-reg copy
        UNREACHABLE executed at llvm/lib/Target/AVR/AVRInstrInfo.cpp!
    
    This was first noticed in avr-rust/rust#92.
    
    Description by Peter
    
        The AVR couldn't handle returning structs like this at all (see avr-rust/rust#57).
        It only expects to ever return an iXX and only when XX is >=32 it needs to reverse
        the registers because of the calling convention. Now for this bug where we have
        { i8, i32 } and registers {[0] = i8, [1] = i16, [2] = i16} it only should reverse
        registers 1 and 2 but not 0.
    
    Patch by Peter Nimmervoll (@brainlag) and Tim Neumann (@TimNN)

diff --git a/test/CodeGen/AVR/call.ll b/test/CodeGen/AVR/call.ll
index a2556e8c1e6..4a1de6eb603 100644
--- a/test/CodeGen/AVR/call.ll
+++ b/test/CodeGen/AVR/call.ll
@@ -172,15 +172,17 @@ define void @testcallprologue() {
 define i32 @icall(i32 (i32)* %foo) {
 ; CHECK-LABEL: icall:
 ; CHECK: movw r30, r24
-; CHECK: ldi r22, 147
-; CHECK: ldi r23, 248
-; CHECK: ldi r24, 214
-; CHECK: ldi r25, 198
-; CHECK: icall
-; CHECK: subi r22, 251
-; CHECK: sbci r23, 255
-; CHECK: sbci r24, 255
-; CHECK: sbci r25, 255
+; CHECK-NEXT: ldi r22, 147
+; CHECK-NEXT: ldi r23, 248
+; CHECK-NEXT: ldi r24, 214
+; CHECK-NEXT: ldi r25, 198
+; CHECK-NEXT: icall
+; CHECK-NEXT: movw r18, r22
+; CHECK-NEXT: subi r24, 251
+; CHECK-NEXT: sbci r25, 255
+; CHECK-NEXT: sbci r18, 255
+; CHECK-NEXT: sbci r19, 255
+
   %1 = call i32 %foo(i32 3335977107)
   %2 = add nsw i32 %1, 5
   ret i32 %2
@@ -200,10 +202,12 @@ define i32 @externcall(float %a, float %b) {
 ; CHECK: movw r20, [[REG1]]
 ; CHECK: call __divsf3
 ; CHECK: call foofloat
-; CHECK: subi r22, 251
-; CHECK: sbci r23, 255
-; CHECK: sbci r24, 255
+; CHECK: movw r18, r22
+; CHECK: subi r24, 251
 ; CHECK: sbci r25, 255
+; CHECK: sbci r18, 255
+; CHECK: sbci r19, 255
+
   %1 = fdiv float %b, %a
   %2 = call i32 @foofloat(float %1)
   %3 = add nsw i32 %2, 5
diff --git a/test/CodeGen/AVR/div.ll b/test/CodeGen/AVR/div.ll
index 7626ecb8172..7ff8ae33456 100644
--- a/test/CodeGen/AVR/div.ll
+++ b/test/CodeGen/AVR/div.ll
@@ -44,8 +44,9 @@ define i16 @sdiv16(i16 %a, i16 %b) {
 define i32 @udiv32(i32 %a, i32 %b) {
 ; CHECK-LABEL: udiv32:
 ; CHECK: call __udivmodsi4
-; CHECK-NEXT: movw r22, r18
-; CHECK-NEXT: movw r24, r20
+; CHECK-NEXT: movw r18, r22
+; CHECK-NEXT: movw r22, r24
+; CHECK-NEXT: movw r24, r18
 ; CHECK-NEXT: ret
   %quot = udiv i32 %a, %b
   ret i32 %quot
@@ -55,8 +56,9 @@ define i32 @udiv32(i32 %a, i32 %b) {
 define i32 @sdiv32(i32 %a, i32 %b) {
 ; CHECK-LABEL: sdiv32:
 ; CHECK: call __divmodsi4
-; CHECK-NEXT: movw r22, r18
-; CHECK-NEXT: movw r24, r20
+; CHECK-NEXT: movw r18, r22
+; CHECK-NEXT: movw r22, r24
+; CHECK-NEXT: movw r24, r18
 ; CHECK-NEXT: ret
   %quot = sdiv i32 %a, %b
   ret i32 %quot
diff --git a/test/CodeGen/AVR/rem.ll b/test/CodeGen/AVR/rem.ll
index 47573e8dafc..eaa12b79774 100644
--- a/test/CodeGen/AVR/rem.ll
+++ b/test/CodeGen/AVR/rem.ll
@@ -42,6 +42,8 @@ define i16 @srem16(i16 %a, i16 %b) {
 define i32 @urem32(i32 %a, i32 %b) {
 ; CHECK-LABEL: urem32:
 ; CHECK: call __udivmodsi4
+; CHECK: movw r22, r20
+; CHECK: movw r24, r18
 ; CHECK-NEXT: ret
   %rem = urem i32 %a, %b
   ret i32 %rem
@@ -51,6 +53,8 @@ define i32 @urem32(i32 %a, i32 %b) {
 define i32 @srem32(i32 %a, i32 %b) {
 ; CHECK-LABEL: srem32:
 ; CHECK: call __divmodsi4
+; CHECK: movw r22, r20
+; CHECK: movw r24, r18
 ; CHECK-NEXT: ret
   %rem = srem i32 %a, %b
   ret i32 %rem

@dylanmckay
Copy link
Member Author

I have a suspicion

Functions with return types satisfying retType->getNumContainedTypes() > 1 && retType->getNumContainedTypes() > RVLocs.size() are no longer reversed

Unsure if this is intentional.

diff --git a/lib/Target/AVR/AVRISelLowering.cpp b/lib/Target/AVR/AVRISelLowering.cpp
index 8b7282f0c71..169c8e5cda2 100644
--- a/lib/Target/AVR/AVRISelLowering.cpp
+++ b/lib/Target/AVR/AVRISelLowering.cpp
@@ -1303,40 +1303,44 @@ SDValue AVRTargetLowering::LowerCall(TargetLowering::CallLoweringInfo &CLI,
 static void ReverseArgumentsToBigEndian(MachineFunction &MF,
                                         SmallVector<CCValAssign, 16> &RVLocs) {
   if (RVLocs.size() > 1) {
     // Some hackery because SelectionDAGBuilder does not split
     // up arguments properly
     Type *retType = MF.getFunction().getReturnType();
     if (retType->isStructTy()) {
       if (retType->getNumContainedTypes() > 1 &&
           retType->getNumContainedTypes() > RVLocs.size()) {
         for (unsigned i = 0, pos = 0;
             i < retType->getNumContainedTypes(); ++i) {
           Type *field = retType->getContainedType(i);
           if(field->isIntegerTy() && field->getIntegerBitWidth() > 16) {
             int Size = field->getIntegerBitWidth() / 16;
             std::reverse(RVLocs.begin()+ pos, RVLocs.end() + pos + Size);
             pos += Size;
           } else {
             pos++;
           }
         }
+      } else {
+        // FIXME: In the old version of this logic, this else branch
+        // would still reverse the return values. Old logic reversed if and
+        // always if RVLocs.size() > 1
+        // std::reverse(RVLocs.begin(), RVLocs.end()); ??
       }
     } else {
       std::reverse(RVLocs.begin(), RVLocs.end());
     }
   }
 }

@brainlag
Copy link

brainlag commented Nov 5, 2018

So this means you hitting the struct path without having any structs?

@dylanmckay
Copy link
Member Author

It seems it, yeah. Unless there's something else going on like another location that needs to be reverse'd. I saw one last std::reverse call, but it was related to function arguments. I am unsure if that could be the problem though.

shepmaster pushed a commit to avr-rust/llvm that referenced this issue Nov 21, 2018
@carlos4242
Copy link

This is a very old thread. But I'm joining in. 😄 The stuff in here is mainlined in avr-llvm and now we're getting some fairly serious breaks that are affecting me. #130 seems to be caused by this. I think when you just get a normal 32 bit return value the words are getting swapped. I'm creating unit tests for this and I'll try to attach them here and to #130. But for example in a really simple case, if you have a function that returns a 32 bit int and at the end just calls another external function that returns a 32 bit int, it should have no statements between the call XXX and ret, because the value should be returned in r22-r25 from the extern function and then returned back immediately. In all cases it thinks things are the wrong way round so it swaps r22/r23 with r24/r25. That breaks the rem.ll, div.ll and call.ll tests in exactly that way. Plus it breaks swift for Arduino because my code was doing int16 multiplications and returning 0 every time, real world impact. lol.

Anyway, it's going to take me a long time to understand what's going on in this thread, let alone with the code because I'm such a n00b, @brainlag @shepmaster @dylanmckay you guys want to chip in?

@carlos4242
Copy link

carlos4242 commented Mar 8, 2019

OK, think I've found what's causing all the problems @dylanmckay. In various bits of consolidation, etc. code got accidentally deleted from LowerCallResult.

The patch should have been this:

@@ -1315,10 +1344,8 @@ SDValue AVRTargetLowering::LowerCallResult(
   auto CCFunction = CCAssignFnForReturn(CallConv);
   CCInfo.AnalyzeCallResult(Ins, CCFunction);
 
-  if (CallConv != CallingConv::AVR_BUILTIN && RVLocs.size() > 1) {
-    // Reverse splitted return values to get the "big endian" format required
-    // to agree with the calling convention ABI.
-    std::reverse(RVLocs.begin(), RVLocs.end());
+  if (CallConv != CallingConv::AVR_BUILTIN) {
+    ReverseArgumentsToBigEndian(DAG.getMachineFunction(), RVLocs);
   }

But ended up as this:

   auto CCFunction = CCAssignFnForReturn(CallConv);
   CCInfo.AnalyzeCallResult(Ins, CCFunction);
 
-  if (CallConv != CallingConv::AVR_BUILTIN && RVLocs.size() > 1) {
-    // Reverse splitted return values to get the "big endian" format required
-    // to agree with the calling convention ABI.
-    std::reverse(RVLocs.begin(), RVLocs.end());
-   }

So ReverseArgumentsToBigEndian is never called from LowerCallResult. I'm pretty sure this is just an accident. I'll do a small PR over the weekend with some unit tests.

@carlos4242
Copy link

carlos4242 commented Mar 8, 2019

Background, I think the way the code was working was this:

For results coming back from calls or being returned from a function that are i8 or i16, the standard register mapping works. This will give you r24 or r24/r25 as a pair.

If you had a result coming back from a call or function that was i32 or i64, SelectionDAGBuilder would ask "how many registers and what types will we use?" and get back "2 registers of i16" (or "4 registers of i16")
These would be naively assigned using RetCC_AVR from AVRCallingConv.td as register 0 (the low 16 bits): r24/r25, register 1 (the high 16 bits): r22/r23, which is word swapped. To fix that, if there were more than two registers in a return value, the old code reversed them and they got assigned the other way around so it all worked, but only for returning simple scalar types. I'm not sure they had ever got struct type returns working. If this was largely unchanged from the old code on Source Forge, that code would have probably only ever been tested on LLVM IR from clang, which probably wouldn't often (ever?) produce a struct return.

@carlos4242
Copy link

OK, I've come up with a patch: https://gist.github.com/carlos4242/f882c9f1213ef006feb6984acef291d7

@crlf0710
Copy link

Built the official 8.0 release code with AVR target enabled, llced the code in OP, and llc segfaulted.

dylanmckay pushed a commit to avr-rust/llvm-project that referenced this issue Jun 8, 2019
Lazarus535 pushed a commit to Lazarus535/llvm-project that referenced this issue Oct 3, 2019
@Lazarus535
Copy link

Lazarus535 commented Oct 9, 2019

I built the LLVM 9.0 from the official rust repo (branch: rustc/9.0-2019-09-19) with just this fix (f1630e5) and compiling the blink example segfaulted.
In this line:

std::reverse(RVLocs.begin()+ pos, RVLocs.end() + pos + Size);

shouldn't it be RVLocs.begin() in both places? Adding to .end() and calling std functions is usually UB, or is it?

@carlos4242
Copy link

It's not a very useful answer but I've never got my head round this code when it comes to what happens with a return type of struct. My guess is what you are saying looks right though. I think you can try it and see if the resulting assembly code gives you the correct ABI machine code when you compile a function with a struct return value, also try a few different types of struct, with different sized members. A simple test maybe using clang with this backend on suitable c test programs and comparing it to the output produced by avr-gcc on the same c test program. The two should definitely be compatible because the AVR llvm ABI is meant to be exactly compatible with the GCC ABI. But ultimately, if you can't face doing this or are worried about edge cases, you might have to wait for @dylanmckay to pick up the issue and take a look. I think he's the author of the patch for handling the struct return type.

@Lazarus535
Copy link

Lazarus535 commented Nov 8, 2019

I've been working on a fix for this issue, as well as some other fixes for some problems related to register allocation in general.

See: https://github.com/Lazarus535/llvm-project/commit/60d76b6d5427455c2ea95bb5787b564c1d215820

Everything works fine for returning structures now and it is compliant with GCC so far.
See this test: llvm/test/CodeGen/AVR/calling-conv/c/ret.ll

I however have one issue remaining:
When retrieving a struct from a function call i get a compiler crash and i don't know why...
See this test: llvm/test/CodeGen/AVR/calling-conv/c/call_ret.ll

The error relates to the following problem:
When returning a struct, it is sometimes necessary to split a i16 register into two i8 registers because it does not align with a predefined i16 value register (e.g. R2524 or R2322). This is because of the GCC calling convention.

The splitting of a register works fine.
See function splitRegistersIfMissalignedRet
In: llvm/lib/Target/AVR/AVRISelLowering.cpp

However the remerging in function LowerCallResult
In: llvm/lib/Target/AVR/AVRISelLowering.cpp
failes.

I lack the deep insight into how LLVM works precisely, but i gave it a shot anyway.
Maybe someone can give me a hint...
I suspect, that the solution is very close.

Thanks everybody.

@carlos4242
Copy link

carlos4242 commented Nov 8, 2019 via email

@Lazarus535
Copy link

Please check out this Compiler Explorer Link to understand what i mean with splitting:
https://godbolt.org/z/6tq_Vz

@dylanmckay
Copy link
Member Author

I suspect this would be fixed by https://reviews.llvm.org/D68524

@carlos4242
Copy link

carlos4242 commented Nov 26, 2019 via email

@dylanmckay
Copy link
Member Author

Can confirm this is fixed by D68524.

dylanmckay added a commit to avr-rust/llvm-project that referenced this issue Nov 27, 2019
This fixes avr-rust/rust-legacy-fork#92.

At the time of cherry-picking, this patch has not yet been upstreamed.
dylanmckay added a commit that referenced this issue Nov 27, 2019
…onvention"

This fixes #92.

At the time of cherry-picking, this patch has not yet been upstreamed.
@dylanmckay dylanmckay changed the title [LLVM 6.0] UNREACHABLE Impossible reg-to-reg copy UNREACHABLE Impossible reg-to-reg copy Jun 19, 2020
@dylanmckay
Copy link
Member Author

Can confirm that this is still fixed when occurring on Rust trunk with https://reviews.llvm.org/D68524.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-libcore Affects compiling the core library A-llvm Affects the LLVM AVR backend has-local-patch A patch exists but has not been applied to upstream LLVM has-reduced-testcase A small LLVM IR file exists that demonstrates the problem
Projects
None yet
Development

No branches or pull requests

9 participants