[CIR] [Lowering] [X86_64] Support VAArg for LongDouble #1150

Open
wants to merge 2,087 commits into base: main

ChuanqiXu9
Member

Recommit #1101

I am not sure what happened, but that merged PR doesn't show up in the git log. Maybe the stacked PR didn't merge successfully? Either way, we need to land it again.

The original commit message follows:


This is a follow-up to #1100.

After #1100, using LongDouble with VAArg still runs into trouble because of details in the X86_64 ABI, and this patch tries to address that.

The practical impact is that, after this patch, together with #1088 and a small follow-up fix, we can build and run all the C benchmarks in SpecCPU 2017. I think it is a milestone.
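
For context, here is a minimal sketch of the kind of source code this patch is about: a variadic function that reads `long double` arguments with `va_arg`. The function name `sumLongDoubles` is hypothetical and is not taken from the patch or its tests. Under the x86-64 System V ABI, variadic `long double` arguments are read from the overflow (stack) area with 16-byte alignment rather than from the register save area, which is presumably the ABI detail the description refers to.

```cpp
#include <cassert>
#include <cstdarg>

// Hypothetical example function (not from the patch): sums `count`
// long double values passed through the variadic tail.
long double sumLongDoubles(int count, ...) {
  assert(count >= 0 && "count must not be negative");
  va_list args;
  va_start(args, count);
  long double total = 0.0L;
  for (int i = 0; i < count; ++i) {
    // Each va_arg here is the operation the lowering has to handle.
    total += va_arg(args, long double);
  }
  va_end(args);
  return total;
}
```

Callers must pass actual `long double` values (note the `L` suffix), for example `sumLongDoubles(3, 1.5L, 2.25L, 4.0L)`; passing a plain `double` and reading it back as `long double` is undefined behavior.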

smeenai and others added 30 commits November 2, 2024 23:31
Directly erasing the op causes a use after free later on, presumably
because the lowering framework isn't aware of the op being deleted. This
fixes `clang/test/CIR/CodeGen/pointer-arith-ext.c` with ASAN.
The loop was erasing the user of a value while iterating on the value's
users, which results in a use after free. We're already assuming (and
asserting) that there's only one user, so we can just access it directly
instead. CIR/Transforms/Target/x86_64/x86_64-call-conv-lowering-pass.cpp
was failing with ASAN before this change. We're now ASAN-clean except
for llvm#829 (which is also in
progress).
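
The use-after-free pattern described above can be sketched outside of MLIR. This is only an illustration under assumptions: `Value`, its `users` list, and `takeSingleUser` are hypothetical stand-ins, not the types or functions touched by the commit.

```cpp
#include <cassert>
#include <list>
#include <string>

// Hypothetical stand-in for an IR value that tracks its users.
struct Value {
  std::list<std::string> users;
};

// Buggy shape (shown as a comment only): erasing an element while a
// range-for is iterating over the same container invalidates the iterator.
//   for (auto &u : v.users) v.users.remove(u);

// Safer shape: when exactly one user is expected, assert that and access
// it directly instead of looping.
std::string takeSingleUser(Value &v) {
  assert(v.users.size() == 1 && "expected exactly one user");
  std::string user = v.users.front(); // copy out before erasing
  v.users.pop_front();                // no live iterator to invalidate
  return user;
}
```

The design point is the same as in the commit message: once the single-user assumption is asserted, there is no need for a loop at all.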
Reland llvm#638

This was reverted due to llvm#655. I
tried to address the problem in the newest commit.

The changes in this PR since the last landed one include:
- Move the definition of `cir::CIRGenConsumer` to
`clang/include/clang/CIRFrontendAction/CIRGenConsumer.h`, and leave its
`HandleTranslationUnit` interface empty, so that
`cir::CIRGenConsumer` no longer needs to depend on CodeGen.
- Rename the old definition of `cir::CIRGenConsumer` in
`clang/lib/CIR/FrontendAction/CIRGenAction.cpp` to
`CIRLoweringConsumer`, which inherits from `cir::CIRGenConsumer` and
implements the original `HandleTranslationUnit` interface.

I feel this may improve readability even without my original
patch.
This PR fixes the lowering for multi dimensional arrays.

Consider the following code snippet `test.c`: 
```
void foo() {
  char arr[4][1] = {"a", "b", "c", "d"};
}
```

When run with `bin/clang test.c -Xclang -fclangir -Xclang -emit-llvm -S
-o -`, it produces the following error:
```
~/clangir/llvm/include/llvm/Support/Casting.h:566: decltype(auto) llvm::cast(const From&) [with To = mlir::ArrayAttr; From = mlir::Attribute]: Assertion `isa<To>(Val) && "cast<Ty>() argument of incompatible type!"' failed.
```

The bug can be traced back to `LoweringHelpers.cpp`. It considers the
values in the array as integer types, and this causes an error in this
case.

This PR updates `convertToDenseElementsAttrImpl` when the array contains
string attributes. I have also added one more similar test. Note that in
the tests I used a **literal match** to avoid matching as regex, so
`!dbg` is useful.
Support expressions at the top level such as

const unsigned int n = 1234;
const int &r = (const int&)n;

Reviewers: bcardosolopes

Pull Request: llvm#857
Fix llvm#829

Thanks @smeenai for pointing out the root cause and UBSan failure!
As title.
Also introduced a `buildAArch64NeonCall` skeleton, which is partially the
counterpart of OG's `EmitNeonCall`. It could be used for many other
neon intrinsics.

---------

Co-authored-by: Guojin He <[email protected]>
These were uninitialized, which led to intermittent test failures from
the use of uninitialized variables. Initialize them to `nullptr` as is
done with other member variables that are pointers to fix this.
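
A minimal sketch of the fix pattern, assuming hypothetical names: `Block`, `FunctionState`, and its members are illustrative and are not the actual CGF members changed by the commit.

```cpp
#include <cassert>

struct Block {}; // hypothetical placeholder type

// Hypothetical class: pointer members get default member initializers,
// so reading them before assignment yields nullptr instead of garbage.
struct FunctionState {
  Block *currentBlock = nullptr;
  Block *returnBlock = nullptr;

  bool hasReturnBlock() const { return returnBlock != nullptr; }

  void setReturnBlock(Block *block) {
    assert(block && "return block must not be null");
    returnBlock = block;
  }
};
```

With the initializers in place, a check such as `hasReturnBlock()` behaves the same on every run, which is what removes the intermittent failures.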

I did a quick spot-check and didn't find other uninitialized variables
in the main CGF class itself. Lots of subclasses have uninitialized
member variables, but those are presumably expected to be initialized at
all points of construction, so we can leave them alone until they cause
any issues.

`ninja check-clang-cir` now passes with ASan+UBSan and MSan.

Fixes llvm#829
This PR adds aarch64 big endian support.

Support for aarch64_be itself amounts to two extra cases in the switch
statement; the changes in `CIRDataLayout` are needed to show that we
really support big endian. Hence the idea for the test: I think the best
proof is something connected with bit-fields, so we compare the results
of the original codegen and ours.
This PR splits the old `cir-simplify` pass into two new passes, namely
`cir-canonicalize` and `cir-simplify` (the new `cir-simplify`). The
`cir-canonicalize` pass runs transformations that do not affect
CIR-to-source fidelity much, such as operation folding and redundant
operation elimination. On the other hand, the new `cir-simplify` pass
runs transformations that may significantly change the code and break
high-level code analysis passes, such as more aggressive code
optimizations.

This PR also updates the CIR-to-CIR pipeline to fit these two new
passes. The `cir-canonicalize` pass is moved to the very front of the
pipeline, while the new `cir-simplify` pass is moved to the back of the
pipeline (but still before lowering prepare of course). Additionally,
the new `cir-simplify` now only runs when the user specifies a non-zero
optimization level on the frontend.

Also fixed some typos and resolved some `clang-tidy` complaints along
the way.

Resolves llvm#827 .
Currently the C style cast is not implemented/supported for unions.

This PR adds support for union casts as done in `CGExprAgg.cpp`. I have
also added an extra test in `union-init.c`.
Mistakenly closed llvm#850

llvm#850 (review)
 
This PR fixes array initialization for expression arguments. 

Consider the following code snippet `test.c`: 
```
typedef struct {
  int a;
  int b[2];
} A;

int bar() {
  return 42;
}

void foo() {
  A a = {bar(), {}};
}
```
When run with `bin/clang test.c -Xclang -fclangir -Xclang -emit-cir -S
-o -`, it produces the following error:
```
~/clangir/clang/lib/CIR/CodeGen/CIRGenExprAgg.cpp:483: void {anonymous}::AggExprEmitter::buildArrayInit(cir::Address, mlir::cir::ArrayType, clang::QualType, clang::Expr*, llvm::ArrayRef<clang::Expr*>, clang::Expr*): Assertion `NumInitElements != 0' failed.
```
The error can be traced back to `CIRGenExprAgg.cpp`, and the fix is
simple. It is possible to have an empty array initialization as an
expression argument!
As the title says, if the element type of a vector type is sized, then the
vector type should be deemed sized.
This would enable us to generate code for neon without triggering an assertion.
…eon_vrndaq_v (llvm#871)

as title. 
This also added NeonType support for Float32

Co-authored-by: Guojin He <[email protected]>
It will hit another assert when calling initFullExprCleanup.
This PR fixes the case where a temporary variable is used and the `alloca`
operation is inserted at the start of the block, before the `label` operation.
Implementation: when we search for the place to put the `alloca` in a block, we
take label operations into account as well.

Fix llvm#870

---------

Co-authored-by: Bruno Cardoso Lopes <[email protected]>
`__attribute__((annotate()))` was only accepting integer literals, which
prevented some meta-programming uses, for example.
This should be extended to some other kinds of types.

---------

Co-authored-by: Bruno Cardoso Lopes <[email protected]>
Just as the title says, but only covers non-exception path, that's
coming next.
Nothing unblocked yet, just hit next assert in the same path.
… exceptions

Code path still hits an assert sooner, incremental NFC step.
…lvm#878)

Close llvm#876

We already handle the case where arbitrary statements follow a
switch case:

```
for (auto *c : compoundStmt->body()) {
      if (auto *switchCase = dyn_cast<SwitchCase>(c)) {
        res = buildSwitchCase(*switchCase, condType, caseAttrs);
      } else if (lastCaseBlock) {
        // This means it's a random stmt following up a case, just
        // emit it as part of previous known case.
        mlir::OpBuilder::InsertionGuard guardCase(builder);
        builder.setInsertionPointToEnd(lastCaseBlock);
        res = buildStmt(c, /*useCurrentScope=*/!isa<CompoundStmt>(c));
      } else {
        llvm_unreachable("statement doesn't belong to any case region, NYI");
      }

      lastCaseBlock = builder.getBlock();

      if (res.failed())
        break;
}
```

However, perhaps as an oversight, in the `else if (lastCaseBlock)` branch
the insertion point is restored automatically when the RAII object
`guardCase` is destroyed, so the later assignment to `lastCaseBlock` no
longer picks up the correct block. That is how we end up with the odd code
pattern shown in the issue.

BTW, I found that the code in CIRGenStmt.cpp is far less similar to the
code in other codegen places. Is this intentional? If so, what are the
motivation and guidelines here?
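
The effect of the guard's lifetime can be sketched with a toy model. Everything here is a hypothetical stand-in (`Builder`, `InsertionGuard`, integer block ids); it mimics the save-and-restore behavior of `mlir::OpBuilder::InsertionGuard` but is not the real MLIR API and is not the fix applied in the PR.

```cpp
#include <cassert>

// Hypothetical miniature of a builder that has a current insertion block.
struct Builder {
  int block = 0; // id of the block currently being inserted into
  int getBlock() const { return block; }
  void setInsertionPointToEnd(int b) {
    assert(b >= 0 && "block ids are non-negative in this sketch");
    block = b;
  }
};

// Hypothetical RAII guard: saves the insertion point on construction and
// restores it on destruction.
class InsertionGuard {
  Builder &builder;
  int saved;

public:
  explicit InsertionGuard(Builder &b) : builder(b), saved(b.getBlock()) {}
  ~InsertionGuard() { builder.setInsertionPointToEnd(saved); }
  InsertionGuard(const InsertionGuard &) = delete;
  InsertionGuard &operator=(const InsertionGuard &) = delete;
};

// Reads the block after the guard's scope has ended: the insertion point
// has already been restored, so the case block is not observed.
int blockReadAfterGuard(Builder &b, int caseBlock) {
  {
    InsertionGuard guard(b);
    b.setInsertionPointToEnd(caseBlock);
  } // guard destroyed here, insertion point restored
  return b.getBlock();
}

// Reads the block while the guard is still alive: the case block is seen.
int blockReadInsideGuard(Builder &b, int caseBlock) {
  InsertionGuard guard(b);
  b.setInsertionPointToEnd(caseBlock);
  return b.getBlock(); // evaluated before the guard's destructor runs
}
```

Starting from block 1 with a case block of 7, `blockReadAfterGuard` returns 1 while `blockReadInsideGuard` returns 7, which mirrors why reading `builder.getBlock()` after the guarded branch yields the wrong `lastCaseBlock`.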
ghehg and others added 14 commits November 14, 2024 21:33
This is going to be raised in follow-up work, which is hard to
do in one go because createBaseClassAddr goes off the OG skeleton
and ideally we want ApplyNonVirtualAndVirtualOffset to work naturally.

This also doesn't handle null checks, coming next.
Now that we fixed the dep on VBase, clean up the rest of the function.
It was always the intention for `cir.cmp` operations to return a bool
result. Due to missing constraints, a bug slipped into codegen that
created `cir.cmp` operations whose result type matches the original AST
expression type. In C, as opposed to C++, boolean expression types are
"int". This resulted in extra operations being codegened around boolean
expressions and their usage.

This commit both enforces `cir.cmp` in the op definition and fixes the
mentioned bug.
…vm#1135)

support `llvm.intr.memset.inline` in llvm-project repo before we add
support for `__builtin_memset_inline` in clangir

cc @bcardosolopes

(cherry picked from commit 30753af)
This is the first patch to support TBAA, following the discussion at
llvm#1076 (comment)

- add skeleton for CIRGen, utilizing `decorateOperationWithTBAA`
- add empty implementation in `CIRGenTBAA`
- introduce `CIR_TBAAAttr` with empty body
- attach `CIR_TBAAAttr` to `LoadOp` and `StoreOp`
- no handling of vtable pointer
- no LLVM lowering
)

The title describes the purpose of the PR. It adds initial support for
structures with padding to the call convention lowering for AArch64.

I have also added _initial support_ for the missing feature
[FinishLayout](https://github.com/llvm/clangir/blob/5c5d58402bebdb1e851fb055f746662d4e7eb586/clang/lib/AST/RecordLayoutBuilder.cpp#L786)
for records; the logic is taken from the original codegen.

Finally, I added a test for verification.

github-actions bot commented Nov 21, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Collaborator

@smeenai smeenai left a comment

Yeah, looks like the stacked PR was incorrectly merged into a user branch instead of main. This looks good after you fix the formatting.

@ChuanqiXu9
Member Author

> Yeah, looks like the stacked PR was incorrectly merged into a user branch instead of main. This looks good after you fix the formatting.

Done

@smeenai
Collaborator

smeenai commented Nov 21, 2024

The test failure looks related, though I don't know why it would only fail on Windows.

@ChuanqiXu9
Member Author

It turns out to be a weird ordering mismatch. In my environment, I see:

cir.va.start %4 : !cir.ptr<!ty___va_list_tag> loc(#loc23)
    %5 = cir.cast(array_to_ptrdecay, %2 : !cir.ptr<!cir.array<!ty___va_list_tag x 1>>), !cir.ptr<!ty___va_list_tag> loc(#loc21)
    %6 = cir.get_member %5[2] {name = "overflow_arg_area"} : !cir.ptr<!ty___va_list_tag> -> !cir.ptr<!cir.ptr<!void>> loc(#loc21)
    %7 = cir.load %6 : !cir.ptr<!cir.ptr<!void>>, !cir.ptr<!void> loc(#loc21)
    %8 = cir.cast(bitcast, %7 : !cir.ptr<!void>), !cir.ptr<!u8i> loc(#loc21)
    %9 = cir.const #cir.int<15> : !u32i loc(#loc21)
    %10 = cir.ptr_stride(%8 : !cir.ptr<!u8i>, %9 : !u32i), !cir.ptr<!u8i> loc(#loc21)

where the cast comes before the const.

but the output in the above link is:

cir.va.start %4 : !cir.ptr<!ty___va_list_tag> loc(#loc23) 
# |             66:  %5 = cir.cast(array_to_ptrdecay, %2 : !cir.ptr<!cir.array<!ty___va_list_tag x 1>>), !cir.ptr<!ty___va_list_tag> loc(#loc21) 
# |             67:  %6 = cir.get_member %5[2] {name = "overflow_arg_area"} : !cir.ptr<!ty___va_list_tag> -> !cir.ptr<!cir.ptr<!void>> loc(#loc21) 
# |             68:  %7 = cir.load %6 : !cir.ptr<!cir.ptr<!void>>, !cir.ptr<!void> loc(#loc21) 
# |             69:  %8 = cir.const #cir.int<15> : !u32i loc(#loc21) 
# |             70:  %9 = cir.cast(bitcast, %7 : !cir.ptr<!void>), !cir.ptr<!u8i> loc(#loc21) 
# | check:122'0                                                  X~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found
# |             71:  %10 = cir.ptr_stride(%9 : !cir.ptr<!u8i>, %8 : !u32i), !cir.ptr<!u8i> loc(#loc21) 

where the const comes before the cast.

Hmm, we could try to use `CHECK-DAG` to check it, but I would prefer not to debug through CI (I don't have a Windows environment), so I would rather skip this test on Windows.
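
As a rough sketch of what order-insensitive checks could look like: the FileCheck lines below are hypothetical and are not copied from the actual test in this PR. They reuse only the IR shown above, and capture the two values in variables so that either ordering of the cast and the const matches. The function `firstLongDouble` is likewise an illustrative stand-in for the test input.

```cpp
#include <cassert>
#include <cstdarg>

// Hypothetical FileCheck directives: the two CHECK-DAG lines may match in
// either order, and the following CHECK line uses the captured names.
// CHECK-DAG: [[CAST:%[0-9]+]] = cir.cast(bitcast, {{.*}} : !cir.ptr<!void>), !cir.ptr<!u8i>
// CHECK-DAG: [[ALIGN:%[0-9]+]] = cir.const #cir.int<15> : !u32i
// CHECK: cir.ptr_stride([[CAST]] : !cir.ptr<!u8i>, [[ALIGN]] : !u32i), !cir.ptr<!u8i>
long double firstLongDouble(int count, ...) {
  assert(count > 0 && "at least one variadic argument is required");
  va_list args;
  va_start(args, count);
  long double value = va_arg(args, long double);
  va_end(args);
  return value;
}
```

Checks written this way tolerate the reordering seen on Windows, though they would not explain why the order differs in the first place.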

@bcardosolopes
Member

> I would prefer not to debug through CI (I don't have a Windows environment), so I would rather skip this test on Windows.

Doesn't seem like we have another option? If DAG works why not use it? Disabling stuff for other platforms will create tech debt for no good reason here

@smeenai
Collaborator

smeenai commented Nov 22, 2024

I think it's pretty strange that we're getting different IR output on different machines. We're specifying a target triple, so it should be fully deterministic, right?

@bcardosolopes
Member

> I think it's pretty strange that we're getting different IR output on different machines

True. If this is exposing non-deterministic behavior, this looks like the right opportunity to understand and fix it (or at least, if we understand it and it turns out to be more complex, an issue can be created and this could be addressed in a follow-up PR).
