Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wrong value for BOOL_WIDTH #117348

Closed
gustedt opened this issue Nov 22, 2024 · 10 comments · Fixed by #117364
Closed

wrong value for BOOL_WIDTH #117348

gustedt opened this issue Nov 22, 2024 · 10 comments · Fixed by #117364
Assignees
Labels
c23 clang:frontend Language frontend issues, e.g. anything involving "Sema" confirmed Verified by a second party

Comments

@gustedt
Copy link

gustedt commented Nov 22, 2024

C23 has a new feature test macro BOOL_WIDTH which is fixed by the standard to the value 1.
Clang has it as 8 in v. 14 to 20, gcc has the correct value.

Test this by compiling with -std=c2x

#include <limits.h>
static_assert(BOOL_WIDTH == 1);

Thanks
Jens

@AaronBallman AaronBallman added good first issue https://github.com/llvm/llvm-project/contribute c23 confirmed Verified by a second party labels Nov 22, 2024
@llvmbot
Copy link
Member

llvmbot commented Nov 22, 2024

Hi!

This issue may be a good introductory issue for people new to working on LLVM. If you would like to work on this issue, your first steps are:

  1. Check that no other contributor has already been assigned to this issue. If you believe that no one is actually working on it despite an assignment, ping the person. After one week without a response, the assignee may be changed.
  2. In the comments of this issue, request for it to be assigned to you, or just create a pull request after following the steps below. Mention this issue in the description of the pull request.
  3. Fix the issue locally.
  4. Run the test suite locally. Remember that the subdirectories under test/ create fine-grained testing targets, so you can e.g. use make check-clang-ast to only run Clang's AST tests.
  5. Create a Git commit.
  6. Run git clang-format HEAD~1 to format your changes.
  7. Open a pull request to the upstream repository on GitHub. Detailed instructions can be found in GitHub's documentation. Mention this issue in the description of the pull request.

If you have any further questions about this issue, don't hesitate to ask via a comment in the thread below.

@llvmbot
Copy link
Member

llvmbot commented Nov 22, 2024

@llvm/issue-subscribers-good-first-issue

Author: Jens Gustedt (gustedt)

C23 has a new feature test macro BOOL_WIDTH which is fixed by the standard to the value 1. Clang has it as 8 in v. 14 to 20, gcc has the correct value.

Test this by compiling with -std=c2x

#include &lt;limits.h&gt;
static_assert(BOOL_WIDTH == 1);

Thanks
Jens

@AaronBallman AaronBallman added clang:frontend Language frontend issues, e.g. anything involving "Sema" and removed good first issue https://github.com/llvm/llvm-project/contribute new issue labels Nov 22, 2024
@llvmbot
Copy link
Member

llvmbot commented Nov 22, 2024

@llvm/issue-subscribers-clang-frontend

Author: Jens Gustedt (gustedt)

C23 has a new feature test macro BOOL_WIDTH which is fixed by the standard to the value 1. Clang has it as 8 in v. 14 to 20, gcc has the correct value.

Test this by compiling with -std=c2x

#include &lt;limits.h&gt;
static_assert(BOOL_WIDTH == 1);

Thanks
Jens

@AaronBallman
Copy link
Collaborator

https://github.com/llvm/llvm-project/blob/main/clang/lib/Frontend/InitPreprocessor.cpp#L1106 is where we define the macro, and it's based on what each target reports. However, changing that value also impacts ASTContext::getTypeInfo() which is called in quite a few places; we'd have to investigate whether changing the value makes a difference in those places or not, so it may not be a good first issue (but it may still be a trivial change for us).

Note, Clang's current behavior is conforming. C23 5.2.5.3.2p1 specifies the minimum values but they can be replaced by implementation-defined values. I confirmed the issue because I don't know of a good reason why we should diverge from GCC's behavior.

@gustedt
Copy link
Author

gustedt commented Nov 22, 2024

Note, Clang's current behavior is conforming. C23 5.2.5.3.2p1 specifies the minimum values but they can be replaced by > implementation-defined values. I confirmed the issue because I don't know of a good reason why we should diverge
from GCC's behavior.

No, note the little footnote to the item in the C standard for <limits.h> says that it is an exact value. The normative text is in 6.2.6.2 p1

... The type bool shall have one value bit and (sizeof(bool)*CHAR_BIT)- 1 padding bits. ...

So the problem may even run deeper than just defining the correct value for the macro.

@AaronBallman
Copy link
Collaborator

AaronBallman commented Nov 22, 2024

No, note the little footnote to the item in the C standard for <limits.h> says that it is an exact value. The normative text is in 6.2.6.2 p1

Footnotes are not normative, so that's a bug in the standard we should fix. The intro text is very explicitly contradicting that footnote:

The values given subsequently shall be replaced by constant expressions suitable for use in conditional expression inclusion preprocessing directives. Their implementation-defined values shall be equal or greater to those shown.
— width for an object of type bool
BOOL_WIDTH 1

(Oh, I think I see what you mean, we do say "width for an object of type bool" which could be seen as a descriptor or as a requirement. I read it as a descriptor and you read it as a requirement. Neat!)

... The type bool shall have one value bit and (sizeof(bool)*CHAR_BIT)- 1 padding bits. ...
So the problem may even run deeper than just defining the correct value for the macro.

I think we may be okay here, because we lower bool to an i1 for LLVM: https://godbolt.org/z/7WWxvzhYb (CC @nikic for verification on that)

@AaronBallman
Copy link
Collaborator

Based on comments I spotted at

/// Fields controlling how types are laid out in memory; these may need to
, I think we should hard-code __BOOL_WIDTH__ to 1 in InitPreprocesor.cpp and not touch TargetInfo. The getBoolWidth() function is returning the number of bits in the object representation, not the value representation. But I'm double-checking this reasoning.

@gustedt
Copy link
Author

gustedt commented Nov 22, 2024

No Aaron (to the previous comment). The value could in theory be different for types that have a different width than the minimum width as defined here. For types that have a width that is fixed by the standard there is no wiggle room. So note that the leeway for the implementation is not the value of the macro, the macro is there to represent the effective width of the type.

And it was a conscious decision by WG14 to present it as it is presented.

What is normative is the text that I cited and which fixes the number of value bits to one. (The number of value bits is the definition of the width.)

With which integer type you represent the type internally is llvm's business, obviously, you'd only have to ensure that all other bits than the lowest are considered as padding bits.

@gustedt
Copy link
Author

gustedt commented Nov 22, 2024

Based on comments I spotted at

/// Fields controlling how types are laid out in memory; these may need to

, I think we should hard-code __BOOL_WIDTH__ to 1 in InitPreprocesor.cpp and not touch TargetInfo. The getBoolWidth() function is returning the number of bits in the object representation, not the value representation. But I'm double-checking this reasoning.

The name of getBoolWidh is then very misleading. The width is not the number of bits in the representation. That's what the combination of sizeof (or _SIZE macros) and CHAR_BIT macro is for.

(And also I do not know what you mean by a descriptor. This macro is meant to expand to the number of value bits of the types, which since C23 is 1 in this case.)

@AaronBallman
Copy link
Collaborator

No Aaron (to the previous comment). The value could in theory be different for types that have a different width than the minimum width as defined here. For types that have a width that is fixed by the standard there is no wiggle room. So note that the leeway for the implementation is not the value of the macro, the macro is there to represent the effective width of the type.

I would like the standard to say that more clearly as it is a point of needless confusion. We should not explicitly say something can be implementation-defined to be greater than the values shown when we mean something more specific.

And it was a conscious decision by WG14 to present it as it is presented.

Great, and we can make a decision whether we want to change it now that it's caused problems in practice. :-)

What is normative is the text that I cited and which fixes the number of value bits to one. (The number of value bits is the definition of the width.)

Correct, that's the definition of width as a term of art. But we're talking about _WIDTH macros, which do (clearly) not say that they expand to the width (term of art) of the type.

(And also I do not know what you mean by a descriptor. This macro is meant to expand to the number of value bits of the types, which since C23 is 1 in this case.)

"width for an object of type bool"
"width for an object of type unsigned short int"
"width for an object of type unsigned int"

etc can be read as introductory text; there's no cross-reference or other indication that "width" here means to be connected with the later-defined term of art.

Basically, I think the standards wording would be more clear if 5.2.5.3.2 said something along the lines of "Each standard unsigned integer type (6.2.5) has a corresponding width macro. The implementation-defined values of these macros shall be the number of bits in the value representation of the type (6.2.6.2)." This makes it very explicit that the macros expand to the width (term of art for the number of bits in the value representation) and not the width (ambiguous English term that means different things in different contexts, such as the standard format specifiers talking about field widths, compilers that have poorly named interfaces, etc).

Regardless, this is a Clang bug for more than just GCC compatibility reasons; in that I think we're strongly agreed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c23 clang:frontend Language frontend issues, e.g. anything involving "Sema" confirmed Verified by a second party
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants