Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Go: Fix missing promoted fields due to name clash #18001

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

owen-mc
Copy link
Contributor

@owen-mc owen-mc commented Nov 17, 2024

When two embedded fields at different depths have the same name (but are not the same type - they are in different packages and happen to have the same name) then we don't treat the second one correctly and we don't promote any of its fields or methods. This turns out to be because of a condition that was added to avoid non-termination on cyclic structs in github/codeql-go#184 which only checked the name of the embedded field. The fix is to also check the type.

While looking into this I noticed that we have two different predicates for calculating field candidates. This PR includes a refactoring to remove the redundancy and make the code easier to understand.

@owen-mc
Copy link
Contributor Author

owen-mc commented Nov 18, 2024

Hmm, the tests pass locally for me. I'll have to look into that more.

@owen-mc owen-mc force-pushed the go/fix/missing-promoted-fields branch 2 times, most recently from 2547c83 to 4a05c3d Compare November 22, 2024 00:17
@owen-mc
Copy link
Contributor Author

owen-mc commented Nov 22, 2024

I figured it out. I was using the released CLI, but #17941 changes the extractor and hence the test results, so I needed to build and use the go extractor locally.

This is ready to review now.

smowton
smowton previously approved these changes Nov 22, 2024
Copy link
Contributor

@smowton smowton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't reviewed the tests in detail, but I'm 99% convinced the elimination of getFieldCand vs hasFieldCand is sound. One difference jumps out: getFieldCand uses hasEmbeddedField which doesn't look through a NamedType before using .(PointerType).getBaseType(), getFieldOfEmbedded does. This might make a difference if X is embedded where X is defined type X *Y, in a positive direction. I note hasMethodCand uses this embedded-field predicate as well.

Definitely needs DCA.

@owen-mc
Copy link
Contributor Author

owen-mc commented Nov 25, 2024

Note that getFieldCand uses hasEmbeddedField, which is defined using hasFieldCand. So it seems independent, but actually it isn't.

@smowton
Copy link
Contributor

smowton commented Nov 26, 2024

That's true, but for one layer we'll miss named pointer types AFAICT:

    exists(Field f | this.hasFieldCand(_, f, depth, true) |
      tp = f.getType() or
      tp = f.getType().(PointerType).getBaseType()
    )

That means getFieldCand -> hasEmbeddedField will miss the case where there's an embed of type Named which is defined as *MyEmbeddedStruct. For deeper embeddings it then uses hasFieldCand -> getFieldOfEmbedded, so this will only affect depth-1 embeddings.

@owen-mc
Copy link
Contributor Author

owen-mc commented Nov 26, 2024

I wrote a test for what you're describing.

type BazPointer *Baz;

type EmbedsBazPointer struct {
	BazPointer
}

It gave me a syntax error: "embedded field type cannot be a pointercompiler" and this link: InvalidPtrEmbed. The relevant part of the spec says:

An embedded field must be specified as a type name T or as a pointer to a non-interface type name *T, and T itself may not be a pointer type. The unqualified type name acts as the field name.

So I think it's okay that the new code-path doesn't exactly do what the old one did, because the missing case is for invalid code.

@smowton
Copy link
Contributor

smowton commented Nov 26, 2024

OK great. That actually suggests then to simplify getFieldOfEmbedded since that does a getUnderlyingType jump prior to checking for a pointer type that in fact can't happen.

@owen-mc
Copy link
Contributor Author

owen-mc commented Nov 26, 2024

I ran DCA before the simplification and it was fine. I've started another one. Clearly that code had been inefficient before, because it had a pragma. Hopefully with the simplification which means it can use lookThroughPointerType it will be optimized well without any guidance this time.

@owen-mc
Copy link
Contributor Author

owen-mc commented Nov 26, 2024

The second DCA seems to show a 2% speed up on the two biggest dbs. I should probably run QA in case there are any pathological cases.

NCField should be promoted to EmbedsNameClash. Currently it isn't
because its embedded parent pkg2.NameClash is not a promoted field in
EmbedsNameClash (because of a name clash with pkg1.NameClash), but this
should not make a difference.
If `T` is the type of an embedded field, it is invalid for `T` to be a
named type defined to be a pointer type (`type T *S`). It is also
invalid for `T` to be a type parameter. So this `getUnderlyingType()` is
redundant.
@owen-mc owen-mc force-pushed the go/fix/missing-promoted-fields branch from 704e0f7 to 0e94ee8 Compare November 26, 2024 22:26
@owen-mc
Copy link
Contributor Author

owen-mc commented Nov 27, 2024

I messed up the QA run by using a badly chosen nightly base, so the results are dominated by the results for #17494. However, I can see that there aren't any alert changes in queries which aren't affected by that PR, and there aren't any noticeable slowdowns. So I am okay with merging this now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants