Standardizing CI OIDC token claims #754
Comments
Another useful data point: Tekton (which also drives other tools like JenkinsX) would be limited by the Kubernetes JWT workload claims. (I assume this would also affect Prow and other Kubernetes-based CI as well.) |
I don't think that's correct. In your link |
I'm actually on the side that it should not: these values can easily be added inside a signed attestation -- this is very much like recreating provenance inside a signing certificate. See slsa-framework/slsa#464 related issue. The signing cert contains enough builder information that "You could think of the x509 builder as a first-stage builder, which is limited but sets the "root of trust". " We definitely don't NEED to include all the provenance inside the certificate. @laurentsimon
Again, I think this starts turning into provenance metadata. At minimum the signing cert should contain just the necessary info to identify the workflow: including the caller and called workflow and its commit SHA.
BIG +1! GitHub does expose repository IDs. Although, think of the verification side: it is much harder for humans to verify the cert fields to see if a signature came from a repository when it is a repository ID. Again, that can go in provenance info (EDIT: maybe not? since the repository might be an untrusted resurrected one) |
Thanks for noting this, I've removed this from the issue description so it's now required.
I am in agreement, I believe Laurent as well from the discussion.
I think we should be building verification policies around IDs and not human-readable values. I do agree it's harder for a human to validate it though, but I think this can be solved with better UX. |
Related to the The
Whereas the
This name isn't guaranteed to be unique across the workflows defined in a particular repo so this isn't particularly useful in identifying the calling workflow. |
Maybe a broader question is whether you see Sigstore as a building block / "root of trust" for "richer" systems (re-usable workflows and similar systems) or not.
If you consider Sigstore the fundamental building block / enabler / RoT, then you may not need to keep adding more fields in the OIDC token / cert. +1 on agreeing on the set of minimum claims, like workflow(s), ref(s) and repo. I don't know enough about other non-GitHub CIs to say anything about actors and other pieces. On GitHub it's part of the GH context and can be retrieved by the "richer" builder, so is not necessarily needed. Not sure about other CIs.
If you consider Sigstore a standalone solution for signing only (without richer trusted builders), then additional fields may be something to consider. Maybe it's a product decision...?
Certificates are not the most human-friendly to work with, so it may limit the usability of a solution; whereas adding fields into a JSON-formatted provenance seems easier? Trusted builders can more easily incorporate changes over time (e.g., additional fields, new features), so maybe something to keep in mind as well. |
At this point, I think we're in agreement on the following:
Questions
I'd bet many of these have easy answers and I just don't have enough context/background to know them.
|
+1 on having them, and asking GH to support it including for re-usable workflows if it's not available yet.
self-hosted runners have access to OIDC. You need a round-trip to verify this unless it's added into OIDC token (we asked GH to do that, so it may happen in the future). One additional complexity is that it's possible for a workflow to declare jobs self-hosted and others not.
I don't entirely follow. At least in our case, the build and the provenance generation are separate jobs. The format remains the same, and only the Maybe you're proposing having a dedicated project for provenance generation only? We kind of have this in the generator repo. We don't expose it and only use it internally, though. We could, in theory, expose it through a GitHub Action. Let me know if I misunderstood the comment.
I think the plan is to share the provenance generation code with other builders for a given CI. On GitHub, we could theoretically create an Action for this. /cc @ianlewis |
TY! That helps.
Let's move this conversation over to slsa-framework/slsa-github-generator#763; apologies for the distraction from the root issue in this thread 😄 |
I don't know, but I'll try and track down the team here responsible for this stuff and make some inquiries.
Yeah, there's a |
I'd like to jumpstart some movement on this issue if possible, as we regard it as pretty important for our work on npm attestations, especially now that we have begun to reach out to some potential launch partners (read: cloud CI vendors with existing OIDC support) to talk about integration on their own platforms. Additionally, we have some commitments from the Actions team to extend the OIDC token with the types of fields discussed in this thread (though we may need to get some further alignment there). If we can get crisp on some non-GitHub nomenclature for the cert fields, I feel like we're a long way toward settling this. Is anyone taking a stab at some generic naming notions? Should we try to chat in Sigstore Slack about a plan for settling this into a PR? |
Let's get a chat going either on Slack or here; there hasn't been any progress. |
Chiming in to describe some updates after we've had some conversations. I think some of this echoes @znewman01's discussion earlier. We MUST have the certificate identify (with immutable references) the smallest "trust domain" relevant for client verification. So for GitHub we MUST have:
Stuff I think we can punt:
Stuff I'm not sure of:
If we do something like we MUST have the reusable workflow immutable ref AND the caller immutable ref, then this lines up with the pattern for Buildkite #890 where the reusable workflow is the job_id and the caller immutable ref becomes the organization/pipeline slug. @sj26 |
@asraa the GH Actions team have just added some new claims to the ID token:
This makes sense for trusted builders, which is the north star. I wanted to raise a use-case for npm where it might take a very long time for us to effectively roll out trusted builders in the npm ecosystem given the varied nature of publish workflows in the wild. The majority of existing automated npm publish workflows I've investigated would be hard to support for a trusted builder without a lot of different runtimes and config options. Until we get to a place where most projects end up using trusted builders, we could definitely use more information in the Fulcio cert to be able to validate that key pieces of the provenance statement have not been falsified. This might be a bit of an anti-pattern given the preference for trusted builders to solve this problem. But if we had the repo URL, commit SHA, triggering workflow path and SHA, and/or re-usable workflow path and SHA, we could compare these values in the Fulcio cert against what's in the provenance statement before accepting the package for publishing. Ideally we could access the following GitHub OIDC claims in the Fulcio cert:
Another thought, would it make sense to adopt "SLSA" naming for these attributes in the signing cert?
It might seem redundant to include |
This information should always be present for both the job_workflow and the triggering workflow, even if the caller refers to it by tag / branch. The GitHub context (not OIDC) provides this information for the repository. It's as important that the OIDC token provide this for the workflow / builder as well: sha, ref, ref_type should always be present.
Let's think carefully about making the OIDC format dependent on the SLSA (evolving) specs. In v1.0, for example, |
I have another claim to suggest, unrelated to the (great) conversation above about build instructions and references. Some CI/CD providers allow you to either run your build on their cloud-hosted infrastructure, or let the customer host their own runner infrastructure. In the npm registry, we want to differentiate between builds that ran on cloud-hosted or customer-hosted infrastructure. We think it makes sense to include this claim alongside the other information being securely communicated from the CI/CD system to Fulcio (and then downstream to npm and other package managers). |
runner information makes sense to include, I think, since it's part of the |
Hiya! I'm Sam from Buildkite. We're introducing OIDC tokens, and I'm keen to see if we can enable usage of cosign for signing and verifying provenance of containers produced by CI/CD builds. We include these already:
We include some equivalents to these:
We do not include these:
We add these, which I think are important:
The In terms of things a user might like to verify, I expect the most would be the pipeline (or workflow) which produced an image, and the source branch or tag (ref) which was used. These feel like good generic attributes. "Git Commit" and "Git Ref" for example could be good generic names for the current GitHub attributes "sha" and "ref". "Git Repository" also feels like a good generic attribute, although I would suggest it be a URI to be useful across CI providers instead of a simple
I don't know a good generic name for pipelines or workflows, the container of many runs of a particular ci/cd workflow, but it's closest to
There is no standard for CI/CD provider OIDC tokens to my knowledge, and I'm not aware of any drive to standardise at the moment. The domain models vary significantly, too. I suspect normalizing the claims into useful attributes for verifying will need to remain in this fulcio for now. But perhaps there are some common attributes which will emerge and influence the claims generated in future, like GitHub's tokens.
If I had to pick a set of common attributes which would be useful in sigstore right now, it'd be roughly:
That's a whole bunch of thoughts and opinions, I'm not sure how much of it is useful, but hopefully a bit 🙏 |
Do you have an example of what this looks like? I'm curious why you need the build inputs to be part of the token. If your builder can be identified using
You mean the source of the builder, not the source being built, correct? Do you have a link / example?
I would add Git Ref Type, which indicates if the ref is a branch, tag, etc. It may be useful to pack these fields into its own struct / field / x509 cert, and version it to allow for flexibility, like:
Fyi, I took a brief look at the SPIFFE ID (https://spiffe.io/docs/latest/spiffe-about/spiffe-concepts/), and they don't seem to have much more than what's proposed here (it's just |
It's quite symbolic. That's because some consumers of OIDC tokens, i.e. AWS, only allow writing policies based on partial string matches against subjects. It also does not uniquely identify a piece of work. It's not ideal, but it's what is available.
The builder could create and store attestations, but most consumers of tokens want to make decisions without round trips back to the builder. And then how does one authenticate back to the builder to ask for attestations? If you have a living identity token then maybe it makes sense to use that, but in a signature that token is gone. Again, in the conversations I've been having, most folks want enough information baked into the tokens and/or signatures to make policy decisions without additional round trips or more external systems involved.
Hm, yes I think so. Presuming a "builder" means the same thing as a "pipeline" to both Buildkite and CodePipeline, a builder is generally configured with inputs for new builds, and one of those inputs is the source repository. But the source repository can be changed between builds. So the source repository for two builds run by the same builder may not be the same.
If Git Ref is fully qualified this is already included, no? i.e.
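To illustrate the point about fully qualified refs: in Git, branches live under refs/heads/ and tags under refs/tags/, so the prefix already carries the ref type. A minimal sketch, where the function name refType and the returned labels are assumptions made for illustration and not part of any proposal in this thread:

```go
package main

import (
	"fmt"
	"strings"
)

// refType derives a coarse ref type from a fully qualified Git ref.
// The labels "branch", "tag" and "unknown" are illustrative only.
func refType(ref string) string {
	switch {
	case strings.HasPrefix(ref, "refs/heads/"):
		return "branch"
	case strings.HasPrefix(ref, "refs/tags/"):
		return "tag"
	default:
		// Short names such as "main" are ambiguous and land here.
		return "unknown"
	}
}

func main() {
	for _, ref := range []string{"refs/heads/main", "refs/tags/v1.0.0", "main"} {
		fmt.Printf("%s -> %s\n", ref, refType(ref))
	}
}
```

This only works when the provider emits the fully qualified form; a short ref name would still need a separate ref type field.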
Yeah, interesting. So that's almost pure identity without attributes, unless you understand the URI format for a particular trust domain. More complex policy decisions would need to start from the identity and consult other systems for more context. If you have control over the shape of the SPIFFE ID then it might be easier, but when using a hosted service where the format is dictated then its idea of the trust domain might vary from your own. For example, some workloads might care about branch, but some might not. OIDC seems powerful in contrast because complex policy decisions can be made based on the identity token and the contained attributes, including provenance information, without requiring additional interactions. And these can be varied by consumers without much support from providers (including providers of hosted services). So I guess it depends on what degree of provenance information sigstore would like to include for policy decisions without additional system dependencies or provider control. |
So there's sort of two questions rolled into one here:
It's critical that Fulcio work across many cloud CI/CD providers. But I'm not sure if we'll be able to get all providers to use the same field names. So to help answer question 1, I'm going to reference the current GitHub OIDC token field names for illustrative purposes, even though the goal of this issue (as I understand it!) is to ensure each cloud CI/CD provider is sending Fulcio the information in some field, which may (or may not) have a different name. At any rate, here's an attempt to summarize where we're at so far. First there's some standard OIDC fields:
Then we get into the build attestations:
|
Proposal for standardizing Fulcio's Certificate Extensions to align with the discussion on [standardizing OIDC token claims](sigstore#754) across CI/CD systems (today GitHub Actions, in future Circle, GitLab, Buildkite etc). The aim here is to find new platform agnostic extensions (as the current ones are GitHub specific) that make sense across the different CI/CD providers that we'd like to see supported in Fulcio. I've preemptively moved the existing oid info doc to a deprecated version with little thought into how this transition should happen in practice. This should probably be scoped in a lot more detail. Signed-off-by: Philip Harrison <[email protected]>
👋 I opened a draft PR: #945 - attempting to standardize on the Fulcio cert extensions where these claims would end up. Let me know if this would be better suited in a new issue before starting on a PR but seemed easier to collaborate on an actual file. |
I would love to see standardized claims in CI provider tokens, but is it reasonable for us to expect providers to actually try to become conformant with a standard created here? As an example, many CI providers failed to correctly implement the What would incentivize these various platforms to be compliant? What have they gained for their users if they do? |
The incentive is ease of integration, and a template for a minimum set of claims to represent an identity. We've had many discussions across issues in this repo about what represents an identity vs what represents provenance. Standardizing on a set of claims makes it clear what we consider to be an identity. Additionally, if a CI provider wants to integrate with Fulcio and has implemented the set of claims, it'll be easy not just for the Fulcio integration in terms of the code that needs to be added, but also for all of the clients that need to verify sigstore-issued certificates. If every CI has its own set of claims/OIDs, it'll be difficult to write verification policies across sigstore clients. |
My thinking with #945 was to standardise on the Fulcio cert extensions that cover the identity. This would effectively standardise on a subset of required OIDC claims, but at the same time not require CI/CD providers to conform to the same claim attribute names. CI/CD specific mapping would still need to exist in Fulcio. |
@feelepxyz +1 to standardizing the certificate extensions over the actual token claims. I feel like it's quite a bit easier for providers to marshal / parse existing token claims into the right cert extensions with a small amount of logic in Fulcio itself instead of requiring changes to their token format |
I don't think it's likely CI providers (us included) will change OIDC attributes to suit Sigstore, sorry. Those tokens have too many requirements on them already. But I reckon we'll be happy to provide the grunt to glue them together within sigstore/fulcio. I'm pretty excited that #890 is close to merge. Beyond identifying which pipeline a binary comes from, we have customers asking for the ability to verify which git branch and commit a signed binary comes from, and which build and job (the specific run of a workflow) created a binary too. For example, being able to verify that a binary was produced by an earlier job in the same build, or using the job identity to seek domain-specific attestations via an API. Very keen to see some generalised attributes added. I'm happy to write the plumbing for Buildkite once a direction has been decided. #945 looks pretty promising! |
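A rough sketch of what that "small amount of logic" could look like: a per-provider table that maps existing claim names onto one shared set of fields. Everything here is an assumption for illustration. The Extensions type, its field names, and the "example-ci" provider with its claim names are made up; only the GitHub claim names (repository, sha, ref) come from this issue. It is not Fulcio's actual code.

```go
package main

import (
	"fmt"
)

// Extensions is a hypothetical provider-agnostic view of the identity that
// would end up in certificate extensions. Field names are illustrative.
type Extensions struct {
	SourceRepository string
	SourceCommit     string
	SourceRef        string
}

// claimNames records which token claim feeds each Extensions field.
type claimNames struct {
	repository, commit, ref string
}

// providerClaims holds the per-provider mapping. The "github" names are the
// claims listed in this issue; "example-ci" is a made-up provider.
var providerClaims = map[string]claimNames{
	"github":     {repository: "repository", commit: "sha", ref: "ref"},
	"example-ci": {repository: "project_path", commit: "commit_sha", ref: "git_ref"},
}

// mapClaims converts provider-specific claims into the shared Extensions
// shape, failing if the provider is unknown or a required claim is missing.
func mapClaims(provider string, claims map[string]string) (Extensions, error) {
	names, ok := providerClaims[provider]
	if !ok {
		return Extensions{}, fmt.Errorf("no claim mapping for provider %q", provider)
	}
	var ext Extensions
	fields := []struct {
		claim string
		dst   *string
	}{
		{names.repository, &ext.SourceRepository},
		{names.commit, &ext.SourceCommit},
		{names.ref, &ext.SourceRef},
	}
	for _, f := range fields {
		v, ok := claims[f.claim]
		if !ok || v == "" {
			return Extensions{}, fmt.Errorf("provider %q: missing claim %q", provider, f.claim)
		}
		*f.dst = v
	}
	return ext, nil
}

func main() {
	ext, err := mapClaims("example-ci", map[string]string{
		"project_path": "acme/widgets",
		"commit_sha":   "0123abcd",
		"git_ref":      "refs/heads/main",
	})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", ext)
}
```

With this shape, onboarding a provider is one more table entry, and verifiers only ever see the shared field names.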
Reopening during implementation. I'm starting implementation on this now. |
We've got a certificate!
Which expands to:
Please double check the values match up to what's expected.
Something to note is that the value for each new extension is now in line with what RFC5280 requires, a DER encoded string rather than the raw value[1]. This should hopefully mean that off-the-shelf certificate parsing libraries will have an easier time handling custom extensions.
Just cleaning up the code now and then I'll push up a PR with the changes.
[1] This was never brought up by the Golang clients because it's so easy to get the value of a custom certificate extension. The DER encoding adds two bytes, a tag for type (0x0C, meaning a UTF8String) and the length of the value. This change means clients will have to unmarshal the extension now. For Go, this looks like:
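A minimal sketch of that unmarshalling step using the standard library's encoding/asn1 package. The helper name decodeStringExtension and the sample value are illustrative assumptions, not Fulcio's code; in practice the input bytes would be the Value of the matching entry in an x509.Certificate's Extensions.

```go
package main

import (
	"encoding/asn1"
	"fmt"
)

// decodeStringExtension unmarshals a DER-encoded string extension value
// (for example a UTF8String, tag 0x0C) into a Go string.
func decodeStringExtension(der []byte) (string, error) {
	var value string
	rest, err := asn1.Unmarshal(der, &value)
	if err != nil {
		return "", err
	}
	if len(rest) != 0 {
		return "", fmt.Errorf("unexpected %d trailing bytes after extension value", len(rest))
	}
	return value, nil
}

func main() {
	// Build a sample extension value, forcing the UTF8String type.
	der, err := asn1.MarshalWithParams("https://token.actions.githubusercontent.com", "utf8")
	if err != nil {
		panic(err)
	}
	// The first two bytes are the tag and the (short-form) length.
	fmt.Printf("tag=0x%02X length=%d\n", der[0], der[1])

	value, err := decodeStringExtension(der)
	if err != nil {
		panic(err)
	}
	fmt.Println(value)
}
```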
Very easy still! Now we get the added benefit of being able to specify non-string extension values too. |
@haydentherapper awesome! Thanks for taking this on 😍
Looks like the
Maybe just some rendering weirdness but what's up with the value showing up to the left of the period? Also, presuming the prefixes showing up in the above example are part of the encoding somehow? e.g. Everything else looks good to me! |
Is this encoding the issuer as a DER encoded string? Nit, but should the re-encoded issuer come before |
Nice one 👍 |
Good catch, fixed!
+1 to what Brian said. For example, for |
Yea, I can make that change to move this to |
Goal
Create a standard set of claims that should be present in OIDC tokens from CI systems such as GitHub Actions, Cirrus CI, GitLab, Circle CI, etc.
Background
As noted in the NPM RFC for integrating with Sigstore, and as documented in other tickets (#243, #591, #748), there is interest in support for other CI systems. It is technically possible to implement support for each, but it will require code duplication and work for onboarding every CI platform. It would be ideal if all OIDC tokens from all CI systems had a standard set of claims to represent identity, so that onboarding would simply be updating configuration.
Current state
All of the above platforms either are working on or currently produce OIDC tokens for CI workflows. Fulcio currently only accepts CI tokens from GitHub Actions, and has hardcoded the GitHub specific claim values and produces a code signing certificate with GitHub specific OID values.
Currently expected claims (GitHub ref)
job_workflow_ref
sha
event_name
repository
workflow
ref
aud (which must be set to sigstore)
exp
sha, event_name, repository, workflow, and ref are included in issued certificates in custom OIDs - https://github.com/sigstore/fulcio/blob/main/docs/oid-info.md.
Required claims
The token should include standard OIDC claims like:
aud (which must be customizable and set to sigstore)
sub
iss
exp
iat
nbf
We should include the claims specified in "Currently expected claims".
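As a rough sketch of how the standard claims above might be checked once a token's signature has already been verified (signature verification is deliberately out of scope here). The StandardClaims type and validate function are hypothetical names, aud is assumed to be a single string although real tokens may carry a list, and the time claims are treated as seconds since the Unix epoch:

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// StandardClaims holds the standard OIDC claims listed above.
// exp, iat and nbf are numeric dates (seconds since the Unix epoch).
type StandardClaims struct {
	Issuer    string // iss
	Subject   string // sub
	Audience  string // aud, assumed to be a single value in this sketch
	ExpiresAt int64  // exp
	IssuedAt  int64  // iat
	NotBefore int64  // nbf
}

// validate checks the claims against the current time (Unix seconds).
// It does NOT verify the token signature.
func validate(c StandardClaims, now int64) error {
	if c.Issuer == "" || c.Subject == "" {
		return errors.New("iss and sub are required")
	}
	if c.Audience != "sigstore" {
		return fmt.Errorf("aud must be %q, got %q", "sigstore", c.Audience)
	}
	if now >= c.ExpiresAt {
		return errors.New("token has expired")
	}
	if now < c.NotBefore {
		return errors.New("token is not valid yet")
	}
	if c.IssuedAt > now {
		return errors.New("token was issued in the future")
	}
	return nil
}

func main() {
	now := time.Now().Unix()
	claims := StandardClaims{
		Issuer:    "https://token.actions.githubusercontent.com",
		Subject:   "repo:octo-org/octo-repo:ref:refs/heads/main",
		Audience:  "sigstore",
		ExpiresAt: now + 300,
		IssuedAt:  now,
		NotBefore: now,
	}
	fmt.Println("validation error:", validate(claims, now))
}
```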
There was conversation in #624 about including the run ID (run_id), run count (run_number) and attempt count (run_attempt). We should decide if these should be required for Fulcio certificates.
Another useful claim may be actor, who triggered the CI run.
Any claim values must be immutable. For example, user IDs should be used instead of usernames, and repository IDs should be used instead of repository names, to prevent resurrection attacks.
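A small sketch of why immutable IDs matter for verification. The Identity and Policy types are hypothetical, and the ID values are made up; the point is only that a policy pinned to a repository ID rejects a deleted and re-registered repository that reuses the same name:

```go
package main

import "fmt"

// Identity holds the subset of claims used in this sketch.
type Identity struct {
	Repository   string // human-readable "owner/name"; can be re-registered by someone else
	RepositoryID string // immutable ID assigned by the CI provider
}

// Policy pins the expected repository by immutable ID rather than by name.
type Policy struct {
	RepositoryID string
}

// Allows reports whether the identity matches the pinned repository ID.
// The repository name is intentionally ignored.
func (p Policy) Allows(id Identity) bool {
	return id.RepositoryID != "" && id.RepositoryID == p.RepositoryID
}

func main() {
	policy := Policy{RepositoryID: "1001"}

	original := Identity{Repository: "acme/widgets", RepositoryID: "1001"}
	// Same name, but the repository was deleted and recreated by another party.
	resurrected := Identity{Repository: "acme/widgets", RepositoryID: "2002"}

	fmt.Println("original allowed:", policy.Allows(original))
	fmt.Println("resurrected allowed:", policy.Allows(resurrected))
}
```

A name-based policy would accept both identities, which is the resurrection attack described above.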
cc @asraa @laurentsimon @znewman01 @fkorotkov @feelepxyz, what would you like to see in a token and do you have recommendations on claim names?