Egress, authorized by UCAN #36
rfc/egress-with-ucan.md
Outdated
1. Look up the Location Commitment for the given CID. If not found, respond with 404 Not Found.
2. If there is no token given, serve the request through the standard rate-limiter. (This behavior is likely to change in the future.)
3. If there is a token given (eg, `abcde12345`), build its corresponding DID (eg, `did:bearer:abcde12345`).
4. Look up all delegations with the token DID as audience.
5. Attempt to prove the ability to invoke `/space/content/retrieve` on the Space listed in Location Commitment, with the given CID. If more than one Location Commitment is found, attempt each in turn: a CID may be stored in multiple Spaces, and the token may be able to retrieve the content through one and not another.
6. If no such proof chain is available, respond with 401 Unauthorized. [or 404 Not Found?]
7. If a proof chain is found, execute the invocation on the Executor.
8. Using the information in the receipt, fetch the content and proxy it as the response.
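The steps quoted above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the Gateway's implementation: `LocationCommitment`, `Delegation`, `Outcome`, `bearerDID`, and `authorize` are hypothetical names, the stores are plain in-memory arrays, and "proving" the capability is reduced to finding one matching delegation (a real proof chain would also need signature, expiry, and chain validation).

```typescript
// Hypothetical shapes for illustration only; not taken from the RFC or any library.
type LocationCommitment = { cid: string; space: string };
type Delegation = { audience: string; subject: string; can: string };

type Outcome =
  | { kind: "not-found" } // step 1: 404 Not Found
  | { kind: "rate-limited" } // step 2: no token, standard rate limiter
  | { kind: "unauthorized" } // step 6: 401 Unauthorized
  | { kind: "authorized"; space: string }; // step 7: safe to invoke

// Step 3: the token "abcde12345" corresponds to "did:bearer:abcde12345".
function bearerDID(token: string): string {
  return `did:bearer:${token}`;
}

function authorize(
  cid: string,
  token: string | undefined,
  commitments: LocationCommitment[],
  delegations: Delegation[],
): Outcome {
  // Step 1: find every Location Commitment for this CID.
  const found = commitments.filter((c) => c.cid === cid);
  if (found.length === 0) return { kind: "not-found" };

  // Step 2: without a token, fall through to the rate limiter (not modelled here).
  if (token === undefined) return { kind: "rate-limited" };

  // Steps 3-4: delegations whose audience is the token's DID.
  const audience = bearerDID(token);
  const forToken = delegations.filter((d) => d.audience === audience);

  // Step 5: try each Space in turn; the token may work through one and not another.
  for (const commitment of found) {
    const ok = forToken.some(
      (d) => d.subject === commitment.space && d.can === "/space/content/retrieve",
    );
    if (ok) return { kind: "authorized", space: commitment.space };
  }

  // Step 6: no proof chain was found for any Space.
  return { kind: "unauthorized" };
}
```

Returning the Space alongside the authorization is a deliberate choice in this sketch: the Space is what billing is tied to, so the caller has it in hand before any content is served.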
question: How will the customer ID be identified to log the egress event in the Accounting Service? At which step in this process will the customer ID be retrieved and associated with the request for accurate billing?
The billing is tied to the Space, so once we have the Location Commitment (which specifies the Space), we can bill the right customer. That probably deserves a callout in here.
Got it. Thanks. Yes, it would be great to have that stated in the RFC as well.
I think we're missing a step here! We need to look up the "provider" registered with the space, which is a lookup in a Dynamo table that Freeway does not have access to - I think this needs to be handled either in w3up or w3infra
Thanks for flagging that, @travis. I will update `w3infra` to execute that query and find the provider.
I'm a HUGE fan of this design, really nice, very excited to get this into prod!
rfc/egress-with-ucan.md
Outdated
} | ||
``` | ||

The delegation must be available to the Executor at invocation-time. Since the Invoker will be using a token and not speaking UCAN, they will not be able to deliver the proof, so the Executor must have access to it in a store. The Client should therefore invoke `access/delegate` (UCAN 1.0 equivalent TBD) to store the delegation with Storacha.
praise: love that we can reuse `access/delegate` here! how we actually stored these delegations was a question that was making me un-comfy and this is a very elegant solution.
rfc/egress-with-ucan.md
Outdated
The Space may then delegate this to another Principal to give them authority to access the Space's content. Typically, this will not be done directly (though it may), but indirectly through an Account and an Agent: the Space will delegate all of its capability to an Account, which will delegate all of *its* capability to an Agent when it logs in. Then the Agent (ie, the logged-in customer) can share access to the content as they see fit.

## A new DID method: `did:bearer`
thought: this is so interesting - on one hand, the DID spec says that "the design [of DIDs] enables the controller of a DID to prove control over it" and it feels a little weird to make a DID where this proof is not possible, but that's sort of the idea with bearer token auth in CDNs! nobody can really "prove" control over a bearer token, and that's ok because it's a fairly lightweight form of "security" that can be easily "broken" but in practice is not because the payoff isn't very big. I definitely balked a little at this but upon further reflection I kind of think it's genius?
rfc/egress-with-ucan.md
Outdated
The Gateway will offer an HTTP endpoint. Currently, the Storacha Gateway's endpoint takes the form of `https://<cid>.ipfs.w3s.link/`. The Gateway will accept a token as part of the URL. To serve the request, the Gateway will perform the following steps:

1. Look up the Location Commitment for the given CID. If not found, respond with 404 Not Found.
question: can we also look up delegations to the `did:bearer` in this step? I'm a little concerned that adding another network request will slow down reads too much, but I'm not totally clear on whether location commitments and delegations to bearer tokens are even queryable in a single network request, so maybe this is moot
rfc/egress-with-ucan.md
Outdated
## [To Come]

* Rather than serve non-token content rate-limited by default, require a delegation of `/space/content/retrieve` to some DID representing "anyone".
question: is `did:bearer:` (ie, empty-string bearer token) a valid DID? maybe that or `did:bearer:*` would make sense, but implying "glob" semantics is maybe a slippery slope? then again I guess we are free to interpret `did:bearer:*` however we want so maybe this is not a big deal...
rfc/egress-with-ucan.md
Outdated
## [To Come]

* Rather than serve non-token content rate-limited by default, require a delegation of `/space/content/retrieve` to some DID representing "anyone".
* Bitswap should execute a `/space/content/retrieve` as well to respond to requests. Bitswap should be authorized by delegating to some DID representing Bitswap/Hoverboard.
praise: love this - delegating to a hoverboard DID feels allllmost analogous to "pinning" in IPFS if you squint at it - ie, it's our system saying that we'll make a piece of content available via bitswap to the IPFS network - feels like the right vibe of "we're a storage system compatible with IPFS"
rfc/egress-with-ucan.md
Outdated
## Open questions

* What does the `/space/content/retrieve` receipt look like?
* Can you enumerate the contents of a space? Is that in scope here?
thought: we do have `upload/list` and `space/blob/list` for this - I don't think we need to worry about it here as the gateway doesn't expose any way to do this
rfc/egress-with-ucan.md
Outdated
* What does the `/space/content/retrieve` receipt look like?
* Can you enumerate the contents of a space? Is that in scope here?
* What does a Gateway URL with a token look like?
suggestion: I'd probably go with something like `https://<cid>.ipfs.w3s.link?auth=<token>` - note that the current implementation actually looks in the `Authorization` header and will need to be extended to look in the query as well - imho both are important for different use-cases but the query is probably most important for our current efforts
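A rough sketch of what accepting the token from either place could look like. The names `extractToken` and `HeaderLookup` are hypothetical, and two details are assumptions rather than anything stated in this thread: that the header uses the `Bearer` scheme, and that the query parameter wins when both are present.

```typescript
// Anything with a Headers-like `get` works here (for example the Fetch API's Headers).
interface HeaderLookup {
  get(name: string): string | null | undefined;
}

function extractToken(requestUrl: string, headers: HeaderLookup): string | undefined {
  // Prefer the `auth` query parameter, as in https://<cid>.ipfs.w3s.link?auth=<token>.
  const fromQuery = new URL(requestUrl).searchParams.get("auth");
  if (fromQuery) return fromQuery;

  // Otherwise fall back to the Authorization header (assumed format: "Bearer <token>").
  const header = headers.get("Authorization");
  const match = header ? /^Bearer\s+(\S+)$/i.exec(header.trim()) : null;
  return match ? match[1] : undefined;
}
```

A request with neither source yields `undefined`, which maps naturally onto the "no token given" branch of the RFC's steps.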
rfc/egress-with-ucan.md
Outdated
* What does the `/space/content/retrieve` receipt look like?
* Can you enumerate the contents of a space? Is that in scope here?
* What does a Gateway URL with a token look like?
* What can be cached? This seems relatively cacheable, but we should be explicit in the design to make sure we're on a suitable path. This process needs to be *fast*, at least once the cache is warm.
thought: once we've validated a token I think we can store something in KV or a similarly performant caching layer, probably with an expiry, that says it's valid. we might need a mechanism to invalidate this cache, but that can probably wait until after the initial implementation?
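As a minimal sketch of that idea, here is an in-memory cache with an expiry and an explicit invalidation hook. Everything in it is an assumption for illustration: the class name `ValidatedTokenCache`, keying entries by Space plus token, and using a `Map` in place of KV or whatever caching layer is actually chosen.

```typescript
class ValidatedTokenCache {
  // Maps a (space, token) key to the time, in ms, at which the entry expires.
  private readonly expiries = new Map<string, number>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // `now` is injectable so that expiry can be exercised without waiting.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  private key(token: string, space: string): string {
    return `${space}\n${token}`;
  }

  // Record that the token was validated for this Space.
  markValid(token: string, space: string): void {
    this.expiries.set(this.key(token, space), this.now() + this.ttlMs);
  }

  // True only while an unexpired entry exists; expired entries are dropped.
  isValid(token: string, space: string): boolean {
    const k = this.key(token, space);
    const expiresAt = this.expiries.get(k);
    if (expiresAt === undefined) return false;
    if (this.now() >= expiresAt) {
      this.expiries.delete(k);
      return false;
    }
    return true;
  }

  // Explicit invalidation, e.g. if a delegation is revoked.
  invalidate(token: string, space: string): void {
    this.expiries.delete(this.key(token, space));
  }
}
```

With a short TTL, invalidation can arguably be deferred as suggested: a revoked token keeps working only until its entry expires.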
rfc/egress-with-ucan.md
Outdated
> * **Subject:** The Space from which content will be retrieved.
> * **Arguments *(0.9: `nb`)*:**
>   * `cid`: The CID of the content which will be retrieved.
> * **Receipt:** [TBD, but must provide instructions to access the data (using HTTP?) without further UCAN authorization.]
What if the user set the bearer token as caveats to this delegation?
rfc/egress-with-ucan.md
Outdated
"with": "did:key:zSpace",
"can": "space/content/retrieve",
"nb": {
  "cid": "bafy...7pcu"
Perhaps name it "root" to differentiate this as a DAG root CID, also to be consistent with our existing `upload/add` invocation.
Also, this will actually be a link, not a string, so, assuming `dag-json` encoding, we should specify it as:

    - "cid": "bafy...7pcu"
    + "root": { "/": "bafy...7pcu" }
I don't think we do want that to be a link, actually. We want to request the content using the CID as an argument; this would provide the content itself (by reference) as the argument. Importantly, I believe, policies would resolve against the resolved `bafy...7pcu`, not the string `"bafy...7pcu"`. We want to match on the string.
There's no obligation to include the content in the invocation - we don't for upload/add for example.
I'm almost certain you can match a link in a policy...
IDK maybe I'm not understanding right?
The spec doesn't appear to say explicitly, but it does say:

> Selecting on Bytes
>
> Bytes MAY be selected into. When doing so, they MUST be treated as a byte array (`[u8]`), and MUST NOT be treated as a Base64 string or any other representation.
>
>     // DAG-JSON
>     { "/": { "bytes": "1qnBjPjE" } }
>
>     // Hexadecimal
>     0xd6 0xa9 0xc1 0x8c 0xf8 0xc4
>
>     // Selector
>     ".[3]"
>     // ⬆️ 0x8c = 140

If the policy resolver understands DAG-JSON bytes, I assume it understands links as well. I'm not sure what the correct behavior would be when matching on a link, but I don't think it would be to treat it as a string, or as a literal JSON map `{ "/": "bafy...foo" }`. My assumption would be that, if anything, it would attempt to inline the link before applying the policy, and perhaps bail if it didn't have access to the content.
rfc/egress-with-ucan.md
Outdated
The Space may then delegate this to another Principal to give them authority to access the Space's content. Typically, this will not be done directly (though it may), but indirectly through an Account and an Agent: the Space will delegate all of its capability to an Account, which will delegate all of *its* capability to an Agent when it logs in. Then the Agent (ie, the logged-in customer) can share access to the content as they see fit.

## A new DID method: `did:bearer`
I really like this idea, but that said, I'm not super keen on inventing another new DID method...
To play devil's advocate, why is this better than delegating, say, `space/content/serve` to the gateway DID with the bearer token that allows retrieval in the caveats?
One potential thing I can think of is when it comes to proving you served the data - without a specific delegation to the entity that claims to have served some data, there's scope for fraud if the entity can somehow get hold of the delegation to `did:bearer`, because it could then claim it served a billion petabytes of data (for example). It gives these tokens real value, and means that anyone holding them is perhaps more of an attack target.
Another potential issue I think is by delegating to `did:bearer` you're effectively allowing access on all/any gateways. I wonder if there may be a need to restrict to specific gateways in the future?
Delegating to the gateway makes it easier to expand this to bitswap for example - you could delegate to the DID of our bitswap peer. You can also easily delegate to just one, or both, or neither.
Delegating to `did:bearer` does not allow re-delegation. I'm not sure if that's necessary/desired, but I imagine you might want to re-delegate `space/content/serve`.
rfc/egress-with-ucan.md
Outdated
} | ||
``` | ||

The delegation must be available to the Executor at invocation-time. Since the Invoker will be using a token and not speaking UCAN, they will not be able to deliver the proof, so the Executor must have access to it in a store. The Client should therefore invoke `access/delegate` (UCAN 1.0 equivalent TBD) to store the delegation with Storacha.
Something makes me uneasy that the gateway is using UCAN delegations for which it is not the audience.
To make content available as on a traditional CDN, an Agent acts on authority of a Space to make that Space's contents, or some specific CID in it, available using a bearer token, an opaque, unguessable string. The Gateway then responds to requests which contain a token by validating the proof chain, finding the content, charging for egress, and proxying it to the requester.

This process does *not* include making and tracking Location Commitments. A Location Commitment is an attestation by a Storage Node that it holds a particular piece of content on behalf of a particular Space, and that it can provide it. In this proposal, we assume such a system already exists.
One interesting thing to consider is that if the location commitment were to be cited in a `space/content/retrieve` delegation, in the happy path (where the content has not been moved), the gateway would already know where to fetch the content from, simply by reading the delegation that authorizes retrieval.
Probably not worthwhile but just thought I'd write it down :)
rfc/egress-with-ucan.md
Outdated
## [To Come]

* Rather than serve non-token content rate-limited by default, require a delegation of `/space/content/retrieve` to some DID representing "anyone".
Right, I guess `did:*`/`null` audience?
Alternatively, with a `space/content/serve` delegation you might delegate to the gateway and specify an empty bearer token.
|
More on the last point: per @hannahhoward & @prodalex, we'll only support permissions by entire Space for the first go. That should mean we can skip over a lot of these questions for now. We're actually going to be doing this in UCAN 0.9, not 1.0, so we don't have proper policies, just the |
I'm going to leave a general comment before digging into specifics, that vaguely has to do with @alanshaw's question about the DID.
So, I want to state my own thinking that there are two distinct and separate protocols for a retrieval:
- The raw read from a source location (simple HTTP GET with range request)
- The gateway query across potentially one or more providers to assemble a user level response, likely deserialized to a flat file. The second protocol uses the first protocol + the indexing service to accomplish its work.
Furthermore, the first protocol is close to a typical object storage interface, while the second, especially when run on a platform like Cloudflare, is much closer to a CDN (or a CDN + edge compute). For the remainder of this I'll use "Object storage layer" to refer to the first and "CDN layer" to refer to second.
There are some other important distinctions -- the object storage layer protocol in concert with the indexing service can be run without a trust relationship between the retriever and the storage node, while the CDN layer requires a trust relationship between the end user and the gateway, unless you're specifically staying in the bounds of the trustless portion of the Gateway protocol spec. (once you serialize to a flat file, you're now trusting the gateway)
Up till now, the object storage protocol hasn't even really existed as a separate protocol cause we're just reading our own databases and our own storage devices, but going forward it gets increasingly public. With the indexing service, in combination with the PDP storage nodes, it becomes a distinct layer (the November version will not be fully there but it will get there soon).
I apologize for not making this super clear as part of my own thinking. I see these also as different economic units of billing for egress eventually -- the user pays `x` for a gateway retrieval, of which `y` goes to the gateway, and `z` goes to the storage node, where `y + z = x`. None of this needs to be figured out right this second cause we're running everything and our PDP nodes will probably just have an indirect accounting mechanism, but in a final product, the storage node would never get 100% of the user egress fee because we incur a sizable cost for running the Cloudflare gateway that assembles the request, caches it, and services it super quick to the user.
When we talk about storage providers serving directly to end users, I generally don't see them running a gateway. Rather, I see a user with a native or server app who doesn't prioritize TTFB choosing to save money by running the freeway software directly, and only using the indexing service + object storage layer to get data.
Another use case could be someone building a custom retrieval gateway on top of the indexing service and object storage layer. Perhaps a product builder wants to build a query interface for archived blockchain data, and sell that independently. They would be storing on Storacha, but charging their users to use their retrieval service that could build more complex IPLD queries against a blockchain than is available with the gateway protocol (or perhaps just mirror the established RPC API for querying their chosen blockchain).
Anyway, coming back to this PR, I want to understand if `space/content/retrieve` refers to the CDN layer of retrieval, or the object storage layer. A couple things that make this a bit confusing:
- There is no invoker for the CDN layer of the retrieval (i.e. it's not a proper UCAN request), at which the gateway is the executor.
- There's no proper executor for the storage layer of the retrieval for now, until the storage nodes do UCAN auth.
The RFC simply says the gateway is the invoker and the executor and doesn't make clear the part of the retrieval we're referring to.
I think ultimately `space/content/retrieve` makes sense as the CDN layer, and it's a weird one cause the invoker and executor are ALWAYS the gateway. That means the space delegates to the gateway in order for it to execute, and the token goes in the policy (caveats for now).
So long and short I agree with @alanshaw -- no `did:bearer`, because it's not a real entity. If we build a gateway that is called with real invocations, then it makes sense that the issuer becomes a real, verifiable entity with traceable delegation to the space, and the gateway is just an executor. There are certainly use cases for that, but they're not worth worrying about for now.
What about the storage layer retrieval in the future? So my suggestion is to actually make `space/index/query` and `space/content/retrieve/blob` for that, eventually. So when the CDN receives a request, it invokes `space/content/retrieve` with itself as the issuer + executor, using delegations it has stored. And it generates a receipt that looks something like:

    {
      cmd: "/ucan/assert",
      args: {
        about: "bafy...cid", // space/content/retrieve invocation
        facts: {
          out: {
            ok: {
              statusCode: 200,
              bytesServed: 1000000
            }
          },
          run: [
            {
              cmd: "space/index/query"
            },
            {
              cmd: "space/content/retrieve/blob"
            },
            {
              cmd: "space/content/retrieve/blob"
            }
          ]
        }
      }
    }

This wouldn't be sent back to the retriever but this provides nice instructions on how to bill the original user, that can be checked against the storage node and the indexer submitting receipts of their own for their parts of retrieval (along with the invocation that proves a retrieval/index was actually requested by the gateway). It all vaguely works for a trustless billing system :) (where even the indexing service could get paid)
That's my take.
But yeah, after further consideration, I'm blocking on this: let's not do `did:bearer`. Is there any reason we can't just throw it in caveats? We don't have policies, but we have ucanto, which can enforce this, no?
The previous review ultimately requests a change to remove `did:bearer` and put the token in caveats.
ok cool - after reading through all this I'm switching to "request changes" - sounds like we're all on the same page anyway!
Just for clarity, in my comments above I suggested delegating I propose Example of capability derivation in ucanto. 😱 I don't recall this being in the UCAN spec so perhaps explicit delegation is what we will do... Separating the capabilities like this allows us to refer separately to the 2 protocols @hannahhoward called out above. @Peeja on origin/path etc. I'd omit these unless we're ready to implement them. |
agree re: origin! just want to make sure there's space in the interim protocol if we do decide to go that route, but I'm honestly hoping we can upgrade to UCAN 1.0 before we need that, where it will be a fairly easy and natural extension thanks to |
I hadn't considered that the Gateway would be composing multiple pieces of content across potentially multiple providers to assemble a single response. Given that, I agree wholeheartedly with @alanshaw: that command should be Question: What level does Bitswap operate on? Does it serve requests, or fetch blobs? Question: What are the proper names for these things? Is a "Blob" specifically the same as a "Shard"? (This is non-obvious to someone with less context, as typically a "blob" is simply any set of bytes, and in Git in particular it generally means roughly "a file's contents, separate from any filename that might point to it".) What is the name for the thing that the CDN Layer (Gateway) serves, and what is the name for the thing that the Object Layer serves? |
|
Yeah that's exactly right. |
Version 2 is ready for review. Open questions at the end of the doc, and also copied here: Open questions
|
LGTM after method rename.
Removing "request changes" to expedite the process when done.
2. If none are found, respond with `404 Not Found`.
3. Note if any Location Commitments were found with no Spaces. (If so, these are from before these changes, and mean we should fall back to the previous behavior later.)
4. Get the set of unique Spaces from those Location Commitments. (There will usually be one Location Commitment, with one Space.)
5. Repeat with each Space in any order, stopping if a successful response is produced:
Is this going to potentially be slow? Is it a separate network request for each space we need to check? The performance is the only part of this spec I have concerns about, but I might be misunderstanding - has been a week or so since my head's been here so apologies if I'm missing something!
3. Note if any Location Commitments were found with no Spaces. (If so, these are from before these changes, and mean we should fall back to the previous behavior later.)
4. Get the set of unique Spaces from those Location Commitments. (There will usually be one Location Commitment, with one Space.)
5. Repeat with each Space in any order, stopping if a successful response is produced:
   1. Look up delegations in the store where the audience is `did:web:w3s.link` and the subject is the Space.
I guess relatedly - is "the store" here the indexing service or something else?
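For reference while discussing the lookup, steps 4 and 5.1 of the quoted hunk could be sketched like this. It is only an illustration under assumptions: `SpaceCommitment`, `StoredDelegation`, `uniqueSpaces`, and `findAuthorizingSpace` are hypothetical names, the token is assumed to live in the delegation's `nb` caveats as suggested earlier in this thread, the ability name is passed in because the final name is not settled here, and the "store" is an in-memory array. It does not say what the real store is, or whether each Space costs a separate network request.

```typescript
const GATEWAY_DID = "did:web:w3s.link";

// Hypothetical shapes. A commitment made before these changes has no Space.
type SpaceCommitment = { cid: string; space?: string };
type StoredDelegation = {
  audience: string;
  subject: string;
  can: string;
  nb: { token?: string };
};

// Step 4: the set of unique Spaces named by the Location Commitments.
function uniqueSpaces(commitments: SpaceCommitment[]): string[] {
  const spaces = new Set<string>();
  for (const c of commitments) {
    if (c.space !== undefined) spaces.add(c.space);
  }
  return Array.from(spaces);
}

// Step 5: try each Space in turn, stopping at the first one that authorizes the token.
function findAuthorizingSpace(
  commitments: SpaceCommitment[],
  store: StoredDelegation[],
  ability: string,
  token: string,
): string | undefined {
  for (const space of uniqueSpaces(commitments)) {
    // Step 5.1: delegations whose audience is the Gateway and whose subject is the Space.
    const candidates = store.filter(
      (d) => d.audience === GATEWAY_DID && d.subject === space,
    );
    if (candidates.some((d) => d.can === ability && d.nb.token === token)) {
      return space;
    }
  }
  return undefined;
}
```

If the real store can answer "audience is the Gateway and subject is any of these Spaces" in one query, the loop collapses to a single lookup, which bears on the performance concern raised above.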
📚 Preview
A proposal for managing billable egress using UCAN, distilled from a conversation between @hannahhoward, @travis, and myself (@Peeja) on 10/04/24. Builds on and evolves prior egress proposals.
Open questions and missing bits are currently in the document, at the bottom.