feat: egress billing RFC #28

travis · 2024-05-28T23:59:25Z

First writeup of the proposal for egress billing. Looking for feedback on the ideas and on the coherence of the writeup.

alanshaw · 2024-05-30T09:12:02Z

rfc/egress-billing.md

+}
+
+async function handle(request){
+  const rateLimited = checkRedisToDetermineIfRateLimited(request)


I would just not reference the specific tool since it's an implementation detail.

Suggested change

-  const rateLimited = checkRedisToDetermineIfRateLimited(request)
+  const rateLimited = isRateLimited(request)

ah I hear you but I am trying to convey that only "Redis" should be consulted here - I've come to think of "Redis" as an API, not a product (because that seems to be the reality of it in 2024) - I do mention earlier that I'm using Redis as an example and I think this is consistent with that, so I'd like to keep it

alanshaw · 2024-05-30T09:12:57Z

rfc/egress-billing.md

+    case No:
+      return { status: 200, body: readContentFromW3up(cid) }
+    case Maybe:
+      return { status: 200, body: readContentFromW3up(cid) }


To me this is clearer since it's saying to do the same thing in either case:

Suggested change

-    case No:
-      return { status: 200, body: readContentFromW3up(cid) }
-    case Maybe:
-      return { status: 200, body: readContentFromW3up(cid) }
+    case No:
+    case Maybe:
+      return { status: 200, body: readContentFromW3up(cid) }

ah I was worried that people who aren't JS-brained would misread that as "if No, do nothing" - I could definitely be convinced but imho the way I've done it makes it a little clearer that something should be done in each case, and since they are right next to each other it's pretty easy to verify that they're the same thing...
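
For anyone weighing the two spellings in this thread, here is a minimal runnable sketch that puts them side by side. The `Yes`/`No`/`Maybe` constants, the `429` response for `Yes`, and the `readContent` stand-in are assumptions made for illustration and are not taken from the RFC text; the only point is that both forms return the same response for every answer.

```javascript
// Hypothetical answers from the rate-limit check (names assumed for illustration).
const Yes = 'yes'
const No = 'no'
const Maybe = 'maybe'

// Stand-in for readContentFromW3up; a real gateway would stream the blob.
const readContent = (cid) => `content of ${cid}`

// Explicit form: every case spells out its own response.
function respondExplicit (rateLimited, cid) {
  switch (rateLimited) {
    case Yes:
      return { status: 429 }
    case No:
      return { status: 200, body: readContent(cid) }
    case Maybe:
      return { status: 200, body: readContent(cid) }
    default:
      throw new Error(`unknown rate limit answer: ${rateLimited}`)
  }
}

// Fall-through form: No and Maybe share a single return statement.
function respondFallThrough (rateLimited, cid) {
  switch (rateLimited) {
    case Yes:
      return { status: 429 }
    case No:
    case Maybe:
      return { status: 200, body: readContent(cid) }
    default:
      throw new Error(`unknown rate limit answer: ${rateLimited}`)
  }
}
```

The behavior is identical, so the choice comes down to which form the RFC's audience is least likely to misread.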

alanshaw · 2024-05-30T09:14:58Z

rfc/egress-billing.md

+}
+```
+
+[TODO: should I translate this into UCAN.current too?]


I would, as a <details/>

alanshaw · 2024-05-30T09:16:16Z

rfc/egress-billing.md

+    // multihash must match an uploaded blob
+    ["==", ".content", { "/": { "bytes": "mEi...sfKg" } }],
+    // must be available from this url
+    ["==", ".url", "https://asia.w3s.link/ipfs/bafk...7fi"],


I guess this will typically be a URL and a byte range? Like existing location claim: https://github.com/w3s-project/content-claims?tab=readme-ov-file#location-claim

that makes sense - I was cargo culting off of @Gozala so redirect this one over to him for confirmation...

It could be a range yeah, but I think we made it optional if it's not a segment but the whole thing
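
To make the "URL plus optional byte range" idea concrete, here is a small sketch of how a gateway might turn such a location into an HTTP `Range` header. The field names (`url`, `range.offset`, `range.length`) are assumptions loosely modeled on the location claim linked above, not a confirmed schema; an absent `range` is taken to mean "the whole thing".

```javascript
// Build the HTTP Range header value for a location, or undefined when the
// location covers the whole resource (no range given).
function rangeHeader (location) {
  if (!location.range) return undefined
  const { offset, length } = location.range
  // Without a length, read from the offset to the end of the resource.
  if (length === undefined) return `bytes=${offset}-`
  // HTTP byte ranges are inclusive on both ends.
  return `bytes=${offset}-${offset + length - 1}`
}

// A segment inside a larger resource (shape assumed for illustration).
const segment = {
  url: 'https://asia.w3s.link/ipfs/bafk...7fi',
  range: { offset: 100, length: 50 }
}

// The whole resource: no range needed.
const whole = { url: 'https://asia.w3s.link/ipfs/bafk...7fi' }
```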

alanshaw · 2024-05-30T09:18:18Z

rfc/egress-billing.md

+  "sub": "did:web:asia.web3.storage",
+  "pol": [
+    // Request origin header must be example.com
+    ["==", ".headers['origin']", "example.com"],


What is the significance of ['origin'] over .origin here? Below .token is used.

hmm, nothing that I know of - again just copying @Gozala, maybe he was trying to illustrate that either style can be used? I'm happy to use .origin!
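
As a quick check that the two selector styles are interchangeable for a key like `origin`, here is a toy evaluator for `["==", selector, value]` statements. It is a simplified sketch, not the UCAN policy language implementation: it only understands dot segments and single-quoted bracket segments, and the `request` shape is assumed for illustration.

```javascript
// Resolve a simplified selector such as ".token" or ".headers['origin']".
// Only dot segments and single-quoted bracket segments are handled.
function select (selector, subject) {
  const segments = [...selector.matchAll(/\.([A-Za-z_$][\w$]*)|\['([^']*)'\]/g)]
    .map((m) => (m[1] !== undefined ? m[1] : m[2]))
  let value = subject
  for (const key of segments) {
    if (value === null || typeof value !== 'object') return undefined
    value = value[key]
  }
  return value
}

// Evaluate a list of ["==", selector, expected] statements; all must hold.
function satisfies (policy, subject) {
  return policy.every(([op, selector, expected]) => {
    if (op !== '==') throw new Error(`unsupported operator: ${op}`)
    return select(selector, subject) === expected
  })
}

// Hypothetical request shape used only for this example.
const request = { headers: { origin: 'example.com' }, token: 'secret' }

const bracketStyle = [['==', ".headers['origin']", 'example.com']]
const dotStyle = [['==', '.headers.origin', 'example.com']]
```

In this sketch both policies match the same request. The bracket form would still be needed for keys that are not valid identifiers, such as `content-type`.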

alanshaw · 2024-05-30T09:31:02Z

rfc/egress-billing.md

+}
+```
+
+For any given CID, the gateway will be able to look up the CID-specific content commitment as well as the


For any given CID

I think we're talking DAG root CID not any...right?

oh that's a good point, I actually don't know how this will handle CIDs that aren't DAG roots - do those even get content commitments back from the blob upload service? @Gozala any thoughts on this? I'm not sure we accounted for these CIDs in our chats...

^^^ important point. I think there's an intersection where we need to understand how this relates to the indexing RFCs. Fine to just handle DAG root to start but having w3s.link ONLY work for DAG root is very frustrating and I see the indexing work heading in a direction where we're not limited in that way.

alanshaw · 2024-05-30T09:31:58Z

rfc/egress-billing.md

+
+Note that `bafy..access` alone will not be enough to authorize retrieval
+of the content - the accounting system will need to combine it with
+a CID-specific content commitment that has been delegated to `Group`.


I'm not following how this works, I think you need to show the "CID-specific content commitment that has been delegated to Group"

oh that's the first one, bafy..group - I've updated the text to clarify that!

alanshaw · 2024-05-30T09:40:06Z

rfc/egress-billing.md

+first implement a global per-CID rate-limit that is expected to account for N%
+of _R_. This rate limit will be used for all content that has not been configured with another
+rate limit, and for content that has been configured with other rate limits that have already
+been reached.


It would be nice to model this as "just another user that set a rate limit", where "another user" is us. Eventually, some entity will have to pay the egress from the nodes storing the content, so we do need to be able to allocate responsibility for that egress to some entity.

i.e. there's no magic "global" limit, it's just some user (us) that has set a rate limit that happens to apply to ALL content served by the gateway.

yea completely agree!
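
A rough sketch of what "no magic global limit" could look like in code: every limit has a payer, and the operator's limit is just one more entry that happens to match all content. The record shape, the `'*'` wildcard, and the `did:example:` identifiers are hypothetical and chosen only for illustration.

```javascript
// Each rate limit belongs to a payer. `cid: '*'` marks a limit that applies to
// all content; here that is the gateway operator's own limit.
const limits = [
  { payer: 'did:example:customer', cid: 'example-cid-a', max: 2, used: 0 },
  { payer: 'did:example:operator', cid: '*', max: 3, used: 0 }
]

// Pick the first limit with remaining budget, preferring CID-specific limits
// over the wildcard. Returns undefined when every applicable limit is spent.
function chargeableLimit (limits, cid) {
  const applicable = limits.filter((l) => l.cid === cid || l.cid === '*')
  const specificFirst = [...applicable].sort(
    (a, b) => Number(a.cid === '*') - Number(b.cid === '*')
  )
  return specificFirst.find((l) => l.used < l.max)
}

// Serve one request: charge whoever's limit covers it, or refuse.
function account (limits, cid) {
  const limit = chargeableLimit(limits, cid)
  if (!limit) return { served: false }
  limit.used += 1
  return { served: true, payer: limit.payer }
}
```

With this shape, content with no configured limit and content whose own limits are exhausted both fall through to the operator's entry, which matches the behavior the RFC text describes for the global limit while still attributing the egress to a specific payer.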

alanshaw · 2024-05-30T09:42:45Z

rfc/egress-billing.md

+
+We may want to consider using an open source alternative like https://github.com/nalgeon/redka if we
+are concerned about using "source available" products - my inclination is to use Redis but consider products like
+Redka as a fallback in case Redis no longer meets our needs. We may also want to consider using AWS or Cloudflare products with Redis-compatible APIs:


Cloudflare's KV is super fast since it's eventually consistent, which feels like a good fit.

ah cool will add it to the list!

I will say that one of the core requirements here is the ability to implement efficient rate limits - you can see an example of what that could mean here: https://redis.io/glossary/rate-limiting/

I think (but am not sure) that one of the key things is support for the INCR operation, which I don't think KV implements, and expiring entries, which may be better supported?

The good news is that lots of stuff seems to implement the Redis query interface these days, including Upstash (which integrates with Cloudflare Workers) and no less than two different AWS products - I think we'll have options!
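
To show why an atomic increment and expiring entries matter here, this is a sketch of the common fixed-window pattern built on those two operations. `CounterStore` is an in-memory stand-in written for this example, not a Redis client, and the key format and parameters are assumptions.

```javascript
// In-memory stand-in exposing the two operations discussed above: an
// increment and a time-to-live on keys. It is not a Redis client.
class CounterStore {
  constructor () {
    this.entries = new Map() // key -> { value, expiresAt }
  }

  // Increment the counter at `key`, starting from 1 if it is missing or expired.
  incr (key, now) {
    const entry = this.entries.get(key)
    if (!entry || (entry.expiresAt !== null && entry.expiresAt <= now)) {
      this.entries.set(key, { value: 1, expiresAt: null })
      return 1
    }
    entry.value += 1
    return entry.value
  }

  // Expire `key` after `seconds` seconds, counted from `now` (milliseconds).
  expire (key, seconds, now) {
    const entry = this.entries.get(key)
    if (entry) entry.expiresAt = now + seconds * 1000
  }
}

// Fixed-window limit: allow at most `max` requests per `windowSeconds` per CID.
function isRateLimited (store, cid, max, windowSeconds, now = Date.now()) {
  const key = `rate:${cid}`
  const count = store.incr(key, now)
  // Start the window on the first request only, so later requests do not extend it.
  if (count === 1) store.expire(key, windowSeconds, now)
  return count > max
}
```

A store without an increment operation would need a read followed by a write, which can lose counts under concurrent requests; that is the main reason the increment matters for this use.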

alanshaw · 2024-05-30T09:43:45Z

rfc/egress-billing.md

+system. We expect this to result in some amount of "unauthorized" read request service - for example, if two requests for
+the same content arrive at nearly the same time they will always both be served even if the second request
+goes over all established rate limits, since the rate limits will only be updated some time after a request is served. 
+This pattern is similar in some ways to the "stale while revalidate"  pattern that is popular on the web today - stale


"fail open" also comes to mind...

yea love this!
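
Here is a small sketch of the behavior described in the quoted passage, plus one reading of "fail open". Counts are recorded only after requests are served, so two near-simultaneous requests both pass the check, and a failing lookup results in the request being served. The `createGateway` shape, the `lookup` callback, and the `429` status are assumptions for illustration.

```javascript
// `lookup(cid)` returns the recorded request count and may throw if the
// rate-limit store is unavailable. `max` is the allowed count per CID.
function createGateway ({ max, lookup }) {
  const pending = [] // served requests whose counts have not been recorded yet

  return {
    // Decide using only the counts recorded so far, then queue the update.
    handle (cid) {
      let count
      try {
        count = lookup(cid)
      } catch (err) {
        // Fail open: if the store cannot be consulted, serve anyway.
        count = 0
      }
      if (count >= max) return { status: 429 }
      pending.push(cid) // accounting happens later, off the request path
      return { status: 200 }
    },

    // Hand back everything served since the last drain so it can be recorded.
    drain () {
      return pending.splice(0, pending.length)
    }
  }
}

// Example wiring with an in-memory count table.
const counts = new Map()
const gateway = createGateway({ max: 1, lookup: (cid) => counts.get(cid) || 0 })

// Record the queued requests; in a real system this would run asynchronously.
function flush () {
  for (const cid of gateway.drain()) counts.set(cid, (counts.get(cid) || 0) + 1)
}
```

Calling `gateway.handle` twice before `flush` serves both requests even though `max` is 1; after `flush`, the next request is refused.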

vasco-santos · 2024-05-30T13:07:25Z

rfc/egress-billing.md

+We need to start charging for data served by our gateway ("gateway egress"). Users who upload data using `w3up` should
+be charged when we serve that data through the gateway that is currently hosted at https://w3s.link. Importantly, if 


are we strictly tackling the gateway, or do we also plan to cover bitswap? I think some of the ideas here could be applied there as well, albeit with an extra layer of indirection to figure out which blob CID a block CID belongs to

ah I was thinking of the "gateway" as something that is also handling bitswap - imho the gateway and bitswap provider should both work off the same rate limits

probably I would make this extra specific and add it to spec then :) There are things like origin that are actually only specific for HTTP and as far as I know won't be available in bitswap.

oh yea agree, will do!

There are things like origin that are actually only specific for HTTP and as far as I know won't be available in bitswap.

I would expect that the libp2p node will be delegated a capability explicitly, and the constraints there may be different from the ones imposed on the gateway (described here). I'm not sure what those constraints could be, perhaps something like the peer ID that can perform a request.

hannahhoward

I think I'm approving generally, but if we're going to work on Egress billing and indexing at once, we really need to sync on these two designs. I think we still have some confusion and I wonder if when we move beyond prototype we do the indexing first cause it's really essential to all this.

hannahhoward · 2024-06-04T04:27:13Z

rfc/egress-billing.md

+
+When the gateway receives a request, it will query the [content claims service] for claims about the requested
+CID [TODO: and also other information like origin and token?]. If multiple claims establishing different rate limits
+are found, one is chosen randomly and its request count is incremented, resulting in billing to the 


There's an important clarification I hope I'm reading right.

When are multiple claims found?

If you have to match token and origin, why would you have multiple claims?

Or maybe you have the matching unlimited claim and the general open access rate limited claim? In which case I'd lean towards using the unlimited claim rather than picking at random.
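
One way to read this suggestion, sketched in code: when several claims match, prefer one without a limit and only fall back to the random pick from the RFC text when every matching claim is limited. The claim shape (`limit`, with `Infinity` meaning unlimited) and the injectable `random` parameter are assumptions made for illustration.

```javascript
// Choose which matching claim to bill. An unlimited claim wins; otherwise
// pick one at random, as the RFC text currently describes.
function chooseClaim (claims, random = Math.random) {
  if (claims.length === 0) return undefined
  const unlimited = claims.find((claim) => claim.limit === Infinity)
  if (unlimited) return unlimited
  return claims[Math.floor(random() * claims.length)]
}

// Hypothetical claims matching a single request.
const matching = [
  { id: 'open-access', limit: 100 },
  { id: 'token-holder', limit: Infinity }
]
```

Here `chooseClaim(matching)` always returns the `token-holder` claim, so a request that qualifies for unlimited access never counts against the open-access limit.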

Add a new middleware that checks a rate limiting service and returns a 429 if the CID is over a rate limit. This sketches out an API for the rate limiting and accounting services suggested in storacha/RFC#28. This is not ready to merge, but should probably be the starting point for this work once we all agree that this is the right shape.

feat: egress billing RFC

f0c90b5

travis marked this pull request as draft May 28, 2024 23:59

travis self-assigned this May 28, 2024

travis requested review from Gozala, alanshaw, joaosa, hannahhoward, vasco-santos, gammazero and prodalex May 28, 2024 23:59

travis added 9 commits May 28, 2024 17:01

fix: typo

09924d4

feat: trim down and clarify abstract

c9c7763

fix: more clarification and editing

6ff056f

feat: add sequence diagram

84dc4f3

feat: another diagram and some clarity about how to set rate limits

74c44eb

feat: even more diagrams!

2da9e12

feat: MORE DIAGRAMS and more cleanup

2d18655

feat: work planning

90c6024

feat: add more options for Redis-compatible services

f3aaa54

alanshaw requested changes May 30, 2024

vasco-santos reviewed May 30, 2024

travis marked this pull request as ready for review May 30, 2024 16:39

travis mentioned this pull request May 30, 2024

Finalize design for egress metering and billing storacha/project-tracking#56

Closed

feat: clarify group section

6d955e9

hannahhoward approved these changes Jun 4, 2024

travis mentioned this pull request Jun 13, 2024

wip: spike on adding rate limits to freeway storacha/freeway#109

Draft
