Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IPIP-327: Reframe over HTTP version=2 (DAG-CBOR and better cache controls) #327

Closed
wants to merge 7 commits into from

Conversation

lidel
Copy link
Member

@lidel lidel commented Sep 29, 2022

Context

HTTP Caching problems

  • single endpoint makes it hard to scale (have different methods handled by different services)
  • Etag generation requires buffering response, effectively breaking streaming

On missing CBOR

We've b een using Reframe in Kubo for a while and it is clear that Reframe messages are not designed to be created or read by humans.

(My subjective opinion) The plaintext DAG-JSON representation of messages do not really bring much to the table, because both CIDs and Multiaddrs are in a format that needs manual encoding/decoding anyway, but are still useful during testing ad debugging.

JSON is just bad production format – we are missing CBOR.

Why CBOR?

We already support DAG-JSON via a dedicated content type, and our stack uses CBOR in more places than JSON.

Size savings

Some numbers in https://arxiv.org/abs/2201.03051:

2022-10-10_20-31

While companies are free to pay an unnecessary ~20% penalty, we should not create protocols which expect end users to do the same.

Proposed improvements

(1) Add CBOR format

The improvement proposed here is to add support for requests and responses sent as DAG-CBOR,
with own content type: application/vnd.ipfs.rpc+dag-cbor.

This way Reframe over HTTP would support both DAG-JSON and DAG-CBOR, and it is up to the client to decide what wire format makes sense to them.

(2) improve HTTP caching

While at it, we've improved the way HTTP caching works, removed the requirement of Etag which caused streaming issues, and added method names to URL paths, to improve scaling / routing on reverse proxies / CDNs.

For details, see changes made to reframe/REFRAME_HTTP_TRANSPORT.md

cc @guseggert @petar @willscott

@lidel lidel self-assigned this Sep 29, 2022
@lidel lidel requested a review from a team as a code owner September 29, 2022 13:39
@willscott
Copy link
Contributor

  • do we want to try to support the /reframe/{dag-cbor-encoded-query} mode for cache-able queries at the same time?
  • content type signalling for response encoding as proposed here seems very reasonable.

@lidel
Copy link
Member Author

lidel commented Sep 29, 2022

Yes, the idea is that the server will support both JSON and CBOR endpoints.
Modern clients will most likely only send CBOR, leaving JSON for testing and lazy ad-hoc debug.

I went with separate /reframe/{dag-cbor-encoded-query} for CBOR on purpose, because it removes ambiguity and all hidden footguns related to content-type and query parameters 🙀

We could go with the same /reframe?q={dag-cbor-encoded-query}, but you start hitting real world problems:

  • HTTP CDNs and proxies often ignore headers or query parameters in their cache keys
    • means things do not get cached properly on default turn-key settings and people running indexers need to do additional work
  • The server has to read Content-Type from request to know how to parse it
    • this means it won't be possible to paste CBOR link in browser or do quick curl test (no explicit header), and get human-readable JSON in response

Keeping them separate makes these problems go away, and HTTP caching of CBOR "just works" even when headers are ignored.

If a server supports HTTP/1.1, then it MAY send chunked-encoded messages. Clients supporting HTTP/1.1 MUST accept chunked-encoded responses.

Requests and Responses MUST occur over a single HTTP call instead of the server being allowed to dial back the client with a response at a later time. The response status code MUST be 200 if the RPC transaction succeeds, even when there's an error at the application layer, and a non-200 status code if the RPC transaction fails.

If a server chooses to respond to a single request message with a group of messages in the response it should do so as a set of `\n` delimited DAG-JSON messages (i.e. `{Response1}\n{Response2}...`).
If a server chooses to respond to a single request message with a group of DAG-JSON messages in the response it should do so as a set of `\n` delimited DAG-JSON messages (i.e. `{Response1}\n{Response2}...`).
DAG-CBOR responses require no special handling, as they are already self-delimiting due to the nature of the CBOR encoding.
Copy link
Member Author

@lidel lidel Sep 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💭 This should work in theory, but irl we may have constraints beyond regular CBOR.
Before we merge, we need to attempt at implementing this in https://github.com/ipld/edelweiss

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://www.rfc-editor.org/rfc/rfc8742.html states:

CBOR Sequences, unlike JSON Text Sequences [RFC7464], do not use a marker between items. This is possible because CBOR-encoded data items are self delimiting and the end can always be calculated. (Note that, while the early object/array-only form of JSON was self delimiting as well, this stopped being the case when simple values such as single numbers were made valid JSON documents.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, be very careful with this assumption .. we expect the delimiting to be done before the codec—i.e. you give the codec the exact bytes which match the CID and no more; otherwise we have a validation gap.

The only special-casing we've done that's related to this is that Edelweiss has a special flag that it uses for dag-json that lets it be sloppy with ending characters (spaces, EOL): https://github.com/ipld/go-ipld-prime/blob/7548eb883bda4712355797547a0628a0ad1c00cb/codec/dagjson/unmarshal.go#L36-L41 but not for additional data that might delimit the block. dag-cbor is very strict about not wanting extraneous bytes; in Go and JS.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean.. yeah, what other options do we have here?
We can't use \n or any other byte sequence as delimiter because it could be part of CBOR bytes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, well if you want to delimit by whole-value boundaries then we could work that into the dag-cbor decoder but it'll take a little bit of work and have to be a special option for it. If it's needed, then let's get an issue filed in go-ipld-prime and let me know what sort of priority it is and I'll start having a look.

IPIP/0000-http-reframe-cbor.md Outdated Show resolved Hide resolved
IPIP/0000-http-reframe-cbor.md Outdated Show resolved Hide resolved
IPIP/0000-http-reframe-cbor.md Outdated Show resolved Hide resolved
Alternative is to do nothing, and end up with:

- inconsitent user experience
- wasted bandwidth and cache storage
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall I think dag-cbor support would be net positive, but I think the perf argument for this is rather weak, there are lower hanging fruit like enabling response compression and we have quite voluminous headers in practice. I'm easy to convince here, it would just take some measurements with real delegated routing payloads using dag-json and dag-cbor encodings. I get the theoretical arguments and the evidence from that paper, but for perf it's still weak without direct measurements of the real workload.

Copy link
Member Author

@lidel lidel Oct 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made quick test with a single FindProvidersResponse and we get better gains than the paper suggests:

{"FindProvidersResponse":{"Providers":[{"Node":{"peer":{"ID":{"/":{"bytes":"EiAngCqwSSL46hQ5+DWaJsZ1SPV2RwrqwID/OEuj5Rdgqw"}},"Multiaddresses":[{"/":{"bytes":"NhFlbGFzdGljLmRhZy5ob3VzZQYBu94D"}}]}},"Proto":[{"2304":{}}]}]}}
  • baguqeerafsujii5p52wlc3fxrxhmbx6nu7uudyeznqjnvr7ogkjkndkuyukq
    DAG-JSON block is 223 bytes

  • bafyreignkd26ejp6jteqf4cggzbhp3twuvidqsa2lshknk6kjh7j2pc2h4
    DAG-CBOR block is 143 bytes

The diff is ~35%.

I think it makes sense, since DAG-JSON eats additional space by base64-encoding binary fields (all CIDs and Multiaddrs), which creates consistent ~33% overhead.

This comment was marked as duplicate.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lidel : these numbers are uncompressed right? Aren't the numbers we need to compare when compression is employed since that is how the bytes will be sent on the wire (and is it how CDNs cache as well - I don't know).

Copy link
Member Author

@lidel lidel Oct 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BigLep depends on cache – afaik Nginx does not cache gzipped payload, but supports hosting of pre-compressed files if it finds static pre-compressed .gz variant, which does not help our use case.
Also, we do not have compression in libp2p.

Anyway, assuming we care only about HTTP, and go with default gzip for the samples above (iiuc cbor grows due to gzip envelope):

  • JSON: 197 bytes
    (ipfs block get baguqeerafsujii5p52wlc3fxrxhmbx6nu7uudyeznqjnvr7ogkjkndkuyukq | gzip -c | wc -c )
  • CBOR: 157 bytes
    (ipfs block get bafyreignkd26ejp6jteqf4cggzbhp3twuvidqsa2lshknk6kjh7j2pc2h4 | gzip -c | wc -c )

That seems to still give.. ~20% savings with CBOR.

IPIP/0000-http-reframe-cbor.md Outdated Show resolved Hide resolved
Co-authored-by: Gus Eggert <gus@gus.dev>
- `GET /reframe/{mbase64url-dag-cbor}`
- Cachable HTTP `GET` requests with message passed as DAG-CBOR in HTTP path segment, encoded as URL-safe [`base64url` multibase](https://docs.ipfs.io/concepts/glossary/#base64url) string
- DAG-CBOR in multibase `base64url` is used (even when request body is DAG-JSON) because JSON may include characters that are not safe to be used in URLs, and percent-encoding or base-encoding a big JSON query may take too much space.
- Suitable for sharing links, sending bigger messages, and when a query result MUST benefit from HTTP caching (see _HTTP Caching Considerations_ below).
- `GET /reframe?q={percent-encoded-dag-json}`
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBD, we've found another discrepancy between spec and implementation – see #333 and ipld/edelweiss#61

makes sense for POST too: alows for scaling, routing different methods
to different backend services
Requiring  Etag meant implementations naively buffered entire streaming
response. Relaxing this.
@lidel lidel changed the title IPIP: Support DAG-CBOR body in Reframe over HTTP IPIP: Reframe over HTTP version=2 (DAG-CBOR and better cache controls) Oct 19, 2022
@lidel
Copy link
Member Author

lidel commented Oct 19, 2022

Food for thought: https://atproto.com/specs/xrpc#requests (XRPC is part of social networking technology created by Bluesky) uses similar GET / POST distinction, and includes method name on path.

making it self-describing, reducing developer confusion we experienced
last week when folks started implementing version=1
Added things beyond DAG-CBOR, IPIP had to be updated
@lidel lidel changed the title IPIP: Reframe over HTTP version=2 (DAG-CBOR and better cache controls) IPIP-327: Reframe over HTTP version=2 (DAG-CBOR and better cache controls) Oct 26, 2022
@lidel
Copy link
Member Author

lidel commented Jan 12, 2023

My understanding is this is not happening, since IPFS Stewards replaced Reframe use with simplier API defined in #337.

Closing and moving to deferred column in https://github.com/orgs/ipfs/projects/19/views/1

@lidel lidel closed this Jan 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Deferred
Archived in project
Development

Successfully merging this pull request may close these issues.

5 participants