From 1b94688e45746ce3e705a8e11d87f7ce902039d5 Mon Sep 17 00:00:00 2001 From: Marcin Rataj Date: Mon, 15 May 2023 14:33:10 +0200 Subject: [PATCH 01/12] IPIP-412: Signaling Block Order in CARs on Gateways First draft based on various prior art and recent discussions cited in the header front matter. --- src/http-gateways/trustless-gateway.md | 2 +- src/ipips/ipip-0402.md | 12 +- src/ipips/ipip-0412.md | 240 +++++++++++++++++++++++++ 3 files changed, 247 insertions(+), 7 deletions(-) create mode 100644 src/ipips/ipip-0412.md diff --git a/src/http-gateways/trustless-gateway.md b/src/http-gateways/trustless-gateway.md index 21e8b9e5e..e3ee0fe7f 100644 --- a/src/http-gateways/trustless-gateway.md +++ b/src/http-gateways/trustless-gateway.md @@ -69,7 +69,7 @@ Below response types MUST to be supported: - [application/vnd.ipld.raw](https://www.iana.org/assignments/media-types/application/vnd.ipld.raw) – requests a single, verifiable raw block to be returned Below response types SHOULD to be supported: -- [application/vnd.ipld.car](https://www.iana.org/assignments/media-types/application/vnd.ipld.car) – disables IPLD/IPFS deserialization, requests a verifiable CAR stream to be returned +- [application/vnd.ipld.car](https://www.iana.org/assignments/media-types/application/vnd.ipld.car) – disables IPLD/IPFS deserialization, requests a verifiable CAR stream to be returned, implementations MAY support optional parameters (:cite[ipip-0412]) - [application/vnd.ipfs.ipns-record](https://www.iana.org/assignments/media-types/application/vnd.ipfs.ipns-record) – requests a verifiable :cite[ipns-record] (multicodec `0x0300`). 
Gateway SHOULD return HTTP 400 Bad Request when running in strict trustless diff --git a/src/ipips/ipip-0402.md b/src/ipips/ipip-0402.md index 9176ae8a9..7de1f054b 100644 --- a/src/ipips/ipip-0402.md +++ b/src/ipips/ipip-0402.md @@ -88,7 +88,7 @@ Terse rationale for each feature: - Trustless HTTP Clients will be able to fetch a CAR with a file, byte range, or a directory enumeration using a way lower number of HTTP requests, which - will translate to improved resouce utilization, longer battery time on + will translate to improved resource utilization, longer battery time on mobile, and lower latency due to lower number of round trips. - CAR files downloaded from HTTP Gateways will always be end-to-end verifiable. @@ -104,7 +104,7 @@ Terse rationale for each feature: - [HTTP retrieval in Boost](https://boost.filecoin.io/http-retrieval) - [bifrost-gateway](https://github.com/ipfs/bifrost-gateway) -- Trustless Gateway is solidifed as the ecosystem wide standard. +- Trustless Gateway is solidified as the ecosystem wide standard. - IPIP tests added to [gateway-conformance](https://github.com/ipfs/gateway-conformance) test @@ -113,7 +113,7 @@ Terse rationale for each feature: - End users are empowered with primitives and tools that reduce retrieval cost, encourage self-hosting, or make validation of conformance claims of - free or comercial gateways possible. + free or commercial gateways possible. ### Compatibility @@ -174,7 +174,7 @@ Due to this, gateway specification changes introduced in this IPIP clarify that: - The CAR `roots` behavior is out of scope and flags that clients MAY ignore it. - CAR determinism is not present by default, responses may differ across requests and gateways. -- Opt-in determinism is possible, but standarized signaling mechanism does not +- Opt-in determinism is possible, but standardized signaling mechanism does not exist until we have IPIP-412 or similar. 
### Security @@ -213,7 +213,7 @@ knows it has child entity named `index.html`, and everyone would pay a lower cos to lower number of blocks being returned in a single round-trip, instead of two. Rhea/Saturn projects requested this to be out of scope for now, but this "web" -entity scope could be added in the future, as a follow-up optimiziation IPIP. +entity scope could be added in the future, as a follow-up optimization IPIP. #### Requesting specific DAG depth @@ -234,7 +234,7 @@ Relevant tests were added to [gateway-conformance](https://github.com/ipfs/gateway-conformance) test suite in [#56](https://github.com/ipfs/gateway-conformance/pull/56) and [#85](https://github.com/ipfs/gateway-conformance/issues/85). -Detailed list of compliance checks for `dag-scope` and `entity-bytes` can be found in +A detailed list of compliance checks for `dag-scope` and `entity-bytes` can be found in [`v0.2.0/trustless_gateway_car_test.go`](https://github.com/ipfs/gateway-conformance/blob/v0.2.0/tests/trustless_gateway_car_test.go) or later. Below are CIDs, CARs, and short summary of each fixture. diff --git a/src/ipips/ipip-0412.md b/src/ipips/ipip-0412.md new file mode 100644 index 000000000..16ab2960a --- /dev/null +++ b/src/ipips/ipip-0412.md @@ -0,0 +1,240 @@ +--- +title: "IPIP-0412: Signaling Block Order in CARs on HTTP Gateways" +date: 2023-05-15 +ipip: proposal +editors: + - name: Marcin Rataj + github: lidel + url: https://lidel.org/ + - name: Jorropo + github: Jorropo +relatedIssues: + - https://github.com/ipfs/specs/issues/348 + - https://github.com/ipfs/specs/pull/330 + - https://github.com/ipfs/specs/pull/402 + - https://github.com/ipfs/specs/pull/412 +order: 412 +tags: ['ipips'] +--- + +## Summary + +Adds support for additional, optional content type options that allow the +client and server to signal or negotiate a specific block order in the returned +CAR. + +## Motivation + +We want to make it easier to build light-clients for IPFS. 
We want them to have +low memory footprints on arbitrary sized files. The main pain point preventing +this is the fact that CAR ordering isn't specified. + +This require to keeping some kind of reference either on disk, or in memory to +previously seen blocks for two reasons. + +1. Blocks can arrive out of order, meaning when a block is consumed (data is + red and returned to the consumer) and when it's received might not match. +1. Blocks can be reused multiple times, this is handy for cases when you plan + to cache on disk but not at all when you want to process a stream with use & + forget policy. + +What we really want is for the gateway to help us a bit, and give us blocks in +a useful order. + +The existing Trustless Gateway specification does not provide a mechanism for +negotiating the order of blocks in CAR responses. + +This IPIP aims to improve the status quo. + +## Detailed design + +CAR content type +([`application/vnd.ipld.car`](https://www.iana.org/assignments/media-types/application/vnd.ipld.car)) +already supports `version` parameter, which allows gateway to indicate which +CAR flavour is returned with the response. + +The proposed solution introduces two new parameters for the content type headers +in HTTP requests and responses: `order` and `dups`. + +The `order` parameter allows the client to indicate its preference for a +specific block order in the CAR response, and the `dups` parameter specifies +whether duplicate blocks are allowed in the response. + +### Signaling in Request + +Content type negotiation is based on section 12.5.1 of :cite[rfc9110]. + +Clients MAY indicate their preferred block order by sending an `Accept` header in +the HTTP request. 
The `Accept` header format is as follows: + +``` +Accept: application/vnd.ipld.car; version=1; order=dfs; dups=y +``` + +In the future, when more orders or parameters exist, clients will be able to +specify a list of preferences, for example: + +``` +Accept: application/vnd.ipld.car;order=foo, application/vnd.ipld.car;order=dfs;dups=y;q=0.5 +``` + +The above example is a list of preferences, the client would really like to use +the hypothetical `order=foo` however if this isn't available it would accept +`order=dfs` with `dups=y` instead (lower priority indicated via `q` parameter, +as noted in :cite[rfc9110]). + +#### `order` CAR content type parameter + +The `order` parameter accepts the following values: + +- `dfs`: [Depth-First Search](https://en.wikipedia.org/wiki/Depth-first_search) + order, allows for streaming responses with minimal memory usage +- `rnd`: Unknown (random) order, the implicit default when `order` parameter is missing. + +#### `dups` CAR content type parameter + +The `dups` parameter specifies whether duplicate blocks (the same block +occuring multiple times in the requested DAG) will be present in the CAR +response. + +It accepts two values: +- `y`: duplicate blocks are allowed +- `n`: duplicates are not allowed + +When allowed (`y`), light clients are able to discard blocks after +reading them, removing the need for caching in-memory or on-disk. + + + +### Signaling in Response + +The Trustless Gateway MUST always respond with a `Content-Type` header that includes +information about all supported/known parameters, even if the client did not +specify them in the request. + +The `Content-Type` header format is as follows: + +``` +Content-Type: application/vnd.ipld.car;version=1;order=dfs;dups=y +``` + + +Gateway implementations are free to decide on the implicit default ordering or +other parameters, and use it in responses when client did not explicitly +specify, or requested unsupported or unknown query parameter. 
+ +Implementations MAY choose to implement only some of the parameters. + +## Design rationale + +The proposed specification change aims to address the limitations of the +existing Trustless Gateway specification by introducing a mechanism for +negotiating the block order in CAR responses. + +By allowing clients to indicate their preferred block order, Trustless Gateways +can cache CAR responses for popular content, resulting in improved performance +and reduced network load. Clients benefit from more efficient data handling by +deserializing blocks as they arrive. + +We reuse existing HTTP content type negotiation, and the CAR content type, which +already had the optional `version` parameter. + +### User benefit + +The proposed specification change brings several benefits to end users: + +1. Improved Performance: Gateways can decide on their implicit default ordering + and cache CAR responses for popular content. In turn, clients can benefit + from strong `Etag` in ordered (deterministic) responses. This reduces the + response time for subsequent requests, resulting in faster content retrieval + for users. + +2. Reduced Memory Usage: Clients no longer need to buffer the entire CAR + response in memory until the deserialization of the requested entity is + finished. With the ability to deserialize blocks as they arrive, users can + conserve memory resources, especially when dealing with large CAR responses. + +3. Efficient Data Handling: By discarding blocks as soon as the CID is + validated and data is deserialized, clients can efficiently process the data + in real-time. This is particularly useful for light clients, IoT devices, + mobile web browsers, and other streaming applications where immediate access + to the data is required. + +4. Customizable Ordering: Clients can indicate their preferred block order in the + `Accept` header, allowing them to prioritize specific ordering strategies that + align with their use cases. 
This flexibility enhances the user experience + and empowers users to optimize content retrieval according to their needs. + +### Compatibility + +The proposed specification change is backward compatible with existing client +and server implementations. + +Trustless Gateways that do not support the negotiation of block order in CAR +responses will continue to function as before, providing their existing default +behavior, and the clients will be able to detect it by inspecting the +`Content-Type` header present in HTTP response. + +Clients that do not send the `Accept` header or do not recognize the `order` +and `dups` parameters in the `Content-Type` header will receive and process CAR +responses as they did before: buffering/caching all blocks until done with the +final deserialization. + +Existing implementations can choose to adopt the new specification and +implement support for the negotiation of block order incrementally. This allows +for a smooth transition and ensures compatibility with both new and old +clients. + +### Security + +The proposed specification change does not introduce any negative security +implications beyond those already present in the existing Trustless Gateway +specification. It focuses on enhancing performance and data handling without +affecting the underlying security model of IPFS. + +Light clients with support for `order` and `dups` CAR content type parameters +will be able to detect malicious response faster, reducing risks of +memory-based DoS attacks from malicious gateways. + +### Alternatives + +Several alternative approaches were considered before arriving at the proposed solution: + +1. Implicit Server-Side Configuration: Instead of negotiating the block order, + in the CAR response, the Trustless Gateway could have a server-side + configuration that specifies the default order. However, this approach would + limit the flexibility for clients, requiring them to have prior knowledge + about order supported by each gateway. 
+ +2. Fixed Block Order: Another option was to enforce a fixed block order in the + CAR responses. However, this approach would not cater to the varying needs + and preferences of different clients and use cases, and is not backward + compatible with the existing Trustless Gateways which return CAR responses + with Weak `Etag` and unspecified block order. + +3. Separate `X-` HTTP Header: Introduction of a separate HTTP header was + rejected because we try to use HTTP semantics where possible, and gateways + already use HTTP content type negotiation for CAR `version` and reusing it + saves a few bytes in each round-trip. Also, :cite[rfc6648] advises against + use of `X-` and similar constructs in new protocols. + +The proposed solution of negotiating the block order through headers is +future-proof, allows for flexibility, interoperability, and customization while +maintaining compatibility with existing implementations. + +## Test fixtures + +Implementation compliance can be determined by testing the negotiation process +between clients and Trustless Gateways using various combinations of `order` and +`dups` parameters. + +TODO: +1. a CAR with blocks for a small file in DFS order +2. a CAR with blocks for a small file with one block appearing twice + + +### Copyright + +Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/). From c754850daed9277026c5ebbc399356cffaa5464a Mon Sep 17 00:00:00 2001 From: Marcin Rataj Date: Tue, 16 May 2023 22:22:12 +0200 Subject: [PATCH 02/12] ipip-0412: clarify dups CAR parameter --- src/ipips/ipip-0412.md | 22 +++++++++++++++------- 1 file changed, 15 insertions(+), 7 deletions(-) diff --git a/src/ipips/ipip-0412.md b/src/ipips/ipip-0412.md index 16ab2960a..ef7ab4b53 100644 --- a/src/ipips/ipip-0412.md +++ b/src/ipips/ipip-0412.md @@ -56,9 +56,9 @@ CAR flavour is returned with the response. 
The proposed solution introduces two new parameters for the content type headers in HTTP requests and responses: `order` and `dups`. -The `order` parameter allows the client to indicate its preference for a -specific block order in the CAR response, and the `dups` parameter specifies -whether duplicate blocks are allowed in the response. +The `order` parameter lets the client specify the desired block order in the +CAR response, while the `dups` parameter determines if a block is sent multiple +times when it appears more than once in the requested DAG. ### Signaling in Request @@ -95,15 +95,23 @@ The `order` parameter accepts the following values: The `dups` parameter specifies whether duplicate blocks (the same block occuring multiple times in the requested DAG) will be present in the CAR -response. +response. Useful when a deterministic block order is used. It accepts two values: -- `y`: duplicate blocks are allowed -- `n`: duplicates are not allowed +- `y`: Duplicate blocks MUST be sent every time they occur during the DAG walk. +- `n`: Duplicate blocks MUST be sent only once. -When allowed (`y`), light clients are able to discard blocks after +When set to `y`, light clients are able to discard blocks after reading them, removing the need for caching in-memory or on-disk. +Setting to `n` allows for more efficient data transfer of certain types of data, +but introduces additional resource cost on the receiving end. + +If the `dups` parameter is not present in the `Content-Type` header, the +behavior is unspecified, and the CAR response includes an arbitrary list of +blocks. In this case, the client should assume `n` as the default, but ignore +duplicates if they are present. 
+ From dc36de2ac85b0394ac66e5ace67e779448809b15 Mon Sep 17 00:00:00 2001 From: Marcin Rataj Date: Wed, 17 May 2023 10:12:10 +0200 Subject: [PATCH 03/12] ipip-0412: clarify car order=unk https://github.com/ipfs/specs/pull/412#discussion_r1194061890 --- src/ipips/ipip-0412.md | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/src/ipips/ipip-0412.md b/src/ipips/ipip-0412.md index ef7ab4b53..fa0f9e248 100644 --- a/src/ipips/ipip-0412.md +++ b/src/ipips/ipip-0412.md @@ -85,11 +85,15 @@ as noted in :cite[rfc9110]). #### `order` CAR content type parameter -The `order` parameter accepts the following values: +The `order` parameter allows clients to specify the desired block order in the +response. It supports the following values: - `dfs`: [Depth-First Search](https://en.wikipedia.org/wiki/Depth-first_search) - order, allows for streaming responses with minimal memory usage -- `rnd`: Unknown (random) order, the implicit default when `order` parameter is missing. + order, enables streaming responses with minimal memory usage. +- `unk`: Unknown order, which serves as the implicit default when the order + parameter is missing. In this case, the client cannot make any assumptions + about the block order: blocks may arrive in a random order or be a result of + a custom DAG traversal algorithm. 
#### `dups` CAR content type parameter From 312df5706d6ba38b71b476a944cf26de416f4130 Mon Sep 17 00:00:00 2001 From: Marcin Rataj Date: Wed, 17 May 2023 15:57:07 +0200 Subject: [PATCH 04/12] ipip-412: HTTP 406 on unsupported CAR param --- src/ipips/ipip-0412.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/src/ipips/ipip-0412.md b/src/ipips/ipip-0412.md index fa0f9e248..9bd8ebebc 100644 --- a/src/ipips/ipip-0412.md +++ b/src/ipips/ipip-0412.md @@ -135,9 +135,10 @@ Content-Type: application/vnd.ipld.car;version=1;order=dfs;dups=y Gateway implementations are free to decide on the implicit default ordering or other parameters, and use it in responses when client did not explicitly -specify, or requested unsupported or unknown query parameter. +specify any matching preference. -Implementations MAY choose to implement only some of the parameters. +Implementations MAY choose to implement only some of the parameters and return +HTTP 406 Not Acceptable when client requested a response with unsupported one. ## Design rationale From 205a04d9d9014dea01dfe031999accae9ea8395a Mon Sep 17 00:00:00 2001 From: Marcin Rataj Date: Wed, 17 May 2023 17:26:02 +0200 Subject: [PATCH 05/12] ipip-412: explicitly forbid identity cid blocks --- src/ipips/ipip-0412.md | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/src/ipips/ipip-0412.md b/src/ipips/ipip-0412.md index 9bd8ebebc..a1519a311 100644 --- a/src/ipips/ipip-0412.md +++ b/src/ipips/ipip-0412.md @@ -116,9 +116,16 @@ behavior is unspecified, and the CAR response includes an arbitrary list of blocks. In this case, the client should assume `n` as the default, but ignore duplicates if they are present. - +:::warning + +The specified parameter does not apply to virtual blocks identified by identity +CIDs. CAR responses MUST never include these virtual blocks. The parameter in +question is meant to control the behavior of non-virtual blocks in the +response. 
Therefore, it does not have any effect on virtual blocks, and they +should never be included in the CAR response, no matter if present, or what +value is set. + +::: ### Signaling in Response From 51cc1774d5ce6b98b02ec50ac8701caaedaa1c45 Mon Sep 17 00:00:00 2001 From: Marcin Rataj Date: Wed, 17 May 2023 17:28:52 +0200 Subject: [PATCH 06/12] chore: typo Co-authored-by: Oli Evans --- src/ipips/ipip-0412.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/ipips/ipip-0412.md b/src/ipips/ipip-0412.md index a1519a311..664d08b18 100644 --- a/src/ipips/ipip-0412.md +++ b/src/ipips/ipip-0412.md @@ -33,7 +33,7 @@ This require to keeping some kind of reference either on disk, or in memory to previously seen blocks for two reasons. 1. Blocks can arrive out of order, meaning when a block is consumed (data is - red and returned to the consumer) and when it's received might not match. + read and returned to the consumer) and when it's received might not match. 1. Blocks can be reused multiple times, this is handy for cases when you plan to cache on disk but not at all when you want to process a stream with use & forget policy. From 742d64122db92827dca5525b59efad1c146aa576 Mon Sep 17 00:00:00 2001 From: Marcin Rataj Date: Mon, 15 May 2023 14:33:10 +0200 Subject: [PATCH 07/12] IPIP-412: Signaling Block Order in CARs on Gateways First draft based on various prior art and recent discussions cited in the header front matter. --- src/ipips/ipip-0412.md | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/src/ipips/ipip-0412.md b/src/ipips/ipip-0412.md index 664d08b18..c0b20beda 100644 --- a/src/ipips/ipip-0412.md +++ b/src/ipips/ipip-0412.md @@ -34,6 +34,7 @@ previously seen blocks for two reasons. 1. Blocks can arrive out of order, meaning when a block is consumed (data is read and returned to the consumer) and when it's received might not match. + 1. 
Blocks can be reused multiple times, this is handy for cases when you plan to cache on disk but not at all when you want to process a stream with use & forget policy. @@ -56,9 +57,9 @@ CAR flavour is returned with the response. The proposed solution introduces two new parameters for the content type headers in HTTP requests and responses: `order` and `dups`. -The `order` parameter lets the client specify the desired block order in the -CAR response, while the `dups` parameter determines if a block is sent multiple -times when it appears more than once in the requested DAG. +The `order` parameter allows the client to indicate its preference for a +specific block order in the CAR response, and the `dups` parameter specifies +whether duplicate blocks are allowed in the response. ### Signaling in Request @@ -127,6 +128,10 @@ value is set. ::: + + ### Signaling in Response The Trustless Gateway MUST always respond with a `Content-Type` header that includes From 02a8465218c412dadf7dbae06a178114b23c9892 Mon Sep 17 00:00:00 2001 From: Marcin Rataj Date: Thu, 6 Jul 2023 17:46:54 +0200 Subject: [PATCH 08/12] ipip-412: refactor on top of ipip-402 - moving spec details to trustless-gateway - rebasing on top of ipip-402 - fixing linter --- src/http-gateways/path-gateway.md | 6 +- src/http-gateways/trustless-gateway.md | 186 +++++++++++++++++++++---- src/ipips/ipip-0412.md | 110 +++------------ 3 files changed, 181 insertions(+), 121 deletions(-) diff --git a/src/http-gateways/path-gateway.md b/src/http-gateways/path-gateway.md index e23061ddb..61aa6b568 100644 --- a/src/http-gateways/path-gateway.md +++ b/src/http-gateways/path-gateway.md @@ -595,11 +595,7 @@ The following response types require an explicit opt-in, can only be requested w - Raw Block (`?format=raw`) - Opaque bytes, see [application/vnd.ipld.raw](https://www.iana.org/assignments/media-types/application/vnd.ipld.raw). 
- CAR (`?format=car`) - - A CAR file or a stream that contains all blocks required to trustlessly verify the requested content path query, see [application/vnd.ipld.car](https://www.iana.org/assignments/media-types/application/vnd.ipld.car) and :cite[trustless-gateway]. - - **Note:** by default, block order in CAR response is not deterministic, - blocks can be returned in different order, depending on implementation - choices (traversal, speed at which blocks arrive from the network, etc). - An opt-in ordered CAR responses MAY be introduced in a future IPIP. + - A CAR file or a stream that contains all blocks required to trustlessly verify the requested content path query, see [application/vnd.ipld.car](https://www.iana.org/assignments/media-types/application/vnd.ipld.car) and Section 5 (CAR Responses) at :cite[trustless-gateway]. - TAR (`?format=tar`) - Deserialized UnixFS files and directories as a TAR file or a stream, see :cite[ipip-0288]. - IPNS Record diff --git a/src/http-gateways/trustless-gateway.md b/src/http-gateways/trustless-gateway.md index e3ee0fe7f..ba073eda0 100644 --- a/src/http-gateways/trustless-gateway.md +++ b/src/http-gateways/trustless-gateway.md @@ -13,6 +13,10 @@ editors: - name: Henrique Dias github: hacdias url: https://hacdias.com/ +xref: + - url + - path-gateway + - ipip-0412 tags: ['httpGateways', 'lowLevelHttpGateways'] order: 1 --- @@ -25,11 +29,11 @@ The minimal implementation means: - response type is always fully verifiable: client can decide between a raw block or a CAR stream - no UnixFS/IPLD deserialization -- for CAR files: - - the behavior is identical to :cite[path-gateway] - for raw blocks: - data is requested by CID, only supported path is `/ipfs/{cid}` - no path traversal or recursive resolution +- for CAR files: + - the pathing behavior is identical to :cite[path-gateway] # HTTP API @@ -63,14 +67,23 @@ Same as in :cite[path-gateway], but with limited number of supported response ty ### `Accept` (request header) -This 
HTTP header is required when running in a strict, trustless mode. +A Client SHOULD send this HTTP header to leverage content type negotiation +based on section 12.5.1 of :cite[rfc9110]. Below response types MUST to be supported: -- [application/vnd.ipld.raw](https://www.iana.org/assignments/media-types/application/vnd.ipld.raw) – requests a single, verifiable raw block to be returned + +- [application/vnd.ipld.raw](https://www.iana.org/assignments/media-types/application/vnd.ipld.raw) + - A single, verifiable raw block to be returned. Below response types SHOULD to be supported: -- [application/vnd.ipld.car](https://www.iana.org/assignments/media-types/application/vnd.ipld.car) – disables IPLD/IPFS deserialization, requests a verifiable CAR stream to be returned, implementations MAY support optional parameters (:cite[ipip-0412]) -- [application/vnd.ipfs.ipns-record](https://www.iana.org/assignments/media-types/application/vnd.ipfs.ipns-record) – requests a verifiable :cite[ipns-record] (multicodec `0x0300`). + +- [application/vnd.ipld.car](https://www.iana.org/assignments/media-types/application/vnd.ipld.car) + - Disables IPLD/IPFS deserialization, requests a verifiable CAR stream to be + returned, implementations MAY support optional CAR content type parameters + (:cite[ipip-0412]) and the explicit [CAR format signaling in HTTP Request](#car-format-signaling-in-request). + +- [application/vnd.ipfs.ipns-record](https://www.iana.org/assignments/media-types/application/vnd.ipfs.ipns-record) + - A verifiable :cite[ipns-record] (multicodec `0x0300`). Gateway SHOULD return HTTP 400 Bad Request when running in strict trustless mode (no deserialized responses) and `Accept` header is missing. @@ -168,35 +181,37 @@ Below MUST be implemented **in addition** to "HTTP Response" of :cite[path-gatew MUST be returned and include additional format-specific parameters when possible. -If a CAR stream was requested, the response MUST include the parameter specifying CAR version. 
-For example: `Content-Type: application/vnd.ipld.car; version=1` +If a CAR stream was requested: +- the response MUST include the parameter specifying CAR version. For example: + `Content-Type: application/vnd.ipld.car; version=1` +- the response SHOULD include additional content type parameters, as noted in + [CAR format signaling in Response](#car-format-signaling-in-response). ### `Content-Disposition` (response header) MUST be returned and set to `attachment` to ensure requested bytes are not rendered by a web browser. -## Response Payload - -### Block Response +# Block Responses (application/vnd.ipld.raw) An opaque bytes matching the requested block CID ([application/vnd.ipld.raw](https://www.iana.org/assignments/media-types/application/vnd.ipld.raw)). The Body hash MUST match the Multihash from the requested CID. -### CAR Response +# CAR Responses (application/vnd.ipld.car) A CAR stream for the requested [application/vnd.ipld.car](https://www.iana.org/assignments/media-types/application/vnd.ipld.car) -content type, path and optional `dag-scope` and `entity-bytes` URL parameters. +content type (with optional `order` and `dups` params), path and optional +`dag-scope` and `entity-bytes` URL parameters. -#### CAR version +## CAR version Value returned in [`CarV1Header.version`](https://ipld.io/specs/transport/car/carv1/#header) field MUST match the `version` parameter returned in `Content-Type` header. -#### CAR roots +## CAR roots The behavior associated with the [`CarV1Header.roots`](https://ipld.io/specs/transport/car/carv1/#header) field @@ -210,27 +225,146 @@ As of 2023-06-20, the behavior of the `roots` CAR field remains an [unresolved ::: -#### CAR determinism +## CAR `order` (content type parameter) + +The `order` parameter allows clients to specify the desired block order in the +response. 
It supports the following values: + +- `dfs`: [Depth-First Search](https://en.wikipedia.org/wiki/Depth-first_search) + order, enables streaming responses with minimal memory usage. +- `unk` (or missing): Unknown order, which serves as the implicit default when the `order` + parameter is unspecified. In this case, the client cannot make any assumptions + about the block order: blocks may arrive in a random order or be a result of + a custom DAG traversal algorithm. + +A Gateway SHOULD always return explicit `order` in CAR's `Content-Type` response header. + +A Gateway MAY skip `order` in CAR response if no order was explicitly requested +by the client and the default order is unknown. + +A Client MUST assume implicit `order=unk` when `order` is missing, unknown, or empty. + +## CAR `dups` (content type parameter) -The default CAR header and block order in a CAR response is not specified and is non-deterministic. +The `dups` parameter specifies whether duplicate blocks (the same block +occurring multiple times in the requested DAG) will be present in the CAR +response. Useful when a deterministic block order is used. + +It accepts two values: +- `y`: Duplicate blocks MUST be sent every time they occur during the DAG walk. +- `n`: Duplicate blocks MUST be sent only once. + +When set to `y`, light clients are able to discard blocks after +reading them, removing the need for caching in-memory or on-disk. + +Setting to `n` allows for more efficient data transfer of certain types of +data, but introduces additional resource cost on the receiving end, as each +block needs to be kept around in case its CID appears again. + +A Client MUST not assume any implicit behavior when `dups` is missing. + +If the `dups` parameter is not present in the `Content-Type` header, the +behavior is unspecified, and the CAR response includes an arbitrary list of +blocks. 
In this unknown state, the client MUST assume duplicates are not sent, +but also MUST ignore duplicates if they are present. + +A Gateway MUST always return `dups` in `Content-Type` response header +when the duplicate status is known at the time of response. + +A Gateway MAY skip `dups` if it was not present in `Accept` header sent by the +client or if it is not possible to tell the duplicate status. + +:::warning + +The specified parameter does not apply to virtual blocks identified by identity +CIDs. CAR responses MUST never include these virtual blocks. The parameter in +question is meant to control the behavior of non-virtual blocks in the +response. Therefore, it does not have any effect on virtual blocks, and they +should never be included in the CAR response, no matter if present, or what +value is set. + +::: + +## CAR format parameters and determinism + +The default header and block order in a CAR format is not specified by IPLD specifications. Clients MUST NOT assume that CAR responses are deterministic (byte-for-byte identical) across different gateways. Clients MUST NOT assume that CAR includes CIDs and their blocks in the same order across different gateways. +Clients MUST assume block order and duplicate status only if `Content-Type` returned with CAR responses includes optional `order` or `dups` parameters, as specified by :cite[ipip-0412]. + +A Gateway SHOULD support some aspects of determinism by implementing content type negotiation and signaling via `Accept` and `Content-Type` headers. 
+ :::issue -In controlled environments, clients MAY choose to rely on undocumented CAR determinism, -subject to the agreement of the following conditions between the client and the -gateway: +In controlled environments, clients MAY choose to rely on implicit and +undocumented CAR determinism, subject to the agreement of the following +conditions between the client and the gateway: - CAR version - content of [`CarV1Header.roots`](https://ipld.io/specs/transport/car/carv1/#header) field -- order of blocks -- status of duplicate blocks +- order of blocks (`order` from :cite[ipip-0412]) +- status of duplicate blocks (`dups` from :cite[ipip-0412]) -In the future, there may be an introduction of a convention to indicate aspects -of determinism in CAR responses. Please refer to -[IPIP-412](https://github.com/ipfs/specs/pull/412) for potential developments -in this area. +Mind this is undocumented behavior, and MUST NOT be used on public networks. ::: + +### CAR format signaling in Request + +Content type negotiation is based on section 12.5.1 of :cite[rfc9110]. + +Clients MAY indicate their preferred block order by sending an `Accept` header in +the HTTP request. The `Accept` header format is as follows: + +``` +Accept: application/vnd.ipld.car; version=1; order=dfs; dups=y +``` + +In the future, when more orders or parameters exist, clients will be able to +specify a list of preferences, for example: + +``` +Accept: application/vnd.ipld.car;order=foo, application/vnd.ipld.car;order=dfs;dups=y;q=0.5 +``` + +The above example is a list of preferences, the client would really like to use +the hypothetical `order=foo` however if this isn't available it would accept +`order=dfs` with `dups=y` instead (lower priority indicated via `q` parameter, +as noted in :cite[rfc9110]). 
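The preference list in the `Accept` example above can be assembled programmatically. The following Python sketch is illustrative only: the function name `build_car_accept` and its input shape (a list of parameter dictionaries paired with an optional `q` weight) are assumptions made for this example, and `order=foo` remains the hypothetical value used in the text.

```python
# Hypothetical helper: builds an Accept header value listing CAR variants
# in order of preference, using the q parameter for lower-priority entries.
CAR_MEDIA_TYPE = "application/vnd.ipld.car"


def build_car_accept(preferences):
    """preferences: list of (params_dict, q_or_None), most preferred first."""
    variants = []
    for params, q in preferences:
        parts = [CAR_MEDIA_TYPE]
        parts += [f"{key}={val}" for key, val in params.items()]
        # q defaults to 1 when omitted, so only emit it for lower weights.
        if q is not None and q < 1:
            parts.append(f"q={q:g}")
        variants.append(";".join(parts))
    return ", ".join(variants)


header = build_car_accept([
    ({"order": "foo"}, None),               # preferred, hypothetical order
    ({"order": "dfs", "dups": "y"}, 0.5),   # fallback with lower priority
])
```

Here `header` holds the same value as the second `Accept` example in the text, which a client could pass to whichever HTTP library it uses.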
+ +### CAR format signaling in Response + +The Trustless Gateway MUST always respond with a `Content-Type` header that includes +information about all supported and known parameters, even if the client did not +specify them in the request. + +The `Content-Type` header format is as follows: + +``` +Content-Type: application/vnd.ipld.car;version=1;order=dfs;dups=n +``` + +Gateway implementations SHOULD decide on the implicit default ordering or +other parameters, and use it in responses when client did not explicitly +specify any matching preference. + +A Gateway MAY choose to implement only some of the parameters and return HTTP +400 Bad Request or 406 Not Acceptable when client requested a response with +unsupported content type variant. + +A Client MUST verify `Content-Type` returned with CAR response before +processing the payload, as the legacy gateway may not support optional content +type parameters like `order` an `dups` and return plain +`application/vnd.ipld.car`. + +# IPNS Record Responses (application/vnd.ipfs.ipns-record) + +An opaque bytes matching the [Signed IPNS Record](https://specs.ipfs.tech/ipns/ipns-record/#ipns-record) +for the requested [IPNS Name](https://specs.ipfs.tech/ipns/ipns-record/#ipns-name) +returned as [application/vnd.ipfs.ipns-record](https://www.iana.org/assignments/media-types/application/vnd.ipfs.ipns-record). + +A Client MUST confirm the record signature match `libp2p-key` from the requested IPNS Name. + +A Client MUST [perform additional record verification according to the IPNS specification](https://specs.ipfs.tech/ipns/ipns-record/#record-verification). diff --git a/src/ipips/ipip-0412.md b/src/ipips/ipip-0412.md index c0b20beda..94d42ddf0 100644 --- a/src/ipips/ipip-0412.md +++ b/src/ipips/ipip-0412.md @@ -61,96 +61,10 @@ The `order` parameter allows the client to indicate its preference for a specific block order in the CAR response, and the `dups` parameter specifies whether duplicate blocks are allowed in the response. 
-### Signaling in Request +A Client SHOULD sent `Accept` HTTP header to leverage content type negotiation +based on section 12.5.1 of :cite[rfc9110] to get the preferred response type. -Content type negotiation is based on section 12.5.1 of :cite[rfc9110]. - -Clients MAY indicate their preferred block order by sending an `Accept` header in -the HTTP request. The `Accept` header format is as follows: - -``` -Accept: application/vnd.ipld.car; version=1; order=dfs; dups=y -``` - -In the future, when more orders or parameters exist, clients will be able to -specify a list of preferences, for example: - -``` -Accept: application/vnd.ipld.car;order=foo, application/vnd.ipld.car;order=dfs;dups=y;q=0.5 -``` - -The above example is a list of preferences, the client would really like to use -the hypothetical `order=foo` however if this isn't available it would accept -`order=dfs` with `dups=y` instead (lower priority indicated via `q` parameter, -as noted in :cite[rfc9110]). - -#### `order` CAR content type parameter - -The `order` parameter allows clients to specify the desired block order in the -response. It supports the following values: - -- `dfs`: [Depth-First Search](https://en.wikipedia.org/wiki/Depth-first_search) - order, enables streaming responses with minimal memory usage. -- `unk`: Unknown order, which serves as the implicit default when the order - parameter is missing. In this case, the client cannot make any assumptions - about the block order: blocks may arrive in a random order or be a result of - a custom DAG traversal algorithm. - -#### `dups` CAR content type parameter - -The `dups` parameter specifies whether duplicate blocks (the same block -occuring multiple times in the requested DAG) will be present in the CAR -response. Useful when a deterministic block order is used. - -It accepts two values: -- `y`: Duplicate blocks MUST be sent every time they occur during the DAG walk. -- `n`: Duplicate blocks MUST be sent only once. 
- -When set to `y`, light clients are able to discard blocks after -reading them, removing the need for caching in-memory or on-disk. - -Setting to `n` allows for more efficient data transfer of certain types of data, -but introduces additional resource cost on the receiving end. - -If the `dups` parameter is not present in the `Content-Type` header, the -behavior is unspecified, and the CAR response includes an arbitrary list of -blocks. In this case, the client should assume `n` as the default, but ignore -duplicates if they are present. - -:::warning - -The specified parameter does not apply to virtual blocks identified by identity -CIDs. CAR responses MUST never include these virtual blocks. The parameter in -question is meant to control the behavior of non-virtual blocks in the -response. Therefore, it does not have any effect on virtual blocks, and they -should never be included in the CAR response, no matter if present, or what -value is set. - -::: - - - -### Signaling in Response - -The Trustless Gateway MUST always respond with a `Content-Type` header that includes -information about all supported/known parameters, even if the client did not -specify them in the request. - -The `Content-Type` header format is as follows: - -``` -Content-Type: application/vnd.ipld.car;version=1;order=dfs;dups=y -``` - - -Gateway implementations are free to decide on the implicit default ordering or -other parameters, and use it in responses when client did not explicitly -specify any matching preference. - -Implementations MAY choose to implement only some of the parameters and return -HTTP 406 Not Acceptable when client requested a response with unsupported one. +More details in Section 5. (CAR Responses) of :cite[trustless-gateway]. ## Design rationale @@ -245,7 +159,7 @@ Several alternative approaches were considered before arriving at the proposed s saves a few bytes in each round-trip. 
Also, :cite[rfc6648] advises against use of `X-` and similar constructs in new protocols. -The proposed solution of negotiating the block order through headers si +The proposed solution of negotiating the block order through headers is future-proof, allows for flexibility, interoperability, and customization while maintaining compatibility with existing implementations. @@ -255,10 +169,26 @@ Implementation compliance can be determined by testing the negotiation process between clients and Trustless Gateways using various combinations of `order` and `dups` parameters. +Relevant tests were added to +[gateway-conformance](https://github.com/ipfs/gateway-conformance) test suite +in [#87](https://github.com/ipfs/gateway-conformance/pull/87). + + + +Below are CIDs, CARs, and short summary of each fixture. + TODO: 1. a CAR with blocks for a small file in DFS order 2. a CAR with blocks for a small file with one block appearing twice +Tests for duplicates use a fixture where a directory contains two files that +are the same. If `dups=n`, then there are no duplicates. If `dups=y`, then the +blocks of the file are sent twice, by the order they show up in the DAG. + +The same fixture is used for testing `order=dfs`. ### Copyright From fb214aec60ce69c62f8ec51cfa42837b3758f8a0 Mon Sep 17 00:00:00 2001 From: Marcin Rataj Date: Thu, 6 Jul 2023 20:10:51 +0200 Subject: [PATCH 09/12] ipip-412: note why we kept separate parameters https://github.com/ipfs/specs/pull/412#discussion_r1231406682 --- src/ipips/ipip-0412.md | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/src/ipips/ipip-0412.md b/src/ipips/ipip-0412.md index 94d42ddf0..b8bd5cc00 100644 --- a/src/ipips/ipip-0412.md +++ b/src/ipips/ipip-0412.md @@ -159,6 +159,19 @@ Several alternative approaches were considered before arriving at the proposed s saves a few bytes in each round-trip. Also, :cite[rfc6648] advises against use of `X-` and similar constructs in new protocols. +4. 
The decision to use separate parameters for order and duplicates (`dups`), + instead of a single preset pack with predefined behavior, was driven + by considerations of ambiguity and potential future problems when adding + more determinism to responses. For instance, if we were to include a new + behavior like `foo=y|n` alongside an existing preset like `pack=orderdfs+dupsy`, + it would either necessitate the addition of a separate parameter or impose + the adoption of a new version of every preset (e.g., `orderdfs+dupsy+fooy` and + `orderdfs+dupsy+foon`). Maintaining and deploying such changes across a + decentralized ecosystem, where gateways may operate on different software, + becomes more complex. In contrast, utilizing separate parameters for each + behavior enables easier maintenance and deployment in a decentralized + ecosystem with varying gateway software. + The proposed solution of negotiating the block order through headers is future-proof, allows for flexibility, interoperability, and customization while maintaining compatibility with existing implementations. From 1d4a263fe84f31ae33e2c35a61d1ab2135e12c1e Mon Sep 17 00:00:00 2001 From: Marcin Rataj Date: Wed, 12 Jul 2023 22:33:41 +0200 Subject: [PATCH 10/12] ipip-412: clarify default behaviors --- src/http-gateways/trustless-gateway.md | 40 ++++++++++++++------------ 1 file changed, 21 insertions(+), 19 deletions(-) diff --git a/src/http-gateways/trustless-gateway.md b/src/http-gateways/trustless-gateway.md index ba073eda0..3ae339850 100644 --- a/src/http-gateways/trustless-gateway.md +++ b/src/http-gateways/trustless-gateway.md @@ -261,29 +261,31 @@ Setting to `n` allows for more efficient data transfer of certain types of data, but introduces additional resource cost on the receiving end, as each block needs to be kept around in case its CID appears again. +If the `dups` parameter is absent from the `Accept` request header, the +behavior is unspecified.
In such cases, a Gateway should respond with `dups=n` +if it has control over the duplicate status, or without `dups` parameter if it +does not. +Defaulting to the inclusion of duplicate blocks (`dups=y`) SHOULD only be +implemented by Gateway systems that exclusively support `dups=y` and do not +support any other behavior. + A Client MUST not assume any implicit behavior when `dups` is missing. -If the `dups` parameter is not present in the `Content-Type` header, the +If the `dups` parameter is absent from the `Content-Type` response header, the behavior is unspecified, and the CAR response includes an arbitrary list of blocks. In this unknown state, the client MUST assume duplicates are not sent, -but also MUST ignore duplicates if they are present. - -A Gateway MUST return always return `dups` in `Content-Type` response header -when the duplicate status is known at the time of response. - -A Gateway MAY skip `dups` if it was not present in `Accept` header sent by the -client or if it is not possible to tell the duplicate status. - -:::warning - -The specified parameter does not apply to virtual blocks identified by identity -CIDs. CAR responses MUST never include these virtual blocks. The parameter in -question is meant to control the behavior of non-virtual blocks in the -response. Therefore, it does not have any effect on virtual blocks, and they -should never be included in the CAR response, no matter if present, or what -value is set. - -::: +but also MUST ignore duplicates and other unexpected blocks if they are present. + +A Gateway MUST always return `dups` in `Content-Type` response header +when the duplicate status is known at the time of processing the request. +A Gateway SHOULD NOT return `dups` if determining the duplicate status is not +possible at the time of processing the request. + +A Gateway MUST NOT include virtual blocks identified by identity CIDs +(multihash with `0x00` code) in CAR responses.
This exclusion applies regardless +of their presence in the DAG or the value assigned to the "dups" parameter, as +the raw data is already present in the parent block that links to the identity +CID. ## CAR format parameters and determinism From 6c649a192ea36796f692d199a851e8e70cf550d2 Mon Sep 17 00:00:00 2001 From: Marcin Rataj Date: Wed, 12 Jul 2023 23:04:10 +0200 Subject: [PATCH 11/12] ipip-412: clarify CARv1 roots and content types --- src/http-gateways/trustless-gateway.md | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/src/http-gateways/trustless-gateway.md b/src/http-gateways/trustless-gateway.md index 617cbd6ea..6a99b58b1 100644 --- a/src/http-gateways/trustless-gateway.md +++ b/src/http-gateways/trustless-gateway.md @@ -41,11 +41,11 @@ A subset of "HTTP API" of :cite[path-gateway]. ## `GET /ipfs/{cid}[/{path}][?{params}]` -Downloads verifiable data for the specified **immutable** content path. +Downloads verifiable, content-addressed data for the specified **immutable** content path. -Optional `path` is permitted for requests that specify CAR format (`application/vnd.ipld.car`). +Optional `path` is permitted for requests that specify CAR format (`?format=car` or `Accept: application/vnd.ipld.car`). -For RAW requests, only `GET /ipfs/{cid}[?{params}]` is supported. +For block requests (`?format=raw` or `Accept: application/vnd.ipld.raw`), only `GET /ipfs/{cid}[?{params}]` is supported. ## `HEAD /ipfs/{cid}[/{path}][?{params}]` @@ -53,7 +53,7 @@ Same as GET, but does not return any payload. ## `GET /ipns/{key}[?{params}]` -Downloads data at specified IPNS Key. Verifiable :cite[ipns-record] can be requested via `?format=ipns-record` +Downloads data at specified IPNS Key. Verifiable :cite[ipns-record] can be requested via `?format=ipns-record` or `Accept: application/vnd.ipfs.ipns-record`. 
## `HEAD /ipns/{key}[?{params}]` @@ -85,7 +85,7 @@ Below response types SHOULD to be supported: - [application/vnd.ipfs.ipns-record](https://www.iana.org/assignments/media-types/application/vnd.ipfs.ipns-record) - A verifiable :cite[ipns-record] (multicodec `0x0300`). -Gateway SHOULD return HTTP 400 Bad Request when running in strict trustless +A Gateway SHOULD return HTTP 400 Bad Request when running in strict trustless mode (no deserialized responses) and `Accept` header is missing. ## Request Query Parameters @@ -229,7 +229,9 @@ The behavior associated with the [`CarV1Header.roots`](https://ipld.io/specs/transport/car/carv1/#header) field is not currently specified. -Clients MAY ignore it. +The lack of standard here means a client MUST assume different Gateways could return a different value. + +A Client SHOULD ignore this field. :::issue From 0b1d0e2c04501244d057816f6a7c3b5ec553c335 Mon Sep 17 00:00:00 2001 From: Marcin Rataj Date: Fri, 4 Aug 2023 15:23:46 +0200 Subject: [PATCH 12/12] ipip-412: add fixtures and final editorials --- src/http-gateways/trustless-gateway.md | 18 +++++------ src/ipips/ipip-0412.md | 42 ++++++++++++-------------- 2 files changed, 29 insertions(+), 31 deletions(-) diff --git a/src/http-gateways/trustless-gateway.md b/src/http-gateways/trustless-gateway.md index 5f9c5d141..949e2b0bf 100644 --- a/src/http-gateways/trustless-gateway.md +++ b/src/http-gateways/trustless-gateway.md @@ -67,15 +67,15 @@ Same as in :cite[path-gateway], but with limited number of supported response ty ### `Accept` (request header) -A Client SHOULD sent this HTTP header to leverage content type negotiation +A Client SHOULD send this HTTP header to leverage content type negotiation based on section 12.5.1 of :cite[rfc9110]. -Below response types MUST to be supported: +Below response types MUST be supported: - [application/vnd.ipld.raw](https://www.iana.org/assignments/media-types/application/vnd.ipld.raw) - A single, verifiable raw block to be returned. 
-Below response types SHOULD to be supported: +Below response types SHOULD be supported: - [application/vnd.ipld.car](https://www.iana.org/assignments/media-types/application/vnd.ipld.car) - Disables IPLD/IPFS deserialization, requests a verifiable CAR stream to be @@ -86,7 +86,7 @@ Below response types SHOULD to be supported: - A verifiable :cite[ipns-record] (multicodec `0x0300`). A Gateway SHOULD return HTTP 400 Bad Request when running in strict trustless -mode (no deserialized responses) and `Accept` header is missing. +mode (no deserialized responses) and `Accept` header is missing. ## Request Query Parameters @@ -126,7 +126,7 @@ When the terminating entity at the end of the specified content path: specified byte range of that entity. - When dealing with a sharded UnixFS file (`dag-pb`, `0x70`) and a non-zero - `from` value, the UnixFS data and `blocksizes` determine the + `from` value, the UnixFS data and `blocksizes` determine the corresponding starting block for a given `from` offset. - cannot be interpreted as a continuous array of bytes (such as a DAG-CBOR/JSON @@ -163,14 +163,14 @@ that includes enough blocks for the client to understand why the requested returned: - If the requested `entity-bytes` resolves to a range that partially falls - outside of the entity's byte range, the response MUST include the subset of + outside the entity's byte range, the response MUST include the subset of blocks within the entity's bytes. - This allows clients to request valid ranges of the entity without needing to know its total size beforehand, and it does not require the Gateway to buffer the entire entity before returning the response. - If the requested `entity-bytes` resolves to a zero-length range or falls - fully outside of the entity's bytes, the response is equivalent to + fully outside the entity's bytes, the response is equivalent to `dag-scope=block`. 
- This allows client to produce a meaningful error (e.g, in case of UnixFS, leverage `Data.blocksizes` information present in the root `dag-pb` block). @@ -366,8 +366,8 @@ Gateway implementations SHOULD decide on the implicit default ordering or other parameters, and use it in responses when client did not explicitly specify any matching preference. -A Gateway MAY choose to implement only some of the parameters and return HTTP -400 Bad Request or 406 Not Acceptable when client requested a response with +A Gateway MAY choose to implement only some parameters and return HTTP +400 Bad Request or 406 Not Acceptable when a client requested a response with unsupported content type variant. A Client MUST verify `Content-Type` returned with CAR response before diff --git a/src/ipips/ipip-0412.md b/src/ipips/ipip-0412.md index b8bd5cc00..93f96314c 100644 --- a/src/ipips/ipip-0412.md +++ b/src/ipips/ipip-0412.md @@ -1,13 +1,19 @@ --- title: "IPIP-0412: Signaling Block Order in CARs on HTTP Gateways" date: 2023-05-15 -ipip: proposal +ipip: ratified editors: - name: Marcin Rataj github: lidel url: https://lidel.org/ + affiliation: + name: Protocol Labs + url: https://protocol.ai/ - name: Jorropo github: Jorropo + affiliation: + name: Protocol Labs + url: https://protocol.ai/ relatedIssues: - https://github.com/ipfs/specs/issues/348 - https://github.com/ipfs/specs/pull/330 @@ -29,7 +35,7 @@ We want to make it easier to build light-clients for IPFS. We want them to have low memory footprints on arbitrary sized files. The main pain point preventing this is the fact that CAR ordering isn't specified. -This require to keeping some kind of reference either on disk, or in memory to +This requires keeping some kind of reference either on disk, or in memory to previously seen blocks for two reasons. 1. Blocks can arrive out of order, meaning when a block is consumed (data is @@ -52,7 +58,7 @@ This IPIP aims to improve the status quo. 
CAR content type ([`application/vnd.ipld.car`](https://www.iana.org/assignments/media-types/application/vnd.ipld.car)) already supports `version` parameter, which allows gateway to indicate which -CAR flavour is returned with the response. +CAR flavor is returned with the response. The proposed solution introduces two new parameters for the content type headers in HTTP requests and responses: `order` and `dups`. @@ -61,7 +67,7 @@ The `order` parameter allows the client to indicate its preference for a specific block order in the CAR response, and the `dups` parameter specifies whether duplicate blocks are allowed in the response. -A Client SHOULD sent `Accept` HTTP header to leverage content type negotiation +A Client SHOULD send `Accept` HTTP header to leverage content type negotiation based on section 12.5.1 of :cite[rfc9110] to get the preferred response type. More details in Section 5. (CAR Responses) of :cite[trustless-gateway]. @@ -184,24 +190,16 @@ between clients and Trustless Gateways using various combinations of `order` and Relevant tests were added to [gateway-conformance](https://github.com/ipfs/gateway-conformance) test suite -in [#87](https://github.com/ipfs/gateway-conformance/pull/87). - - - -Below are CIDs, CARs, and short summary of each fixture. - -TODO: -1. a CAR with blocks for a small file in DFS order -2. a CAR with blocks for a small file with one block appearing twice - -Tests for duplicates use a fixture where a directory contains two files that -are the same. If `dups=n`, then there are no duplicates. If `dups=y`, then the -blocks of the file are sent twice, by the order they show up in the DAG. - -The same fixture is used for testing `order=dfs`. +in [#87](https://github.com/ipfs/gateway-conformance/pull/87), and include the below fixture. 
+ +- `bafybeihchr7vmgjaasntayyatmp5sv6xza57iy2h4xj7g46bpjij6yhrmy` + ([CAR](https://github.com/ipfs/gateway-conformance/raw/v0.3.0/fixtures/trustless_gateway_car/dir-with-duplicate-files.car)) + - A UnixFS directory with two files that are the same (same CID). + - If `dups=n`, then there should be no duplicate blocks in the returned CAR. + - If `dups=y`, then the blocks of the file are sent twice. + - The same fixture can be used for testing `order=dfs` and checking if blocks that belong to files arrive in the DFS order. + - It is encouraged to also test DFS order with a HAMT fixture such as `bafybeidbclfqleg2uojchspzd4bob56dqetqjsj27gy2cq3klkkgxtpn4i` + ([CAR](https://github.com/ipfs/gateway-conformance/raw/v0.3.0/fixtures/trustless_gateway_car/single-layer-hamt-with-multi-block-files.car)) ### Copyright