diff --git a/docs/.vuepress/config.js b/docs/.vuepress/config.js index f50bb6e39..08485dd16 100644 --- a/docs/.vuepress/config.js +++ b/docs/.vuepress/config.js @@ -221,6 +221,15 @@ module.exports = { '/how-to/publish-ipns' ] }, + { + title: 'IPFS Gateway', + sidebarDepth: 1, + collapsable: true, + children: [ + '/how-to/gateway-best-practices', + '/how-to/gateway-troubleshooting' + ] + }, { title: 'IPFS Companion', sidebarDepth: 1, diff --git a/docs/concepts/ipfs-gateway.md b/docs/concepts/ipfs-gateway.md index 6f42ac549..73bc7d234 100644 --- a/docs/concepts/ipfs-gateway.md +++ b/docs/concepts/ipfs-gateway.md @@ -11,28 +11,46 @@ related: # IPFS Gateway -This document discusses: +An _IPFS gateway_ provides an HTTP-based service that allows IPFS-incompatible browsers, tools and software to access IPFS content. For example, some browsers or tools like [Curl](https://curl.haxx.se/) or [Wget](https://www.gnu.org/software/wget/) don't support IPFS natively and cannot access IPFS content using canonical addressing like `ipfs://{CID}/{optional path to resource}`. While tools like [IPFS Companion](https://github.com/ipfs-shipyard/ipfs-companion) add browser support for native IPFS URLs, this is not always an option. As such, there are multiple gateway types and gateway providers available so that applications of all kinds can interface with IPFS using HTTP. +This page discusses: + +- The IPFS gateway request lifecycle. - The several types of gateways. - Gateway role in the use of IPFS. -- Appropriate situations for the use of gateways. -- Situations when you should avoid the use of gateways. -- Implementation guidelines. -You should read this document if you want to: +## Gateway request lifecycle -- Understand, at a conceptual level, how gateways fit into the overall use of IPFS. -- Decide whether and what type of gateways to employ for your use case. -- Understand, at a conceptual level, how to deploy gateways for your use case. 
+:::callout +This section uses the _default_ gateway request lifecycle of [IPFS Kubo](https://github.com/ipfs/kubo) to introduce the basic concepts in the lifecycle. However, some gateways only serve content that they have and/or want to provide. For example, a Kubo gateway with `NoFetch` enabled will not attempt to retrieve content from the network. +::: -## Overview +When a client request for a CID reaches an IPFS gateway, the gateway first checks whether the CID is cached locally. At this point, one of the following occurs: -IPFS deployment seeks to include native support of IPFS in all popular browsers and tools. Gateways provide workarounds for applications that do not yet support IPFS natively. For example, errors occur when a browser that does not support IPFS attempts access to IPFS content in the canonical form of `ipfs://{CID}/{optional path to resource}`. Other tools that rely solely on HTTP encounter similar errors in accessing IPFS content in canonical form, such as [Curl](https://curl.haxx.se/) and [Wget](https://www.gnu.org/software/wget/). +- **If the CID is cached locally**, the gateway responds with the content referred to by the CID, and the lifecycle is complete. -Tools like [IPFS Companion](https://github.com/ipfs-shipyard/ipfs-companion) resolve these content access errors. However, not every user has permission to alter — or be capable of altering — their computer configuration. IPFS gateways provide an HTTP-based service that allows IPFS-ignorant browsers and tools to access IPFS content. +- **If the CID is not in the local cache**, the gateway will attempt to retrieve it from the network. -## Gateway providers +The CID retrieval process is composed of two parts, content discovery / routing and content retrieval: + +1. 
In the **content discovery / routing** step, the gateway determines provider location, that is, _where_ the data specified by the CID can be found, by doing the following: + + - Asking peers that it is directly connected to whether they have the data specified by the CID. + - Querying the DHT for the IDs and network addresses of peers that have the data specified by the CID. + +2. Next, the gateway performs **content retrieval**, which can be broken into the following steps: + + 1. The gateway connects to the provider. + 1. The gateway fetches the CID's content. + 1. The gateway streams the content to the client. +:::callout +- Learn more about content discovery, routing, retrieval and the subsystems involved in each part of the process in [How IPFS works](./how-ipfs-works.md). +- Dive into the technical specifications for gateways in the [IPFS HTTP Gateways specification](https://specs.ipfs.tech/http-gateways/) page. +::: + +## Gateway providers + Regardless of who deploys a gateway and where, any IPFS gateway resolves access to any requested IPFS [content identifier](content-addressing.md). Therefore, for best performance, when you need the service of a gateway, you should use the one closest to you. ### Your local gateway @@ -50,26 +68,26 @@ A gateway behind a firewall represents just one potential location for a private Public gateway operators include: - Protocol Labs, which deploys the public gateway `https://ipfs.io`. -- Third-party public gateways. E.g., `https://cf-ipfs.com`. +- Third-party public gateways, such as `https://cf-ipfs.com`. Protocol Labs maintains a [list of public gateways](https://ipfs.github.io/public-gateway-checker/) and their status. -![A list of public gateways and their status, available on IPFS](./images/ipfs-gateways/public-gateway-checker.png) - ## Gateway types -Categorizing gateways involves several dimensions: +:::warning +[Path resolution style gateways](#path) do not provide origin isolation. 
+::: -- [Read/write support](#read-only-and-writeable-gateways) +There are multiple gateway types, each with specific use case, security, performance, and functional implications. + +- [Read support](#read-only-gateways) - [Authentication support](#authenticated-gateways) - [Resolution style](#resolution-style) - [Service](#gateway-services) -Choosing the form of gateway usage has security, performance, and other functional implications. - -### Read-only and writeable gateways +### Read-only gateways -The examples discussed in the earlier sections above illustrated the use of read-only HTTP gateways to fetch content from IPFS via an HTTP GET method. _Writeable_ HTTP gateways also support `POST`, `PUT`, and `DELETE` methods. +_Read-only gateways_ are the simplest kind of gateway. This gateway type provides a way to fetch IPFS content using the HTTP GET method. ### Authenticated gateways @@ -139,98 +157,13 @@ Currently HTTP gateways may access both IPFS and IPNS services: | IPNS | subdomain | `https://{IPNS identifier}.ipns.{gatewayURL}/{optional path to resource}` | | IPNS | DNSLink | Useful when IPNS identifier is a domain:
`https://{example.com}/{optional path to resource}` **preferred**, or
`https://{gateway URL}/ipns/{example.com}/{optional path to resource}` | -### Which type to use - -The preferred form of gateway access varies depending on the nature of the targeted content. - -| Target | Preferred gateway type | Canonical form of access
features & considerations | -| ----------------------------------------------- | ---------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| Current version of
potentially mutable root | IPNS subdomain | `https://{IPNS identifier}.ipns.{gatewayURL}/{optional path to resource}`
+ supports cross-origin security
+ supports cross-origin resource sharing
+ suitable for both domain IPNS names (`{domain.tld}`) and hash IPNS names | -| | IPFS DNSLink | `https://{example.com}/{optional path to resource}`
+ supports cross-origin security
+ supports cross-origin resource sharing
– requires DNS update to propagate change to root content
• DNSLink, not user/app, specifies the gateway to use, opening up potential gateway trust and congestion issues | -| Immutable root or
content | IPFS subdomain | `https://{CID}.ipfs.{gatewayURL}/{optional path to resource}`
+ supports cross-origin security
+ supports cross-origin resource sharing | - -Any form of gateway provides a bridge for apps without native support of IPFS. Better performance and security results from native IPFS implementation within an app. - -## When not to use a gateway - -### Delay-sensitive applications - -Any gateway introduces a delay in completing desired actions because the gateway acts as an intermediary between the source of the request and the IPFS node or nodes capable of returning the desired content. If the serving gateway cached the requested content earlier (e.g., due to previous requests), then the cache eliminates this delay. - -Overuse of a gateway also introduces delays due to queuing of requests. - -When dealing with delay-sensitive processes, you should aim to use a native IPFS node within the app (fastest), or as a local service daemon (almost as fast). Failing that, use a gateway installed as a local service. Note that when an IPFS node runs locally, it includes a gateway at `http://127.0.0.1:8080`. - -All time-insensitive processes can be routed through public/private gateways. - -### End-to-end cryptographic validation required - -Because of third-party gateway vulnerabilities, apps requiring end-to-end validation of content read/write should avoid gateways when possible. If the app must employ an external gateway, such apps should use `ipfs.io` or a trusted third-party. - -## Limitations and potential workarounds - -### Centralization - -Use of a gateway requires location-based addressing: `https://{gatewayURL}/ipfs/{CID}/{etc}` All too easily, the gateway URL can become the handle by which users identify the content; i.e., the uniform reference locator (URL) equates (improperly) to the uniform reference identifier (URI). Now imagine that the gateway goes offline or cannot be reached from a different user's location because of firewalls. 
At this moment, content improperly identified by that gateway-based URL also appears unreachable, defeating a key benefit of IPFS: decentralization. - -Similarly, the use of DNSLink resolution with `Alias` forces requests through the domain's chosen gateway, as specified in the `dnslink={value}` string within the DNS TXT record. If the specified gateway becomes overloaded, goes offline, or becomes compromised, all traffic with that content becomes deleted, disabled, or suspect. - -### Misplaced trust - -Trusting a specific gateway, in turn, requires you to trust the gateway's issuing Certificate Authorities and the security of the public key infrastructure employed by that gateway. Compromised certificate authorities or public-key infrastructure implementations may undermine the trustworthiness of the gateway. - -### Violation of same-origin policy - -To prevent one website from improperly accessing HTTP session data associated with a different website, the [same-origin policy](https://en.wikipedia.org/wiki/Same-origin_policy) permits script access only to pages that share a common domain name and port. - -Consider two web pages stored in IPFS: `ipfs://{CID A}/{webpage A}` and `ipfs://{CID B}/{webpage B}`. Code on `webpage A` should not access data from `webpage B`, as they do not share the same content ID (origin). - -A browser employing one gateway to access both sites, however, might not enforce that security policy. From that browser's perspective, both webpages share a common origin: the gateway as identified in the URL `https://{gatewayURL}/...`. - -The use of subdomain gateways avoids violating the same-origin policy. In this situation, the gateway's reference to the two webpages becomes: - -```bash -https://{CID A}.ipfs.{gatewayURL}/{webpage A} -https://{CID B}.ipfs.{gatewayURL}/{webpage B} -``` - -These pages do not share the same origin. Similarly, the use of DNSLink gateway avoids violating the same-origin policy. 
The [IPFS public gateway checker](https://ipfs.github.io/public-gateway-checker/) identifies those public gateways that avoid violating the same-origin policy. - -### Cross-origin resource sharing (CORS) - -[CORS](https://web.archive.org/web/20200418003728/https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS#The_HTTP_response_headers) allows a webpage to permit access to specified data by pages with a different origin. The [IPFS public gateway checker](https://ipfs.github.io/public-gateway-checker/) identifies those public gateways that support CORS. - -### Gateway man-in-the-middle vulnerability - -Employing a public or private HTTP gateway sacrifices end-to-end cryptographic validation of the delivery of the correct content. Consider the case of a browser fetching content with the URL `https://ExampleGateway.com/ipfs/{cid}`. A compromised `ExampleGateway.com` provides man-in-the-middle vulnerabilities, including: - -- Substituting false content in place of the actual content retrieved via the CID. -- Diverting a copy of the query and response, as well as the IP address of the querying browser, to a third party. - -A compromised writeable gateway may inject falsified content into the IPFS network, returning a CID which the user believes to refer to the true content. For example: - -1. Alice posts a balance of `123.54` to a compromised writable gateway. -1. The gateway is currently storing a balance of `0.00`, so it returns the CID of the falsified content to Alice. -1. Alice gives the falsified content CID to Bob. -1. Bob fetches the content with this CID and cryptographically validates the balance of `0.00`. - -To partially address this exposure, you may wish to use the public gateway [cf-ipfs.com](https://cf-ipfs.com) as an independent, trusted reference with both same-origin policy and CORS support. 
- -### Assumed filenames when downloading files - -When downloading files, browsers will usually guess a file's filename by looking at the last component of the path, e.g., `https://{domainName/tld}/{path}/userManual.pdf` downloads a file stored locally with the name `userManual.pdf`. Unfortunately, when linking directly to a file with no containing directory in IPFS, the CID becomes the final component. Storing the downloaded file with the filename set to the CID fails the human-friendly design test. - -To work around this issue, you can add a `?filename={filename.ext}` parameter to your query string to preemptively specify a name for the locally-stored downloaded file: +## Working with gateways -| Style | Query | -| --------- | --------------------------------------------------------------------------------------------------------------------------------------------------------- | -| Path | `https://{gatewayURL}/ipfs/{CID}/{optional path to resource}?filename={filename.ext}` | -| Subdomain | `https://{CID}.ipfs.{gatewayURL}/{optional path to resource}?filename={filename.ext}` | -| DNSLink | `https://{example.com}/{optional path to resource}` or
`https://{gatewayURL}/ipns/{example.com}/{optional path to resource}?filename={filename.ext}` | +For more information on working with gateways, see [best practices](../how-to/gateway-best-practices.md) and [troubleshooting](../how-to/gateway-troubleshooting.md). -### Stale caches +## Implementing using the spec -A gateway may cache DNSLinks from DNS TXT records, which default to a one-hour lifetime. After content changes, cached DNSLinks continue to refer to the now-obsolete CID. To limit the delivery of obsolete cached content, the domain operator should change the DNS record's time-to-live parameter to a minute `60`. +If you would like to read the technical specifications for the various gateway types, and learn more about how to implement a gateway, see the [IPFS HTTP Gateways specification](https://specs.ipfs.tech/http-gateways/) page for more information. ## Frequently asked questions (FAQs) @@ -278,4 +211,4 @@ No. The ipfs.io gateway is one of many portals used to view content stored by th - [A Practical Explainer for IPFS Gateways – Part 1](https://blog.ipfs.tech/2022-06-09-practical-explainer-ipfs-gateways-1/), [Part 2](https://blog.ipfs.tech/2022-06-30-practical-explainer-ipfs-gateways-2/) - [Kubo: Gateway configuration options](https://github.com/ipfs/kubo/blob/master/docs/config.md#gateway) -- [Gateway specifications](https://github.com/ipfs/specs/blob/main/http-gateways/#readme) \ No newline at end of file +- [IPFS HTTP Gateways specification](https://specs.ipfs.tech/http-gateways/) \ No newline at end of file diff --git a/docs/how-to/gateway-best-practices.md b/docs/how-to/gateway-best-practices.md new file mode 100644 index 000000000..48221e48b --- /dev/null +++ b/docs/how-to/gateway-best-practices.md @@ -0,0 +1,108 @@ +--- +title: Best practices +description: Learn best practices for working with IPFS HTTP Gateways +--- + +# Best practices for HTTP Gateways + +Various best practices for the use of IPFS gateways are listed below. 
To learn more about the concepts behind IPFS gateways, including how they work, available providers, types and FAQs, see [IPFS Gateway](../concepts/ipfs-gateway.md). For troubleshooting information, see [Troubleshooting](./gateway-troubleshooting.md). + +## Selecting a gateway type to use + +The preferred form of gateway access varies depending on the nature of the targeted content. Learn more about each gateway type and how it works [here](../concepts/ipfs-gateway.md#gateway-types). + +| Target | Preferred gateway type | Canonical form of access
features & considerations | +| ----------------------------------------------- | ---------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| Current version of
potentially mutable root | IPNS subdomain | `https://{IPNS identifier}.ipns.{gatewayURL}/{optional path to resource}`
+ supports cross-origin security
+ supports cross-origin resource sharing
+ suitable for both domain IPNS names (`{domain.tld}`) and hash IPNS names | +| | IPFS DNSLink | `https://{example.com}/{optional path to resource}`
+ supports cross-origin security
+ supports cross-origin resource sharing
– requires DNS update to propagate change to root content
• DNSLink, not user/app, specifies the gateway to use, opening up potential gateway trust and congestion issues | +| Immutable root or
content | IPFS subdomain | `https://{CID}.ipfs.{gatewayURL}/{optional path to resource}`
+ supports cross-origin security
+ supports cross-origin resource sharing | + +Any form of gateway provides a bridge for apps without native support of IPFS. Better performance and security results from native IPFS implementation within an app. + +## Self-hosting a gateway + +If you are running an IPFS node that is also configured as an IPFS gateway, each of the tips below can help improve the discovery and retrievability of your CIDs. + +- Pin your CIDs to multiple IPFS nodes to ensure reliable availability and resilience to failures of nodes and network partitions. To make pinning easier, use the vendor-agnostic [Pinning Service OpenAPI Specification](https://ipfs.github.io/pinning-services-api-spec/) that is already [supported by many IPFS node implementations, client libraries, and existing pinning services](https://github.com/ipfs/pinning-services-api-spec#adoption). +- Use a custom domain that you control as your IPFS gateway for flexibility in implementing performance optimizations. You can do this using one of the following methods: + - Point a domain you control like `mydomain.ipfs.yourdomain.io` to a reverse proxy like nginx, which will proxy requests to a public gateway, allowing you to switch public gateways if there's downtime. + - Use a service like [Cloudflare workers](https://workers.cloudflare.com/) or [Fastly Compute@Edge](https://www.fastly.com/products/edge-compute) to implement a lightweight reverse proxy to a gateway. +- Set up [peering](./peering-with-content-providers.md) with the pinning services that pin your CIDs. +- Make sure that your node is publicly reachable. + - You can check the reachability of your node by running `ipfs id` and checking for the `/ipfs/kad/1.0.0` value in the list of protocols (or, in one command, by running `ipfs id | grep ipfs\/kad`). + - If your node is not reachable because you are behind NAT, see the [NAT configuration](https://docs.ipfs.tech/how-to/nat-configuration/#ipv6) docs. 
+- Ensure that you are correctly returning HTTP cache headers to the client if the IPFS gateway node is behind a reverse proxy. Pay extra attention to the `Etag`, `Cache-Control`, and `Last-Modified` headers. Consider leveraging the list of CIDs in `X-Ipfs-Roots` for smarter HTTP caching strategies. +- Put a CDN like Cloudflare in front of the IPFS gateway. +- Consider enabling the [Accelerated DHT Client](https://github.com/ipfs/go-ipfs/blob/master/docs/experimental-features.md#accelerated-dht-client). +- Test and monitor your internet connection speed with a tool like [Speedtest CLI](https://www.speedtest.net/apps/cli). +- Monitor disk I/O with a tool like [iotop](https://linux.die.net/man/1/iotop) or [iostat](https://linux.die.net/man/1/iostat), and make sure that no other processes are causing disk I/O bottlenecks. + + + +## Avoiding centralization + +Use of a gateway requires location-based addressing: `https://{gatewayURL}/ipfs/{CID}/{etc}`. All too easily, the gateway URL can become the handle by which users identify the content; i.e., the uniform resource locator (URL) equates (improperly) to the uniform resource identifier (URI). Now imagine that the gateway goes offline or cannot be reached from a different user's location because of firewalls. At this moment, content improperly identified by that gateway-based URL also appears unreachable, defeating a key benefit of IPFS: decentralization. + +Similarly, the use of DNSLink resolution with `Alias` forces requests through the domain's chosen gateway, as specified in the `dnslink={value}` string within the DNS TXT record. If the specified gateway becomes overloaded, goes offline, or becomes compromised, all traffic for that content can be lost, blocked, or rendered suspect. 
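One way to keep a gateway URL from becoming the permanent handle for a piece of content is to store the native address and only attach a gateway at request time. The following is a minimal sketch, assuming the path and subdomain URL shapes described in this documentation; the function name `gateway_url_to_native` and the CID placeholders are hypothetical, and query strings are deliberately dropped.

```python
from urllib.parse import urlsplit

NAMESPACES = ("ipfs", "ipns")


def gateway_url_to_native(url: str) -> str:
    """Convert a path- or subdomain-style gateway URL to ipfs:// or ipns://.

    Hypothetical helper for illustration. DNSLink-style URLs carry no
    namespace marker, so they are rejected with ValueError.
    """
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]

    if len(segments) >= 2 and segments[0] in NAMESPACES:
        # Path style: https://{gatewayURL}/ipfs/{CID}/{optional path}
        namespace, root, rest = segments[0], segments[1], segments[2:]
    else:
        # Subdomain style: https://{CID}.ipfs.{gatewayURL}/{optional path}
        # Note: urlsplit lowercases the hostname, which is fine for the
        # case-insensitive CID encodings used in subdomains.
        labels = (parts.hostname or "").split(".")
        if len(labels) >= 3 and labels[1] in NAMESPACES:
            namespace, root, rest = labels[1], labels[0], segments
        else:
            raise ValueError(f"not a recognized gateway URL: {url}")

    native = f"{namespace}://{root}"
    if rest:
        native += "/" + "/".join(rest)
    return native
```

Storing the `ipfs://` form means a different gateway (or a native client) can be swapped in later without rewriting saved links.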
+ +## Use subdomain gateway resolution for origin isolation + +To prevent one website from improperly accessing HTTP session data associated with a different website, the [same-origin policy](https://en.wikipedia.org/wiki/Same-origin_policy) permits script access only to pages that share a common scheme, hostname, and port. + +Consider two CIDs, each representing a different website accessed with the path resolution style: + - `https://ipfs.io/ipfs/{CID A}/{website A}` + - `https://ipfs.io/ipfs/{CID B}/{website B}` + +Because both URLs have the same origin (scheme, hostname, and port), the same-origin policy does not isolate the two websites from each other. From the browser's perspective, both pages share a common origin: the gateway as identified in the URL `https://{gatewayURL}/...`. + +To ensure the security provided by the same-origin policy, use the subdomain gateway. In this situation, the gateway's reference to the two websites becomes: + +```bash +https://{CID A}.ipfs.{gatewayURL}/{website A} +https://{CID B}.ipfs.{gatewayURL}/{website B} +``` + +These pages do not share the same origin. Similarly, the use of a DNSLink gateway avoids violating the same-origin policy. The [IPFS public gateway checker](https://ipfs.github.io/public-gateway-checker/) identifies those public gateways that avoid violating the same-origin policy. + +## Cross-origin resource sharing (CORS) + +[CORS](https://web.archive.org/web/20200418003728/https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS#The_HTTP_response_headers) allows a webpage to permit access to specified data by pages with a different origin. The [IPFS public gateway checker](https://ipfs.github.io/public-gateway-checker/) identifies those public gateways that support CORS. 
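To make the origin comparison from the origin isolation section concrete, the sketch below computes the (scheme, hostname, port) tuple that browsers use as an origin and compares path-style and subdomain-style URLs. It is an illustration only: the helper names are hypothetical, `gateway.example` and the `cid-a` / `cid-b` labels are placeholders, and real browsers apply additional rules that are not modeled here.

```python
from urllib.parse import urlsplit

# Default ports implied by the scheme when the URL does not state one.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> tuple:
    """Return the (scheme, hostname, port) tuple for a URL."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def same_origin(url_a: str, url_b: str) -> bool:
    """True when both URLs resolve to the same origin tuple."""
    return origin(url_a) == origin(url_b)


# Path style: both sites live under the gateway's own origin.
path_a = "https://ipfs.io/ipfs/cid-a/index.html"
path_b = "https://ipfs.io/ipfs/cid-b/index.html"

# Subdomain style: each CID gets its own hostname, hence its own origin.
sub_a = "https://cid-a.ipfs.gateway.example/index.html"
sub_b = "https://cid-b.ipfs.gateway.example/index.html"

print(same_origin(path_a, path_b))  # path style shares one origin
print(same_origin(sub_a, sub_b))    # subdomain style does not
```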
+ +## Gateway man-in-the-middle vulnerability + +Employing a public or private HTTP gateway sacrifices end-to-end cryptographic validation of the delivery of the correct content. Consider the case of a browser fetching content with the URL `https://ExampleGateway.com/ipfs/{cid}`. A compromised `ExampleGateway.com` provides man-in-the-middle vulnerabilities, including: + +- Substituting false content in place of the actual content retrieved via the CID. +- Diverting a copy of the query and response, as well as the IP address of the querying browser, to a third party. + +A compromised writeable gateway may inject falsified content into the IPFS network, returning a CID which the user believes to refer to the true content. For example: + +1. Alice posts a balance of `123.54` to a compromised writable gateway. +1. The gateway is currently storing a balance of `0.00`, so it returns the CID of the falsified content to Alice. +1. Alice gives the falsified content CID to Bob. +1. Bob fetches the content with this CID and cryptographically validates the balance of `0.00`. + +To partially address this exposure, you may wish to use the public gateway [cf-ipfs.com](https://cf-ipfs.com) as an independent, trusted reference with both same-origin policy and CORS support. + +## Assumed filenames when downloading files + +When downloading files, browsers will usually guess a file's filename by looking at the last component of the path, e.g., `https://{domainName/tld}/{path}/userManual.pdf` downloads a file stored locally with the name `userManual.pdf`. Unfortunately, when linking directly to a file with no containing directory in IPFS, the CID becomes the final component. Storing the downloaded file with the filename set to the CID fails the human-friendly design test. 
+ +To work around this issue, you can add a `?filename={filename.ext}` parameter to your query string to preemptively specify a name for the locally-stored downloaded file: + +| Style | Query | +| --------- | --------------------------------------------------------------------------------------------------------------------------------------------------------- | +| Path | `https://{gatewayURL}/ipfs/{CID}/{optional path to resource}?filename={filename.ext}` | +| Subdomain | `https://{CID}.ipfs.{gatewayURL}/{optional path to resource}?filename={filename.ext}` | +| DNSLink | `https://{example.com}/{optional path to resource}` or
`https://{gatewayURL}/ipns/{example.com}/{optional path to resource}?filename={filename.ext}` | + +## Stale caches + +A gateway may cache DNSLinks from DNS TXT records, which default to a one-hour lifetime. After content changes, cached DNSLinks continue to refer to the now-obsolete CID. To limit the delivery of obsolete cached content, the domain operator should change the DNS record's time-to-live parameter to one minute (`60` seconds). diff --git a/docs/how-to/gateway-troubleshooting.md b/docs/how-to/gateway-troubleshooting.md new file mode 100644 index 000000000..455799391 --- /dev/null +++ b/docs/how-to/gateway-troubleshooting.md @@ -0,0 +1,144 @@ +--- +title: Troubleshooting +description: Learn how to troubleshoot common issues with IPFS HTTP Gateways +--- + +# Troubleshooting HTTP Gateways + +IPFS HTTP Gateways are an HTTP-based service allowing browsers, tools and software to retrieve IPFS content with HTTP. When using HTTP Gateways, developers may need to troubleshoot issues such as: + +- A CID not being retrievable via public IPFS gateways. +- A CID being slow to load. + +This page summarizes the different ways to troubleshoot common issues. To learn more about the concepts behind IPFS gateways, including how they work, available providers, types and FAQs, see [IPFS Gateway](../concepts/ipfs-gateway.md). + +## General advice + +Slow retrieval or timeouts while fetching a CID from an IPFS gateway are typically related to one of the following: + +- A problem with the gateway itself. +- The provider of the CID being unreachable or down. +- You (or the provider) not providing your CIDs to the IPFS network via the DHT or the network indexer, so they are not discoverable. +- Network latency between the client and the gateway, or the gateway and the provider. + +:::callout +When troubleshooting IPFS gateways, ensure that you are familiar with [how gateways work](../concepts/ipfs-gateway.md), as this will make the process quicker and easier. 
+::: + +To further narrow down the root cause, use one of the following methods: + +- If you want an automated, browser-based tool that does the majority of the diagnosing and debugging for you, use [pl-diagnose](#debug-with-pl-diagnose). +- If you are running an IPFS Kubo node, you can [manually debug using Kubo and IPFS Check](#debug-manually). + +## Debug with pl-diagnose + +The pl-diagnose tool is a browser-based software application that automates a large part of the process described in [Debug manually](#debug-manually). Specifically, this tool can help you answer these questions: + +- Is a given piece of content, identified by a certain CID, available on the IPFS network, and which peers does the DHT list as hosts for that content? +- Which addresses are listed in the DHT for a given IPFS node? +- Is an IPFS node accessible by other peers? +- Is specific content available from an IPFS node? + +To use the tool, do the following: + +1. Navigate to the [application page](https://pl-diagnose.on.fleek.co/#/diagnose/access-content). +1. In the **Backend URL** field, enter the address of the node you are trying to check. +1. In the menu, select one of the options depending on your specific needs: + + - **Is my content on the DHT?** - Given a CID on the node you are checking, determine if it is listed in the DHT. + - **Is my peer in the DHT?** - Given a public network address of a node, determine if the node is listed in the DHT. + - **Is my node accessible by other peers?** - Given a public network address of a node, determine if that node is reachable by peers. + - **Is my node serving the content?** - Determine if the node is actually serving the content. + + +## Debug manually + +This procedure assumes that you have the latest version of Kubo installed. To debug manually: + +1. Open up a terminal window. + +1. 
Using Kubo, determine if any peers are advertising the `<CID>` you are requesting: + + ```shell + ipfs routing findprovs <CID> + ``` + + **If providers are found in the DHT**, their Peer IDs are returned. Example output: + + ``` + 12D3KooWChhhfGdB9GJy1GbhghAAKCUR99oCymMEVS4eUcEy67nt + 12D3KooWJkNYFckQGPdBF57kVCLdkqZb1ZmZXAphe9MZkSh16UfP + QmQzqxhK82kAmKvARFZSkUVS6fo9sySaiogAnx5EnZ6ZmC + 12D3KooWSH5uLrYe7XSFpmnQj1NCsoiGeKSRCV7T5xijpX2Po2aT + ``` + + In this case, complete the steps described in [Providers returned](#providers-returned). + + **If no providers were returned**, the cause of your problem might be content publishing. Complete the steps described in [No providers returned](#no-providers-returned). + +### Providers returned + +If providers were found in the DHT, do the following: + +1. In the terminal, retrieve the network addresses of one of the peers returned using its `<peer-id>`: + + ```shell + ipfs id -f '<addrs>' <peer-id> + ``` + + Upon success, you'll see a list of addresses like: + + ``` + /ip4/145.40.90.155/tcp/4001/p2p/12D3KooWSH5uLrYe7XSFpmnQj1NCsoiGeKSRCV7T5xijpX2Po2aT + /ip4/145.40.90.155/tcp/4002/ws/p2p/12D3KooWSH5uLrYe7XSFpmnQj1NCsoiGeKSRCV7T5xijpX2Po2aT + /ip6/2604:1380:45e1:2700::d/tcp/4001/p2p/12D3KooWSH5uLrYe7XSFpmnQj1NCsoiGeKSRCV7T5xijpX2Po2aT + /ip6/2604:1380:45e1:2700::d/tcp/4002/ws/p2p/12D3KooWSH5uLrYe7XSFpmnQj1NCsoiGeKSRCV7T5xijpX2Po2aT + ``` + +1. Note the returned addresses, as you'll use them in step 4. +1. Navigate to [IPFS Check](https://check.ipfs.network/). +1. Enter the following information: + - In the **CID** field, enter the `<CID>` you are requesting. + - In the **Multiaddr** field, enter one of the peer addresses noted in step 2. +1. Click **Run Test**. + + If the test is unsuccessful, complete the steps described in [No providers returned](#no-providers-returned). 
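The provider lookup at the start of the manual procedure can also be scripted. The sketch below is a rough illustration, assuming the `ipfs` binary is on `PATH` and that `ipfs routing findprovs` prints one Peer ID per line as in the example output; the function names and the 60-second timeout are hypothetical choices, not part of Kubo.

```python
import subprocess
from typing import List


def parse_peer_ids(output: str) -> List[str]:
    """Split command output into Peer IDs, one per non-empty line."""
    return [line.strip() for line in output.splitlines() if line.strip()]


def find_providers(cid: str, timeout_s: float = 60.0) -> List[str]:
    """Run `ipfs routing findprovs <CID>` and return the Peer IDs found.

    Assumes a local Kubo installation. An empty list corresponds to the
    case where no providers are returned.
    """
    result = subprocess.run(
        ["ipfs", "routing", "findprovs", cid],
        capture_output=True,
        text=True,
        timeout=timeout_s,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return parse_peer_ids(result.stdout)
```

An empty result from `find_providers` points toward the content publishing checks rather than the per-peer address checks.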
+ +### No providers returned + +If no providers are returned, the issue may lie in the content publishing lifecycle, specifically _reprovide runs_, the continuous process in which a node advertises provider records. _Provider records_ are mappings of CIDs to network addresses, and have an expiration time of 24 hours, which accounts for provider churn. Generally speaking, the more files are added to an IPFS node, the longer reprovide runs take. When a reprovide run takes longer than 24 hours (the expiration time for provider records), CIDs will no longer be discoverable. + +:::callout +You can learn more about the content publishing lifecycle in [How IPFS works](../concepts/how-ipfs-works.md). +::: + +With this in mind, if no providers are returned, do the following: + +1. First, determine how long a reprovide run takes: + + ```shell + ipfs stats provide + ``` + + The output should look something like: + + ```shell + TotalProvides: 7k (7,401) + AvgProvideDuration: 271.271ms + LastReprovideDuration: 13m16.104781s + LastReprovideBatchSize: 1k (1,858) + ``` + +1. Note the value for `LastReprovideDuration`. If it is close to 24 hours, select one of the following options, keeping in mind that each has tradeoffs: + + - **Enable the [Accelerated DHT Client](https://github.com/ipfs/go-ipfs/blob/master/docs/experimental-features.md#accelerated-dht-client) in Kubo**. This configuration improves content publishing times significantly by maintaining more connections to peers, keeping a larger routing table, and batching the advertising of provider records. However, this performance boost comes at the cost of increased resource consumption. + + - **Change the reprovider strategy from `all` to either `pinned` or `roots`.** In both cases, only provider records for explicitly pinned content are advertised. Differences and tradeoffs are noted below: + - The `pinned` strategy will advertise both the root CIDs and child block CIDs (the entire DAG) of explicitly pinned content. 
+ - The `roots` strategy will only advertise the root CIDs of pinned content, reducing the total number of provides in each run. This strategy is the most efficient, but should be used with caution, as it limits discoverability to root CIDs. In other words, if you are adding folders of files to an IPFS node, only the CID for the pinned folder will be advertised. All the blocks will still be retrievable with Bitswap once a connection to the node is established. + +1. Manually trigger a reprovide run: + + ```shell + ipfs bitswap reprovide + ``` \ No newline at end of file
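The check on `LastReprovideDuration` can be automated by parsing the output of `ipfs stats provide`. The sketch below is illustrative only: it assumes the output format shown in the example above and Go-style duration strings such as `13m16.104781s`; the function names and the 80% warning threshold are hypothetical choices rather than Kubo defaults.

```python
import re

PROVIDER_RECORD_TTL_S = 24 * 3600  # provider records expire after 24 hours

_UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0,
                 "ms": 1e-3, "us": 1e-6, "µs": 1e-6, "ns": 1e-9}
# Multi-letter units are listed first so "ms" is not read as "m".
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)(ns|us|µs|ms|h|m|s)")


def parse_go_duration(text: str) -> float:
    """Convert a Go-style duration string (e.g. '13m16.1s') to seconds."""
    text = text.strip()
    tokens = _TOKEN.findall(text)
    # Reject input that contains anything besides number+unit pairs.
    if not tokens or "".join(n + u for n, u in tokens) != text:
        raise ValueError(f"unrecognized duration: {text!r}")
    return sum(float(n) * _UNIT_SECONDS[u] for n, u in tokens)


def last_reprovide_seconds(stats_output: str) -> float:
    """Extract LastReprovideDuration from `ipfs stats provide` output."""
    for line in stats_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "LastReprovideDuration":
            return parse_go_duration(value)
    raise ValueError("LastReprovideDuration not found")


def reprovide_is_at_risk(stats_output: str, threshold: float = 0.8) -> bool:
    """True when the last run used at least `threshold` of the 24h window."""
    return last_reprovide_seconds(stats_output) >= threshold * PROVIDER_RECORD_TTL_S
```

Feeding this the captured output of `ipfs stats provide` on a schedule gives an early warning before reprovide runs start to exceed the record expiration time.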