Validate sharded directory structure in ipfs ls #8196

Open
Stebalien opened this issue Jun 16, 2021 · 2 comments
Labels
kind/bug A bug in existing code (including security flaws) need/triage Needs initial labeling and prioritization

Comments

@Stebalien
Member

Currently, when ipfs ls lists a sharded directory, it naively walks the DAG without validating the directory's internal structure. As a result, ipfs ls /ipfs/QmFoo might list a file named "bar", and ipfs get /ipfs/QmFoo/bar might then fail because the directory is malformed.

We should be validating this structure as we traverse it.
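
As a rough illustration of what validating while traversing could mean, here is a minimal, self-contained sketch on a toy model. It is not the go-ipfs or go-unixfs code: the `shard` and `link` types, the fanout of 16, the one-hex-digit slot prefix, and the use of FNV-1a from the Go standard library as the hash are all assumptions made for the example, and `listValidated`, `slotFor`, and `slotPrefix` are hypothetical names. The walk rejects a shard whose bitfield disagrees with its links, and rejects any entry whose name does not hash to the slots on the path that leads to it.

```go
package main

import (
	"errors"
	"fmt"
	"hash/fnv"
	"math/bits"
	"strconv"
)

const (
	bitsPerLevel = 4                 // toy fanout: 16 slots per shard (assumption)
	fanout       = 1 << bitsPerLevel
	prefixLen    = 1                 // one hex digit names a slot (assumption)
)

// link is a toy DAG link: a slot prefix followed by an entry name,
// or a bare slot prefix pointing at a child shard.
type link struct {
	name  string
	child *shard
}

// shard is a toy HAMT node: a bitfield of occupied slots plus its links.
type shard struct {
	bitfield uint16
	links    []link
}

// slotFor returns the slot an entry name belongs to at the given depth.
// FNV-1a is a stand-in hash chosen for this sketch only.
func slotFor(name string, depth int) int {
	h := fnv.New64a()
	h.Write([]byte(name))
	return int((h.Sum64() >> uint(depth*bitsPerLevel)) & (fanout - 1))
}

// slotPrefix renders a slot index as the hex prefix used in link names.
func slotPrefix(slot int) string { return fmt.Sprintf("%X", slot) }

// listValidated lists entry names under s, checking the structure as it goes.
// path holds the slots taken from the root down to s.
func listValidated(s *shard, path []int) ([]string, error) {
	if bits.OnesCount16(s.bitfield) != len(s.links) {
		return nil, errors.New("bitfield does not match number of links")
	}
	var names []string
	var seen uint16
	for _, l := range s.links {
		if len(l.name) < prefixLen {
			return nil, fmt.Errorf("link %q has no slot prefix", l.name)
		}
		slot, err := strconv.ParseUint(l.name[:prefixLen], 16, 8)
		if err != nil || s.bitfield&(1<<slot) == 0 || seen&(1<<slot) != 0 {
			return nil, fmt.Errorf("link %q names an invalid, unset, or repeated slot", l.name)
		}
		seen |= 1 << slot
		here := append(append([]int{}, path...), int(slot))
		if l.child != nil {
			if len(l.name) != prefixLen {
				return nil, fmt.Errorf("child shard link %q carries an entry name", l.name)
			}
			sub, err := listValidated(l.child, here)
			if err != nil {
				return nil, err
			}
			names = append(names, sub...)
			continue
		}
		entry := l.name[prefixLen:]
		if entry == "" {
			return nil, fmt.Errorf("link %q has neither a child shard nor an entry name", l.name)
		}
		// The entry must hash to every slot on the path that reaches it.
		for depth, want := range here {
			if slotFor(entry, depth) != want {
				return nil, fmt.Errorf("entry %q is stored in the wrong slot at depth %d", entry, depth)
			}
		}
		names = append(names, entry)
	}
	return names, nil
}

func main() {
	slot := slotFor("bar", 0)
	good := &shard{bitfield: 1 << slot, links: []link{{name: slotPrefix(slot) + "bar"}}}
	fmt.Println(listValidated(good, nil))

	// Same entry filed under the wrong slot: a naive walk would still list "bar",
	// but a lookup that follows the hash would never find it.
	wrong := (slot + 1) % fanout
	bad := &shard{bitfield: 1 << wrong, links: []link{{name: slotPrefix(wrong) + "bar"}}}
	fmt.Println(listValidated(bad, nil))
}
```

In this toy model the second directory is exactly the kind of case described above: a naive listing would show "bar", while a path lookup that follows the hash would fail, so the validated walk reports an error instead of listing it.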

Stebalien added the kind/bug and need/triage labels Jun 16, 2021
schomatis self-assigned this Dec 23, 2021
@schomatis
Contributor

@Stebalien

it naively walks the dag without actually validating the internal structure

(From #8072)

The real bug here is that we don't verify the HAMT structure when listing, we just blindly walk the DAG.

I'm having trouble identifying in the code what exactly these statements mean and what verifying the directory would entail. We normally call EnumLinksAsync on the dir, which would seem to imply knowing (and validating?) how the HAMT operates. The only possible scenario I can find for this issue (but please point me to a concrete example if there is one) is a HAMT directory with an incorrect UnixFS format, for example:

  1. A HAMT directory incorrectly tagged as a Basic one. This would indeed list all links in the root DAG node (actual directory entries but also intermediate HAMT shard nodes), possibly giving the incorrect behavior from the cited issue. ((*BasicDirectory).EnumLinksAsync() just lists all links and has no validation, as AFAIK there is no incorrect format for a basic dir link.)
  2. A HAMT directory incorrectly tagged as not a directory, which would follow a similar path of listing DAG links blindly.

In both cases I'm not sure how to recognize the incorrectly tagged (maybe corrupted) HAMT directory as such and avoid the above behavior, but maybe I'm misunderstanding the issue and need a concrete example. I'm having trouble finding the example directory /ipfs/QmUygZRt3uF4gco8Ff3qmRa9xpYZsodhijPPVD2XmubBLr/ of the original issue (getting 504 timeouts).
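
To make the first scenario concrete, here is a small self-contained sketch on a toy model, not the go-unixfs implementation. The `node` type, the `listNames` function, the two-hex-digit slot prefix, and the link names are all hypothetical and chosen only for illustration. It shows how the same raw links produce different listings depending solely on the type tag: read as a shard, the prefix is stripped and shard links are skipped; read as a basic directory, every link name is returned verbatim, including the one that points to an intermediate shard.

```go
package main

import "fmt"

type dirType int

const (
	basicDir dirType = iota
	hamtShard
)

// prefixLen is the assumed width of the slot prefix on shard link names.
const prefixLen = 2

// node is a toy directory node: a type tag plus raw DAG link names.
type node struct {
	typ   dirType
	links []string
}

// listNames returns the entry names a lister would report for n,
// trusting the type tag and doing no structural validation.
func listNames(n node) []string {
	out := []string{}
	for _, l := range n.links {
		switch n.typ {
		case basicDir:
			// Every link is taken to be a directory entry.
			out = append(out, l)
		case hamtShard:
			// A name longer than the prefix is an entry; a bare prefix is a
			// child shard, which a real lister would recurse into (omitted here).
			if len(l) > prefixLen {
				out = append(out, l[prefixLen:])
			}
		}
	}
	return out
}

func main() {
	links := []string{"3Abar", "F1"} // one entry and one child shard link

	fmt.Println(listNames(node{typ: hamtShard, links: links}))
	fmt.Println(listNames(node{typ: basicDir, links: links}))
}
```

With the basic tag, the listing contains names that no path lookup would resolve as the user expects, which matches the list-then-fail behavior from the cited issue. The sketch does not attempt to detect the mis-tagging itself, which is the open question above.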

@BigLep BigLep added this to the Best Effort Track milestone Mar 5, 2022
@BigLep
Contributor

BigLep commented Mar 11, 2022

2022-03-11 conversation: we're less keen on working on items like this because we would need to do the work twice (go-ipld-prime, legacy unixfs code).
