Skip to content
This repository has been archived by the owner on Mar 29, 2023. It is now read-only.

Multipart iterator consumes its internal reader #53

Closed
schomatis opened this issue Jul 26, 2022 · 3 comments · Fixed by #54
Closed

Multipart iterator consumes its internal reader #53

schomatis opened this issue Jul 26, 2022 · 3 comments · Fixed by #54
Assignees
Labels
kind/bug A bug in existing code (including security flaws) need/triage Needs initial labeling and prioritization

Comments

@schomatis
Copy link
Contributor

multipartDirectory iterator Entries() directly access its internal state (instead of creating its own)

func (f *multipartDirectory) Entries() DirIterator {
return &multipartIterator{f: f}
}

and consumes from the directory's internal multipart.Reader

var part *multipart.Part
for {
part, it.err = it.f.walker.getPart()

hence we can only iterate the directory once.

This is not the expected behavior of a directory iterator. That said, in practice we normally iterate inputs once in go-ipfs, and I'm targeting an exception for ipfs/kubo#8927. Acknowledging this is not a serious bug my question then would be: which path to choose from?

a. Leave it be and create a special case for this directory type downstream.
b. Attempt to fix this, if possible without adding much more complexity. (I'd want to avoid copying everything to our own buffer and iterating that.)

@schomatis schomatis added kind/bug A bug in existing code (including security flaws) need/triage Needs initial labeling and prioritization labels Jul 26, 2022
@welcome
Copy link

welcome bot commented Jul 26, 2022

Thank you for submitting your first issue to this repository! A maintainer will be here shortly to triage and review.
In the meantime, please double-check that you have provided all the necessary information to make this process easy! Any information that can help save additional round trips is useful! We currently aim to give initial feedback within two business days. If this does not happen, feel free to leave a comment.
Please keep an eye on how this issue will be labeled, as labels give an overview of priorities, assignments and additional actions requested by the maintainers:

  • "Priority" labels will show how urgent this is for the team.
  • "Status" labels will show if this is ready to be worked on, blocked, or in progress.
  • "Need" labels will indicate if additional input or analysis is required.

Finally, remember to use https://discuss.ipfs.io if you just need general support.

@schomatis
Copy link
Contributor Author

From @lidel:

the described state of things seems to be a 5 year old, probably used all over Filecoin (Venus, Lotus) by now. You could try fixing upstream, but if it ends up being a breaking or performance impacting change, may be safer to leave it and handle it in Kubo instead.

My plan is:

  • If I can extract the files from the multipart reader without consuming them in the process I can move the multipartDirectory to a SliceFile (meaning slice directory) and return that implementation under the Directory abstraction without breaking usage nor performance. From there we have a sane iterator that can be instantiated many times to review file paths (without reading the files themselves).
  • If the multipart parts need to be read entirely to be extracted then I'll leave this and I'll just create a special case for multipart downstream or adapt feat(cmds/add): --to-files option as files cp kubo#8927 as needed.

@schomatis schomatis self-assigned this Aug 1, 2022
@schomatis
Copy link
Contributor Author

Reading Go's multipart library I don't see an easy way to do the above so I'll fix this downstream.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
kind/bug A bug in existing code (including security flaws) need/triage Needs initial labeling and prioritization
Projects
No open projects
Archived in project
Development

Successfully merging a pull request may close this issue.

1 participant