Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Garbage collection removes filestore/urlstore references #5765

Closed
whyrusleeping opened this issue Nov 11, 2018 · 18 comments
Closed

Garbage collection removes filestore/urlstore references #5765

whyrusleeping opened this issue Nov 11, 2018 · 18 comments
Labels
kind/enhancement A net-new feature or improvement to an existing feature

Comments

@whyrusleeping
Copy link
Member

There should be an option for garbage collection to not collect filestore/urlstore references, theyre pretty small, and some users would probably want to keep that around while clearing up the other cached content.

One caveat here is that if we have the intermediate nodes of something added via the filestore, if we're not garbage collecting the filestore stuff, we shouldnt be garbage collecting those intermediate nodes.

@whyrusleeping
Copy link
Member Author

Also, maybe tangential, but seems like it could help: we should keep a mapping of the urls/filepaths we've added -> the resultant hashes. This would make using the urlstore a bit easier for some users (makes it easier to check if ipfs already knows about a given url) and also makes the above 'pinning' of filestore stuff easier.

@parkan
Copy link

parkan commented Nov 13, 2018

does this happen because pins are at the block level, so if you're not storing any blocks there's nothing to pin and the object gets GC'd? that sort of makes sense but definitely isn't the behavior I'd expect

if this is hard to fix we should at minimum update the docs, e.g. --nocopy should also say Implies --pin false

@kevina kevina added the kind/enhancement A net-new feature or improvement to an existing feature label Nov 30, 2018
@mitra42
Copy link

mitra42 commented Dec 2, 2018

Any ideas on fixing this @whyrusleeping - since you found the code where the problem was happening, I'd hoped it would be a quick fix.

@Stebalien
Copy link
Member

I think there's a bit of a disconnect here. This is issue is about an enhancement, not a bug. The issue here is: Filestore references are tiny, we (optionally) shouldn't GC them even if they're not pinned because keeping them won't really waste much space.

This issue isn't about us garbage collecting pinned content (with or without the filestore). If you're seeing pinned content getting GCed, that's a bug; please file a new report.

@Stebalien
Copy link
Member

Hm. Reading the linked conversation, it looks like you and @whyrusleeping may also have been talking past each-other. Are you pinning these files?

@mitra42
Copy link

mitra42 commented Dec 11, 2018

@whyrusleeping said that urlstore automatically pinned things, and I believe his analysis was that this wasn't the problem.

Our problem is simple ... we have a process for adding a URL via the HTTP API to a local IPFS go instance (with arguments as provided by @parkan)
/api/v0/urlstore/add?arg:URL&trickle:true
which returns a IPFS hash that we then store and share in our metadata as an alternative to access to http to access the file.

Then at some random point in the future all these references become invalid.

The comments above about GC, Intermediate blocks, and Pinning are just analysis by various people as to WHY the bug is happening.

By the way ... this bug is serious enough that when you combine IPFS losing data with being unable to tell what has been lost; and clients not getting errors when trying to retrieve has meant we've had to turn IPFS off for most purposes because there is no way to reliably use urlstore and retrieve the references stored.

@Stebalien
Copy link
Member

said that urlstore automatically pinned things, and I believe his analysis was that this wasn't the problem.

Looks like that analysis was incorrect. ipfs add --nocopy pins but ipfs urlstore add doesn't (see the ipfs urlstore add --help). However, it really should pin by default:

@mitra42
Copy link

mitra42 commented Dec 11, 2018

And to be clear ...I'm assuming that "pinning" in this case it means that urlstore will keep the reference, and not that urlstore will make a copy (and pin) the content.

@parkan
Copy link

parkan commented Dec 11, 2018

thanks @Stebalien! my head kind of hurts after trying to understand the semantics of the various node types even w/o filestore/urlstore, so I cannot be super helpful here

the only thing that gives me pause w.r.t. your diagnosis is that the repo in this case appears to sometimes lose normally added items (thumbnail images) as well, but maybe these are two separate issues

@mitra42 yes, pinning would mean that the reference is retained (i.e. the "repo loses items" problem stops, at least for urlstore objects)

@Stebalien
Copy link
Member

And to be clear ...I'm assuming that "pinning" in this case it means that urlstore will keep the reference, and not that urlstore will make a copy (and pin) the content.

Yes. When adding urlstore/filestore content, we create "fake" pointer blocks and "pin" those.

the only thing that gives me pause w.r.t. your diagnosis is that the repo in this case appears to sometimes lose normally added items (thumbnail images) as well, but maybe these are two separate issues

Are these being added via a simple ipfs add? If so, that's a different issue.

@mitra42
Copy link

mitra42 commented Dec 11, 2018

If we can fix the urlstore items getting lost then we might be able to diagnose them separately. That could have been a different issue, and might be fixed now.

@whyrusleeping
Copy link
Member Author

Let's make sure to test our assumptions here. Once we merge this, and get the archive node running it, we should try pulling a few things through the urlstore, and then explicitly running a gc and making sure that doesnt cause trouble.

One other potential issue is if @mitra42 is using the http api, then maybe we should be explicitly passing the pin=true option.

@Stebalien
Copy link
Member

One other potential issue is if @mitra42 is using the http api, then maybe we should be explicitly passing the pin=true option.

This shouldn't be necessary. If it is, that's another bug.

@mitra42
Copy link

mitra42 commented Dec 11, 2018

I'm more than happy to add pin=true to the Http API if that is what is required ... but people keep saying it shouldnt be necessary.

@Stebalien
Copy link
Member

It's not necessary, I've just checked.

@mitra42
Copy link

mitra42 commented Dec 12, 2018

You mean with the fix in the PR commented above ? If so then do you have any idea how long before that fix hits a new stable version?

@Stebalien
Copy link
Member

Yes. Currently, the only way to do this is to use ipfs urlstore add followed by an ipfs pin (hopefully without a GC in between...).

If so then do you have any idea how long before that fix hits a new stable version?

No idea. I'd like to get some of our bitswap improvements in before the next release, if possible, but we won't hold the release for them.

@Stebalien
Copy link
Member

Fixed in 0.4.19.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/enhancement A net-new feature or improvement to an existing feature
Projects
None yet
Development

No branches or pull requests

5 participants