Pass cids instead of nodes around in EnumerateChildrenAsync #3598

Merged
merged 2 commits into from
Feb 14, 2017


Voker57
Contributor

@Voker57 Voker57 commented Jan 14, 2017

Saves a whole lot of RAM on todobuffer

@Voker57
Contributor Author

Voker57 commented Jan 15, 2017

Fixes #3588

}

if next == nil {
next = nc.Node
send = feed
if len(n.Links()) > 0 {
Member

This can be cleaned up a little bit. Let's add all the new links to the array first, then check for next being nil and handle that case separately.

Member

@whyrusleeping whyrusleeping left a comment

I like this approach better than mine. One small comment, and then I'd also like @Kubuxu to review.

@whyrusleeping whyrusleeping added this to the ipfs 0.4.6 milestone Jan 17, 2017
send = feed
if len(n.Links()) > 0 {
next = n.Links()[0].Cid
for _, l := range n.Links()[1:] {
Member

It might be beneficial to preallocate an array for the Cids based on the .Links() call, write the Cids to that array, and do just one append call.

This way the todobuffer will be expanded just once instead of N times.

Member

In general this isn't an issue in Go. Slices optimistically allocate extra space under the hood for you, so sequential appends are generally just as performant as preallocating.
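To make the point about sequential appends concrete, here is a minimal sketch that counts how often the backing array is actually reallocated. The helper names `countGrowths` and `appendAll` are illustrative and not part of go-ipfs, and the exact growth policy is a runtime implementation detail, so the sketch prints the count rather than assuming a value.

```go
package main

import "fmt"

// countGrowths appends n ints one at a time to an empty slice and
// reports how many times the capacity changed, i.e. how many times
// the backing array had to be reallocated.
func countGrowths(n int) int {
	var s []int
	growths := 0
	for i := 0; i < n; i++ {
		before := cap(s)
		s = append(s, i)
		if cap(s) != before {
			growths++
		}
	}
	return growths
}

// appendAll adds a whole batch with a single append call, so the
// destination grows at most once.
func appendAll(dst, batch []int) []int {
	return append(dst, batch...)
}

func main() {
	// The count is far smaller than the number of appends because
	// append over-allocates when it has to grow.
	fmt.Println("reallocations for 100000 appends:", countGrowths(100000))
	fmt.Println(appendAll([]int{1}, []int{2, 3}))
}
```

The reallocation count grows roughly logarithmically with the number of appends, which is why both review positions are defensible: a single batched append saves a few reallocations, but element-by-element appends are already amortized.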

} else {
todobuffer = append(todobuffer, nc.Node)
for _, l := range n.Links() {
todobuffer = append(todobuffer, l.Cid)
Member

Same (https://github.com/ipfs/go-ipfs/pull/3598/files#r96632953) applies here.

The allocation of the additional array will probably be removed by an optimization step in the Go compiler.

@Voker57 Voker57 force-pushed the feat/frugal-enumerate branch 4 times, most recently from c978321 to 71da359 Compare January 19, 2017 12:39
@Voker57
Contributor Author

Voker57 commented Jan 19, 2017

Addressed both reviews.

@Kubuxu
Member

Kubuxu commented Jan 19, 2017

Can you also change the base of the PR to master? The branch you are currently pointing to has already been merged.

@Voker57 Voker57 changed the base branch from feat/better-enum-async to master January 20, 2017 17:37
@Voker57
Contributor Author

Voker57 commented Jan 20, 2017

done

@Voker57 Voker57 mentioned this pull request Jan 28, 2017
- feed := make(chan node.Node)
- out := make(chan *NodeOption)
+ feed := make(chan *cid.Cid)
+ out := make(chan *cid.Cid)
  done := make(chan struct{})

  var setlk sync.Mutex

  for i := 0; i < FetchGraphConcurrency; i++ {
Member

Why are we spawning multiple of these goroutines? Looking at it, we should only need one.

Contributor Author

You mean the FetchGraphConcurrency ones? You're right, it looks like I accidentally removed concurrency altogether :) Fixed version incoming.

todobuffer = append(todobuffer, nc.Node)
links := n.Links()
if len(links) > 0 {
cids := make([]*cid.Cid, len(links), len(links))
Member

No need to specify capacity if you're already specifying length to be the same.

Member

Actually, you can rewrite this whole loop to avoid allocating an extra slice. Just append to the todobuffer.
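The two suggestions in this thread can be sketched side by side. The `link` type and the `viaTemp` and `direct` helpers below are hypothetical stand-ins using string keys instead of `*cid.Cid`, so this only illustrates the shape of the change, not the code that was merged.

```go
package main

import "fmt"

// link is a hypothetical stand-in for a node link; only the key matters here.
type link struct{ key string }

// viaTemp mirrors the reviewed shape: fill a temporary slice, then append it.
func viaTemp(todo []string, links []link) []string {
	keys := make([]string, len(links)) // capacity defaults to the length
	for i, l := range links {
		keys[i] = l.key
	}
	return append(todo, keys...)
}

// direct appends each key straight onto the buffer, with no temporary slice.
func direct(todo []string, links []link) []string {
	for _, l := range links {
		todo = append(todo, l.key)
	}
	return todo
}

func main() {
	links := []link{{"a"}, {"b"}, {"c"}}

	// make([]T, n) and make([]T, n, n) produce the same length and capacity.
	tmp := make([]string, len(links))
	fmt.Println(len(tmp) == cap(tmp))

	fmt.Println(viaTemp([]string{"z"}, links))
	fmt.Println(direct([]string{"z"}, links))
}
```

Both helpers produce the same buffer contents; the direct version simply skips the intermediate allocation per node.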

@Voker57
Contributor Author

Voker57 commented Feb 13, 2017

Modified the PR to actually be concurrent. Is the second change request obsolete now?


if unseen {
cids := make([]*cid.Cid, len(n.Links()))
for i, l := range n.Links() {
Member

This seems unnecessary. Why not just send the node down through out?

Member

My worry is that it's an extra allocation per node that we could avoid by simply moving this for loop down into the processing below.

send = feed
} else {
todobuffer = append(todobuffer, nc.Node)
case cids := <-out:
Member

If out were changed to be a channel of nodes, this could look like:

case nd := <-out:
  for _, lnk := range nd.Links() {
    if next == nil {
      next = lnk.Cid
      send = feed
      continue
    }
    todobuffer = append(todobuffer, lnk.Cid)
  }
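For readers unfamiliar with the pattern behind `next`, `send`, and `todobuffer`, the following is a self-contained sketch of the dispatch loop. It assumes a hypothetical in-memory graph keyed by strings instead of a DAG service and `*cid.Cid` values, and the `walk` function and its `inflight` counter are illustrative rather than the merged implementation. The key idea is that setting `send` to nil disables the send case in the select whenever there is nothing queued.

```go
package main

import (
	"fmt"
	"sort"
)

// walk visits every key reachable from root in graph (a hypothetical
// stand-in for fetching nodes) using the given number of worker goroutines.
func walk(graph map[string][]string, root string, workers int) []string {
	feed := make(chan string)  // dispatcher -> workers: key to fetch
	out := make(chan []string) // workers -> dispatcher: child keys found
	done := make(chan struct{})
	defer close(done) // releases idle workers when walk returns

	for i := 0; i < workers; i++ {
		go func() {
			for {
				select {
				case k := <-feed:
					select {
					case out <- graph[k]: // concurrent map reads are safe
					case <-done:
						return
					}
				case <-done:
					return
				}
			}
		}()
	}

	seen := map[string]bool{root: true}
	visited := []string{root}
	var todobuffer []string
	next, send := root, feed // send is nil when nothing is waiting to be handed out
	inflight := 0            // keys handed to workers but not yet answered

	for send != nil || inflight > 0 {
		select {
		case send <- next:
			inflight++
			if len(todobuffer) > 0 {
				next, todobuffer = todobuffer[0], todobuffer[1:]
			} else {
				send = nil // a nil channel never becomes ready in a select
			}
		case children := <-out:
			inflight--
			for _, c := range children {
				if seen[c] {
					continue
				}
				seen[c] = true
				visited = append(visited, c)
				if send == nil {
					next, send = c, feed
					continue
				}
				todobuffer = append(todobuffer, c)
			}
		}
	}
	sort.Strings(visited)
	return visited
}

func main() {
	graph := map[string][]string{"a": {"b", "c"}, "b": {"d"}, "c": {"d"}}
	fmt.Println(walk(graph, "a", 4)) // prints [a b c d]
}
```

Because the buffer holds only keys, its memory cost per pending entry is a small identifier rather than a whole decoded node, which is the saving the PR description refers to.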



errChan := make(chan error)
fetchersCtx, cancel := context.WithCancel(ctx)
Member

Put a defer cancel() after this call and then remove the other call you're making to cancel.
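A minimal sketch of the suggested shape follows. The `fetchAll` and `run` functions and the empty-key failure are hypothetical and exist only to exercise an early return; the point is that a deferred `cancel` runs on every return path, so no separate explicit call is needed.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// fetchAll is a hypothetical fetch loop: it starts one goroutine per key
// and returns the first error it sees.
func fetchAll(ctx context.Context, keys []string) error {
	errChan := make(chan error)
	fetchersCtx, cancel := context.WithCancel(ctx)
	defer cancel() // runs on every return path below

	for _, k := range keys {
		go func(k string) {
			var err error
			if k == "" {
				err = errors.New("empty key") // simulated fetch failure
			}
			select {
			case errChan <- err:
			case <-fetchersCtx.Done(): // caller has returned; stop waiting
			}
		}(k)
	}

	for range keys {
		select {
		case err := <-errChan:
			if err != nil {
				return err // deferred cancel releases the remaining goroutines
			}
		case <-ctx.Done():
			return ctx.Err()
		}
	}
	return nil
}

// run is a small convenience wrapper using a background context.
func run(keys []string) error {
	return fetchAll(context.Background(), keys)
}

func main() {
	fmt.Println(run([]string{"a", "b"}))
	fmt.Println(run([]string{"a", "", "b"}))
}
```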

License: MIT
Signed-off-by: Iaroslav Gridin <voker57@gmail.com>
License: MIT
Signed-off-by: Iaroslav Gridin <voker57@gmail.com>
@Voker57
Contributor Author

Voker57 commented Feb 14, 2017

Implemented suggested changes.

Member

@whyrusleeping whyrusleeping left a comment

LGTM. Tested locally and can confirm it's much faster already :)

@whyrusleeping whyrusleeping merged commit a2f27e1 into ipfs:master Feb 14, 2017
@Voker57 Voker57 deleted the feat/frugal-enumerate branch February 14, 2017 23:11
@Voker57 Voker57 restored the feat/frugal-enumerate branch November 1, 2017 14:06
Stebalien added a commit that referenced this pull request Mar 27, 2018
No idea why this was changed. This was introduced in:

08f342e (part of #3598)

License: MIT
Signed-off-by: Steven Allen <steven@stebalien.com>
@Voker57 Voker57 deleted the feat/frugal-enumerate branch June 4, 2019 08:41
ariescodescream pushed a commit to ariescodescream/go-ipfs that referenced this pull request Apr 7, 2022
Pass cids instead of nodes around in EnumerateChildrenAsync
ariescodescream pushed a commit to ariescodescream/go-ipfs that referenced this pull request Apr 7, 2022
No idea why this was changed. This was introduced in:

08f342e (part of ipfs#3598)

License: MIT
Signed-off-by: Steven Allen <steven@stebalien.com>