Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix global clear of unique values for Faker::UniqueGenerator #1355

Merged
merged 3 commits into from
Sep 18, 2018
Merged

Fix global clear of unique values for Faker::UniqueGenerator #1355

merged 3 commits into from
Sep 18, 2018

Conversation

kolasss
Copy link
Contributor

@kolasss kolasss commented Sep 4, 2018

Global clear of unique values did not work after first global clear, test included

@vbrazo
Copy link
Member

vbrazo commented Sep 4, 2018

@MarcPer Could you take a quick look at this please?

@MarcPer
Copy link
Contributor

MarcPer commented Sep 6, 2018

Sure, I'll take a look.

@MarcPer
Copy link
Contributor

MarcPer commented Sep 6, 2018

This was a good test and it caught something I missed before. Whenever Faker::UniqueGenerator.clear is called, it clears @previous_results for all @unique instances in the marked_unique set. However, it also removes these instances from the set so, next time Faker::UniqueGenerator.clear is called, these instances will not be cleared anymore.

@kolasss 's change fixes this, but I would suggest an alternative, since removing the marked_unique.clear means the marked_unique set is never cleared and can grow indefinitely. The alternative is adding this to the Faker::UniqueGenerator#method_missing:

self.class.marked_unique.add(self)

With this, if any dangling instance of UniqueGenerator was previously removed from the set, it will add itself if it is ever used again. Since marked_unique is a set, it doesn't matter that add is called multiple times. And if this line is added, the same line can be removed from the initializer.

@MarcPer
Copy link
Contributor

MarcPer commented Sep 6, 2018

Just to be sure if my alternative suggestion wouldn't negatively impact the benchmarks, I redid what was described here with the proposed change, and the method using the marked_unique set is still consistently faster than calling ObjectSpace.each_object(self, &:clear).

The only new test I did was setting the second parameter in the benchmark function to higher values (20 and 40):

Benchmark.bm do |x|
  x.report { create_clear(1000, 40) }
  x.report { create_old_clear(1000, 40) }
end

#       user     system      total        real
#   5.558510   0.000000   5.558510 (  5.558902)
#   9.882145   0.000000   9.882145 (  9.882653)

This means the method_missing would be called 40 times between every UniqueGenerator.clear. Although this puts the old and new methods a bit closer, the new one still performs better.

@kolasss
Copy link
Contributor Author

kolasss commented Sep 7, 2018

Why not extract marked_unique.clear to a new method?
Describe it in the docs and we a good.
I think that memory consumption is not a big problem since Faker used in tests and rake tasks usually

@MarcPer
Copy link
Contributor

MarcPer commented Sep 7, 2018

I think for the typical user, having to worry about the inner workings of the library and calling two methods instead of one shouldn't be necessary. As a user, I would like to clear the values so I can regenerate them, without being concerned of a list somewhere keeping the references.

About the memory consumption, I would agree, except that the change that added the marked_unique was done to address an issue a user had with generating and clearing a big amount of data. So, even though it is not critical, it is still nice to limit this growth a bit.

And what about my suggestion? Do you see any downside for using it?

@kolasss
Copy link
Contributor Author

kolasss commented Sep 10, 2018

@MarcPer I think it's good. Should I create a new PR?

@MarcPer
Copy link
Contributor

MarcPer commented Sep 10, 2018

@kolasss, I think you already did a good job with your unit tests - I don't see a need for creating a new PR. I would only do a new commit changing lib/helpers/unique_generator.rb again to:

  • re-include the marked_unique.clear in self.clear
  • move the line self.class.marked_unique.add(self) from initialize to the beginning of method_missing

@vbrazo
Copy link
Member

vbrazo commented Sep 17, 2018

Thanks for the discussion. I'll take a look later on and get it merged 👍

@vbrazo vbrazo merged commit 6056486 into faker-ruby:master Sep 18, 2018
davidmorton0 pushed a commit to davidmorton0/faker that referenced this pull request Jul 12, 2021
…uby#1355)

* Fix global clear of unique values for Faker::UniqueGenerator

* Update CHANGELOG.md
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants