-
-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performance improvements for factor #1525
Conversation
It is clearer to see what is going on, as opposed to passing around an unmarked `Vec<u64>`, and there is a single place to add invariants checks. This is also a more compact memory representation: each prime factor is represented only once, with an additional byte for multiplicity. The performance impact is however not significant.
Also decoupled the factorisation methods; now factor::factor contains the logic that chains the different algorithms and aggregates results. As a side-effect, rho::factor now performs extraneous allocations (as each recursive step creates a new `Factors` value, which is then aggregated into the previous one) but there is no significant performance impact.
No significant performance impact (most of the time is spent elsewhere), but an easy and satisfying fix nevertheless.
When the remainder is smaller than the max. entry in the table, it is guaranteed to be prime.
50% performance improvement on factoring all numbers between 2 and 10⁶.
Another 6.97% runtime improvement
Replace iterated division with u64::trailing_zeros, hoist the selection of `mul` out of the loop, another cool 49.5% runtime improvement.
Another 36% improvement.
I am going to have a deeper look at the implementation of Pollard's rho, but I suspect further improvements will require switching to a wholly-different factoring algorithm. Factoring Small to Medium Size Integers: An Experimental Comparison (2010) suggests that Shanks's SQUFOF (square forms factorization) is the fastest algorithm for integers up to ~60 bits. A possible alternative I might investigate is Hart's “one-line factoring algorithm”, which is said to be competitive with SQUFOF and simpler to implement. PS: This would presumably be the subject of a second PR. |
Thanks for the PR. There are some rust format and clippy issues, could you please fix them? |
Done; I was trying first to recall how to deal with not having NLLs (as the build is currently failing on obsolete versions of Rust). |
I didn't see the format issues, though (or at least, running |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't see the format issues, though (or at least, running
cargo fmt
didn't introduce any change)
It looks like it's just cargo clippy
issues, specifically large numbers of single character variable names, unneeded returns, long literals lacking separators, ....
If you look at the "Checks" tabs and pull up the CICD section, it lists all the warnings.
It is acceptable to use an #[allow(...)]
for the naming issue if you have reasons.
Thanks for your work!
Yes, I already dealt with it (by using 1 fewer variable >_>')
You are welcome <3 |
Instead of computing a^r and a^(n-1) = a^(r 2ⁱ) separately, compute the latter by repeatedly squaring the former. 33.6% performance improvement
Co-authored-by: Roy Ivy III <rivy.dev@gmail.com>
This is “merely” 37× slower than GNU factor, now, and I still haven't implemented any fancier number theory (aside from extracting dividers from the M-H primality test) |
@rivy @sylvestre I found another 2× speedup with a loop exchange (and essentially inlining a batch version of |
@nbraud , thanks! Sounds like your having fun! Mush on at your leisure. |
@rivy Thanks a bunch for reviewing and merging so quickly :O |
And yes, golfing this to be ~93% faster was fun :3 |
First pass at fixing #1456: after those changes,
factor
is 4× faster, and“only” 48× slower than the GNU implementation (when factoring all numbers from 2
to 10⁶).
Vec
of factors with a datatype that only stores each prime onceis_prime
(~50% of the perf. gain)u64::trailing_zeros