Optimize integer arithmetic #7553
LGTM
Fuse a multiplication operator followed by an addition operator. That will generally reduce the number of instructions compared to having separate operators.
We used to replace division by a power of two with a right shift only when the dividend was known to be a positive integer. Extend the implementation to use a right shift even when the range of the dividend is unknown.
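The reason the sign of the dividend matters can be shown with a small sketch. Erlang's `div` truncates toward zero, while an arithmetic right shift rounds toward negative infinity, so a negative dividend needs a correction before shifting. The Python below is only an illustration of that arithmetic; the function names are hypothetical, and the code the JIT actually emits is not shown in this PR text.

```python
def div_pow2(dividend: int, shift: int) -> int:
    """Truncating division of dividend by 2**shift using a shift and an add."""
    # An arithmetic right shift floors, but Erlang's `div` truncates toward
    # zero. Adding (2**shift - 1) to a negative dividend before shifting
    # moves the result up to the truncated quotient.
    bias = (1 << shift) - 1 if dividend < 0 else 0
    return (dividend + bias) >> shift


def trunc_div(a: int, b: int) -> int:
    """Reference: division truncating toward zero (Python's // floors)."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q
```

For example, `div_pow2(-7, 1)` gives `-3`, matching `-7 div 2` in Erlang, whereas a plain `-7 >> 1` would give `-4`.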
Inline the code for right-shifting a small operand by any number of steps. We used to call a helper routine when the shift count exceeded the number of bits in a small.
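One way a large shift count can be handled without a helper is to clamp it: once every value bit has been shifted out, only the sign remains, so the result is `0` or `-1`. The sketch below assumes a 64-bit word and a clamping strategy purely for illustration; `WORD_BITS` and `shift_right_small` are hypothetical names, and the PR text does not say that the inlined code works this way.

```python
WORD_BITS = 64  # assumed machine word size; Erlang smalls are narrower than this


def shift_right_small(value: int, count: int) -> int:
    """Arithmetic right shift of a word-sized signed value by any count >= 0."""
    # Hardware shift instructions typically accept only counts below the
    # word size, so larger counts are clamped. Shifting a word-sized value
    # by WORD_BITS - 1 already leaves just the sign: 0 or -1.
    clamped = min(count, WORD_BITS - 1)
    return value >> clamped
```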
The routine for squaring a big integer did not have all optimizations that the multiplication routine had.
This commit implements the Karatsuba algorithm in a way that reduces the number of additions, resulting in a measurable performance improvement for multiplication of large integers.
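For readers unfamiliar with Karatsuba, here is a minimal textbook sketch in Python. It replaces four half-size multiplications with three, at the cost of a few extra additions and subtractions. This is the standard formulation, not the addition-reducing variant implemented in this commit, and the `cutoff_bits` threshold is an arbitrary assumption.

```python
def karatsuba(x: int, y: int, cutoff_bits: int = 64) -> int:
    """Multiply two non-negative integers with textbook Karatsuba."""
    # Small operands fall back to the built-in multiplication.
    if x.bit_length() <= cutoff_bits or y.bit_length() <= cutoff_bits:
        return x * y

    # Split both operands at the same bit position.
    half = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << half) - 1
    x_hi, x_lo = x >> half, x & mask
    y_hi, y_lo = y >> half, y & mask

    # Three recursive multiplications instead of four.
    low = karatsuba(x_lo, y_lo, cutoff_bits)
    high = karatsuba(x_hi, y_hi, cutoff_bits)
    mid = karatsuba(x_lo + x_hi, y_lo + y_hi, cutoff_bits) - low - high

    # Recombine: x * y = high * 2^(2*half) + mid * 2^half + low.
    return (high << (2 * half)) + (mid << half) + low
```

The middle term relies on the identity `(x_lo + x_hi) * (y_lo + y_hi) - low - high == x_lo * y_hi + x_hi * y_lo`, which is where the saved multiplication comes from.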
This pull request optimizes some of the arithmetic operations for integers.
The most noticeable improvement is in multiplication of large integers. Consider this benchmark:
Its running time is dominated by multiplication of large integers. This PR reduces the running time from about 0.45 seconds down to about 0.32 seconds on my computer (an M1 MacBook Pro).