Higher radix multiplier encoding #7991
Draft: tautschnig wants to merge 13 commits into diffblue:develop from tautschnig:feature/multiplier-encoding
c5111f1 to 354bb4e
Codecov Report
Additional details and impacted files:
@@             Coverage Diff             @@
##           develop    #7991      +/-   ##
===========================================
- Coverage    79.09%   78.63%   -0.46%
===========================================
  Files         1699     1701       +2
  Lines       196512   196571      +59
===========================================
- Hits        155428   154575     -853
- Misses       41084    41996     +912
354bb4e to ffe136e
ffe136e to 2f37f7b
60fcbd3 to 5a5471a
We were still repeatedly using `bv[bv.size() - 1]` in places where `sign_bit(bv)` adds clarity and avoids having to understand encoding details. Also, use `.back()` to avoid unnecessary repeated arithmetic (which the compiler may or may not optimise away).
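A minimal sketch of the kind of helper this commit message describes. The `literal_t` and `bitvector_t` types are hypothetical stand-ins, not the project's actual types, and the sketch assumes bits are stored least-significant first so that the sign bit is the last element.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a solver literal; the real code base has its own type.
struct literal_t
{
  int id;
};

// Assumption: bit 0 is the least significant bit, so the last element is the
// sign bit of a two's-complement value.
using bitvector_t = std::vector<literal_t>;

// Names the intent at the call site and uses .back() instead of spelling out
// the index arithmetic bv[bv.size() - 1].
inline literal_t sign_bit(const bitvector_t &bv)
{
  assert(!bv.empty());
  return bv.back();
}
```

A call site then reads `sign_bit(op0)` instead of `op0[op0.size() - 1]`, so the reader does not need to know where the encoding places the sign.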
We can avoid lowering, and eventually re-use this as part of other algorithms.
1. Duplicate some code to specialise it for signed/unsigned and, thereby, add clarity.
2. De-duplicate code to handle all cases directly in lt_or_le.
3. Make sure we constant-propagate whatever is possible in unsigned comparison to avoid introducing unnecessary fresh literals.
The Radix-4 multiplier pre-computes products for x * 0 (zero), x * 1 (x), x * 2 (left-shift x), x * 3 (x * 2 + x) to reduce the number of sums by a factor of 2. Radix-8 extends this up to x * 7 (reducing sums by a factor of 3), and Radix-16 up to x * 15 (reducing sums by a factor of 4). This modified approach to computing partial products can be freely combined with different (tree) reduction schemes.

Benchmarking results can be found at https://tinyurl.com/multiplier-comparison (shortened URL for https://docs.google.com/spreadsheets/d/197uDKVXYRVAQdxB64wZCoWnoQ_TEpyEIw7K5ctJ7taM/). The data suggests that combining Radix-8 partial product pre-computation with Dadda's reduction can yield substantial performance gains while not substantially regressing in other benchmark/solver pairs.
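To illustrate the Radix-4 idea on concrete integers rather than on solver literals, the following sketch multiplies two 32-bit values by consuming the second operand two bits at a time. The function name `radix4_multiply` and the optional `sums` counter are illustrative assumptions and do not come from the patch; the actual encoding builds circuits over literals and may use a different reduction scheme.

```cpp
#include <array>
#include <cstdint>

// Multiply x by y modulo 2^32, reading y one radix-4 digit (two bits) per step.
// If sums is non-null, it receives the number of partial products added.
std::uint32_t
radix4_multiply(std::uint32_t x, std::uint32_t y, unsigned *sums = nullptr)
{
  // Pre-computed multiples: x * 0, x * 1, x * 2 (a left shift), x * 3 (x * 2 + x).
  const std::array<std::uint32_t, 4> multiples = {0u, x, x << 1, (x << 1) + x};

  std::uint32_t result = 0;
  unsigned count = 0;
  for(unsigned shift = 0; shift < 32; shift += 2)
  {
    // The current digit selects one of the four pre-computed multiples.
    const std::uint32_t digit = (y >> shift) & 3u;
    result += multiples[digit] << shift;
    ++count;
  }

  if(sums != nullptr)
    *sums = count;
  return result;
}
```

For 32-bit operands the loop adds 16 partial products, where a bit-at-a-time shift-and-add scheme would add 32. In a symbolic setting the table lookup would become a multiplexer over the digit's bits.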
Uses extra sign-bit to keep bit widths small.
Implements the algorithm of Section 4 of "Further Steps Down The Wrong Path : Improving the Bit-Blasting of Multiplication" (see https://ceur-ws.org/Vol-2908/short16.pdf).
Prints a vector of literals in human-readable form, including a decimal representation when all literals are constants.
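A rough sketch of what such a debugging helper could look like, under stated assumptions: `demo_literal` and `format_bits` are hypothetical names, a literal is modelled as either a known constant or a numbered variable, and bits are assumed to be stored least-significant first. The output format shown here is invented for illustration and need not match the patch. The sketch requires C++17 for `std::optional`.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical literal: a known constant if `constant` is set, otherwise a
// symbolic variable identified by var_no.
struct demo_literal
{
  std::optional<bool> constant;
  unsigned var_no = 0;
};

// Renders the bits most-significant first. When every bit is a constant (and
// the value fits into 64 bits), the decimal value is appended in parentheses.
std::string format_bits(const std::vector<demo_literal> &bv)
{
  std::string text;
  bool all_constant = true;
  unsigned long long value = 0;

  for(std::size_t i = bv.size(); i-- > 0;)
  {
    const demo_literal &l = bv[i];
    if(l.constant.has_value())
    {
      text += *l.constant ? '1' : '0';
      value = (value << 1) | (*l.constant ? 1u : 0u);
    }
    else
    {
      all_constant = false;
      text += "[v" + std::to_string(l.var_no) + "]";
    }
  }

  if(all_constant && !bv.empty() && bv.size() <= 64)
    text += " (" + std::to_string(value) + ")";
  return text;
}
```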
Implements a symbolic version of the algorithm proposed by Schönhage and Strassen in "Schnelle Multiplikation großer Zahlen", Computing, 7, 1971.
ed26c5a to a08b3f6