[HW] Add HWVectorization pass #9222
Open
+1,003
−1
This patch introduces the HWVectorization pass, which identifies bitwise patterns in hardware modules that can be represented as vectorized operations instead of per-bit logic.
The pass aims to simplify the IR by grouping related scalar bit operations (such as comb.extract and comb.concat) into higher-level vector constructs like comb.reverse, comb.replicate, or direct multi-bit comb.and, comb.or, and comb.xor.
The pass scans each hw.module and identifies groups of bit-level operations that can be merged into vector-level constructs. This version supports several key patterns based on bit-level dataflow analysis and structural analysis.
This patch was co-authored by @RosaUlisses.
Supported transformations include:
1. Linear concatenations (identity):
Pattern: Bits are extracted in ascending order (identity permutation) and concatenated.
Transformation: The entire comb.concat chain is replaced with the original input vector.
2. Bit reversal:
Pattern: Bits are extracted in descending (reverse) order and concatenated.
Transformation: The chain is replaced with a single comb.reverse.
3. Structural patterns (e.g., vectorized mux):
Pattern: Isomorphic, bit-parallel logic cones are detected. For example, a scalarized mux structure that uses a replicated i1 control signal for each bit.
Transformation: The replicated scalar operations are collapsed into equivalent vector-level operations (e.g., comb.replicate, comb.and, comb.xor, comb.or).
4. Partial vectorization (chunking):
Pattern: The pass identifies contiguous sub-ranges (chunks) that can be vectorized independently, even if the entire bus cannot be.
Transformation: The pass vectorizes each chunk it can identify (e.g., a linear chunk), leaves the remaining scalar or structural logic as a separate chunk, and concatenates the chunks back together.
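To make the four patterns above concrete, here is a small Python model of the analysis. It is a sketch for illustration only, not the pass's C++ implementation. It assumes each result bit is described by the index of the source bit that drives it, listed LSB first, and all function names (`classify`, `split_chunks`, `scalar_mux`, `vector_mux`) are hypothetical. The chunking here is a simple greedy split into runs that step by exactly one; the actual pass may choose chunks differently.

```python
from typing import List, Tuple


def classify(bits: List[int]) -> str:
    """Classify a whole bus. bits[i] is the source bit driving result bit i."""
    n = len(bits)
    if bits == list(range(n)):
        return "identity"  # pattern 1: replace the chain with the input vector
    if bits == list(range(n - 1, -1, -1)):
        return "reverse"   # pattern 2: replace the chain with one reverse op
    return "other"


def split_chunks(bits: List[int]) -> List[Tuple[str, List[int]]]:
    """Pattern 4: greedily split into maximal runs that step by +1 or -1."""
    chunks: List[Tuple[str, List[int]]] = []
    i = 0
    while i < len(bits):
        j = i + 1
        step = 0
        if j < len(bits) and bits[j] - bits[i] in (1, -1):
            step = bits[j] - bits[i]
            while j < len(bits) and bits[j] - bits[j - 1] == step:
                j += 1
        kind = {1: "linear", -1: "reverse", 0: "scalar"}[step]
        chunks.append((kind, bits[i:j]))
        i = j
    return chunks


def scalar_mux(sel: int, a: int, b: int, width: int) -> int:
    """Pattern 3 input: one 1-bit mux per lane, all sharing the same select."""
    out = 0
    for i in range(width):
        a_i = (a >> i) & 1
        b_i = (b >> i) & 1
        out |= ((sel & a_i) | ((sel ^ 1) & b_i)) << i
    return out


def vector_mux(sel: int, a: int, b: int, width: int) -> int:
    """Pattern 3 output: replicate the select, then use full-width and/xor/or."""
    mask = (1 << width) - 1
    rep = mask if sel else 0  # models replicating the i1 select to `width` bits
    return (rep & a) | ((rep ^ mask) & b)
```

For instance, `split_chunks([0, 1, 2, 3, 7, 6, 5, 4])` yields a linear chunk for the low four bits and a reverse chunk for the high four, and `scalar_mux` and `vector_mux` agree for every 4-bit input, which is the equivalence the structural pattern relies on.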
Patterns not transformed
The pass leaves modules unchanged when they contain cross-bit dependencies or non-linear control flow.
For example: