Optimize transform of vector<bool> for more predicates
#5796
+111
−4
Towards #625, specifically #625 (comment) item 3
Follow up to #5769
➡️ Optimization
@rodiers asked for set subtraction.
Of the standard predicates, `std::greater` does the trick: it is true only if the element is in the first set and not in the second. Or `std::less` for the converse. For completeness, let's add all four remaining comparisons.
❓ What it even is
Apparently, there's no universally accepted name for the corresponding integer operation, unlike `xnor` for `equal_to`.
In the benchmark, we can keep comparison names, like `less_equal`. But we also need to name functors. Here are some ideas:
- Wikipedia naming: `imply` and `nimply`. The other two, where the other input is negated, are converse implication and converse nonimplication.
- x86 instruction naming: `andn` (scalar, BMI1), `pandn` (MMX, SSE2), and more flavors of `pandn` for bigger vectors and element masks, all accessible as intrinsics and generated automatically by the compiler. We could therefore call these ops `andn` and `orn`, at the cost of slight confusion with `nand`/`nor` (which we don't use anyway), and with no obvious way to mark which of the two inputs is negated.
In the PR I went with the Wikipedia naming, but I'm open to any other option.
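For reference, here is a minimal sketch of how the four comparisons map to word-wise bit operations, and of `std::greater` acting as set subtraction on `vector<bool>`. The function names (`nimply`, `converse_nimply`, `imply`, `converse_imply`, `matches_truth_table`, `set_difference_bits`) are illustrative assumptions and not necessarily the identifiers used in the PR; the code only demonstrates the equivalence, not the actual optimized implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical names, loosely following the Wikipedia-style naming above.
// Each function is the word-wise equivalent of a comparison on single bools.
constexpr std::uint64_t nimply(std::uint64_t a, std::uint64_t b) { return a & ~b; }          // greater:       a > b
constexpr std::uint64_t converse_nimply(std::uint64_t a, std::uint64_t b) { return ~a & b; } // less:          a < b
constexpr std::uint64_t imply(std::uint64_t a, std::uint64_t b) { return ~a | b; }           // less_equal:    a <= b
constexpr std::uint64_t converse_imply(std::uint64_t a, std::uint64_t b) { return a | ~b; }  // greater_equal: a >= b

// Checks the lowest bit of each word operation against the standard
// comparison functor for all four combinations of two bools.
inline bool matches_truth_table() {
    for (int a = 0; a <= 1; ++a) {
        for (int b = 0; b <= 1; ++b) {
            const bool x = a != 0;
            const bool y = b != 0;
            const std::uint64_t wa = static_cast<std::uint64_t>(a);
            const std::uint64_t wb = static_cast<std::uint64_t>(b);
            if (((nimply(wa, wb) & 1) != 0) != std::greater<>{}(x, y)) return false;
            if (((converse_nimply(wa, wb) & 1) != 0) != std::less<>{}(x, y)) return false;
            if (((imply(wa, wb) & 1) != 0) != std::less_equal<>{}(x, y)) return false;
            if (((converse_imply(wa, wb) & 1) != 0) != std::greater_equal<>{}(x, y)) return false;
        }
    }
    return true;
}

// Set subtraction on bit sets: an element is in the result only if it is
// in `first` and not in `second`. Assumes second.size() >= first.size().
inline std::vector<bool> set_difference_bits(const std::vector<bool>& first,
                                             const std::vector<bool>& second) {
    std::vector<bool> result(first.size());
    std::transform(first.begin(), first.end(), second.begin(), result.begin(),
                   std::greater<>{});
    return result;
}
```

With `first = {1, 1, 0, 0}` and `second = {1, 0, 1, 0}`, `set_difference_bits` keeps only the second element, which is the one present in `first` but not in `second`.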
⏱️ Benchmark results
- `transform_two_inputs_aligned<less<>>/64`
- `transform_two_inputs_aligned<less<>>/4096`
- `transform_two_inputs_aligned<less<>>/65536`
🚗 Drive-by
There's a concise way to test for `void`; done that.