@nevans commented Nov 25, 2025

Calling `SequenceSet#normalize` on a frozen set can be more than 4x faster when it simply re-parses `@string` and scans its elements, rather than generating a complete new string and comparing it with `@string`.

                           normal
   reparse and check:     20449.2 i/s
generate and compare:     20267.2 i/s - 1.01x  slower
             v0.5.12:      3090.2 i/s - 6.62x  slower

                frozen and normal
generate and compare:  19328485.2 i/s
   reparse and check:  17455122.3 i/s - 1.11x  slower
             v0.5.12:      3730.0 i/s - 5181.95x  slower

                         unsorted
   reparse and check:     16936.2 i/s
generate and compare:     16872.9 i/s - 1.00x  slower
             v0.5.12:      2583.6 i/s - 6.56x  slower

                         abnormal
generate and compare:     17610.8 i/s
   reparse and check:     16596.1 i/s - 1.06x  slower
             v0.5.12:      2560.3 i/s - 6.88x  slower

                  frozen unsorted
   reparse and check:     10089.5 i/s
             v0.5.12:      2333.7 i/s - 4.32x  slower
generate and compare:      2093.1 i/s - 4.82x  slower

                  frozen abnormal
   reparse and check:     10392.1 i/s
             v0.5.12:      2354.5 i/s - 4.41x  slower
generate and compare:      2124.3 i/s - 4.89x  slower

Please note that these results vary significantly with benchmark settings (e.g., the size of the sequence set) and randomized factors (e.g., how early in the string the first out-of-order or abnormal entry appears).

Also, I manually adjusted the benchmark to compare the prior unreleased commits in this branch against this PR, because #554 also provides a significant performance boost. So "generate and compare" includes #554, and "reparse and check" represents this PR.
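
For readers unfamiliar with the two labels, here is a minimal standalone sketch of the difference between the strategies. It is not the net-imap implementation: the method names are hypothetical, `*` is not handled, and inputs are assumed to be well-formed decimal numbers without leading zeros.

```ruby
# Simplified model of a sequence set string such as "1:5,10,12:15".
# Hypothetical helpers for illustration only; "*" and invalid input are omitted.

# Parse "a:b" or "a" entries into [first, last] pairs, preserving input order.
def parse_entries(str)
  str.split(",").map do |entry|
    first, last = entry.split(":", 2).map { |n| Integer(n, 10) }
    [first, last || first]
  end
end

# "Generate and compare": build the complete normalized string, then compare.
def normalized_by_generating?(str)
  merged = []
  parse_entries(str).map(&:minmax).sort.each do |min, max|
    if merged.last && min <= merged.last[1] + 1
      # Overlapping or adjacent: extend the previous range.
      merged.last[1] = [merged.last[1], max].max
    else
      merged << [min, max]
    end
  end
  generated = merged.map { |min, max| min == max ? min.to_s : "#{min}:#{max}" }
  generated.join(",") == str
end

# "Reparse and check": scan the entries once and stop at the first problem.
def normalized_by_scanning?(str)
  prev_max = nil
  str.split(",").each do |entry|
    first, last = entry.split(":", 2)
    min = Integer(first, 10)
    max = last ? Integer(last, 10) : min
    return false if last && min >= max               # "5:5" or "9:3" is not canonical
    return false if prev_max && min <= prev_max + 1  # unsorted, overlapping, or adjacent
    prev_max = max
  end
  true
end
```

In this model the scanning version can return as soon as it meets the first out-of-order or abnormal entry, without sorting or building a string, which is consistent with the note above that results depend on how early that entry appears.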

Base automatically changed from sequence_set/drop-normalized-string to master November 25, 2025 18:31
@nevans force-pushed the sequence_set/faster-frozen-normalize branch from 4d0f345 to 0560cce November 25, 2025 19:43
@nevans added the "performance" label (related to CPU use, memory use, latency, etc.) Nov 25, 2025
@nevans merged commit 7caeadf into master Nov 25, 2025
32 checks passed
@nevans deleted the sequence_set/faster-frozen-normalize branch November 25, 2025 19:56