#149042 seems to cause a lot of regressions on AVX-512. Since my workloads are built with LTO and PGO, I'm sharing the samples as an lld reproducer.
149042.zip
This example is mujs running the v8-v7 benchmark.
Before #149042:
./mujs.exe .\run.js
Richards: 559
DeltaBlue: 738
Crypto: 514
RayTrace: 809
EarleyBoyer: 1008
RegExp: 569
Splay: 1405
NavierStokes: 1315
----
Score (version 7): 808
After #149042:
./mujs.exe .\run.js
Richards: 530
DeltaBlue: 698
Crypto: 365
RayTrace: 688
EarleyBoyer: 938
RegExp: 565
Splay: 1407
NavierStokes: 1295
----
Score (version 7): 740
The main difference seems to be in jsR_run, which appears to be partially vectorized after #149042 but runs slower on Tiger Lake.
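To make the size of the regression easier to read, here is a small Python sketch that computes the per-benchmark change from the two runs above. The scores are copied verbatim from the output. The helper names (`percent_change`, `geomean`) are just illustrative, and the geometric mean is included on the assumption that the overall v8 score is the geometric mean of the individual scores, which is consistent with the 808 and 740 reported here.

```python
import math

# Scores copied from the two mujs runs reported above.
before = {
    "Richards": 559, "DeltaBlue": 738, "Crypto": 514, "RayTrace": 809,
    "EarleyBoyer": 1008, "RegExp": 569, "Splay": 1405, "NavierStokes": 1315,
}
after = {
    "Richards": 530, "DeltaBlue": 698, "Crypto": 365, "RayTrace": 688,
    "EarleyBoyer": 938, "RegExp": 565, "Splay": 1407, "NavierStokes": 1295,
}


def percent_change(old, new):
    """Relative change from old to new, in percent (negative = slower)."""
    return (new - old) / old * 100.0


def geomean(values):
    """Geometric mean; assumed here to be how the overall score is derived."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


changes = {name: percent_change(before[name], after[name]) for name in before}

# Print the benchmarks sorted from worst regression to best result.
for name, delta in sorted(changes.items(), key=lambda kv: kv[1]):
    print(f"{name:<13} {before[name]:>5} -> {after[name]:>5}  {delta:+6.1f}%")

print(f"geomean before: {geomean(before.values()):.0f}")
print(f"geomean after:  {geomean(after.values()):.0f}")
```

By this measure Crypto and RayTrace take the largest hits (roughly 29% and 15%), while the remaining benchmarks move by about 7% or less, and Splay is essentially unchanged.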
