[EISW-191805] Enable quantization for FP4 type #175

Jacenty-And-Intel · 2025-11-14T12:50:07Z

Summary

After the vpux type is added for float4_e2m1, quantization should be enabled for this type, as it is for FP8. This MLIR change will allow for further adjustments to enable FP4 quantization in the compiler and enable the addition of PSS tests.
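For readers unfamiliar with the format, here is a minimal standalone sketch of what a 4-bit E2M1 value can represent. It assumes the OCP-style layout (1 sign bit, 2 exponent bits, 1 mantissa bit, exponent bias 1, no infinities or NaNs), which is the layout upstream MLIR calls `f4E2M1FN`. The function names `decodeF4E2M1` and `maxF4E2M1` are hypothetical and are not part of MLIR or of this PR.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Decode a 4-bit E2M1 pattern stored in the low nibble of `bits`.
// Assumed layout: [sign:1][exponent:2][mantissa:1], exponent bias 1,
// no Inf/NaN encodings. Illustrative helper, not an MLIR API.
inline float decodeF4E2M1(std::uint8_t bits) {
    assert(bits < 16 && "only the low 4 bits are meaningful");
    const int sign = (bits >> 3) & 0x1;
    const int exponent = (bits >> 1) & 0x3;
    const int mantissa = bits & 0x1;

    float magnitude;
    if (exponent == 0) {
        // Subnormal: 0.m * 2^(1 - bias), i.e. 0 or 0.5.
        magnitude = 0.5f * static_cast<float>(mantissa);
    } else {
        // Normal: 1.m * 2^(exponent - bias).
        magnitude = std::ldexp(1.0f + 0.5f * static_cast<float>(mantissa),
                               exponent - 1);
    }
    return sign ? -magnitude : magnitude;
}

// Largest finite magnitude of the format (bit pattern 0b0111).
inline float maxF4E2M1() { return decodeF4E2M1(0x7); }
```

Under these assumptions the non-negative values are 0, 0.5, 1, 1.5, 2, 3, 4 and 6, so a quantized type that uses this format as storage cannot represent anything outside [-6, 6].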

JIRA ticket

EISW-191805

Related PR in NPU Compiler and/or OpenVINO repository with sub-module update

PR-22333

Other related tickets

List tickets for additional work, eg, something was found during review but you agreed to address it in another Jira

E-xxxxx

andrey-golubev · 2025-11-14T13:20:02Z

Can this be first brought to upstream MLIR? Also, does it by any chance conflict with what @Roman-Pevnyi is trying to do in upstream? (see llvm/llvm-project#152966)

Jacenty-And-Intel · 2025-11-14T13:54:17Z

> Can this be first brought to upstream MLIR?

@andrey-golubev Wanted to merge it in the fork first. Right now enabling PSS tests for FP4 and further work on quantization to f4 are blocked without the change in MLIR.

> Also, does it by any chance conflict with what @Roman-Pevnyi is trying to do in upstream? (see llvm/llvm-project#152966)

It does not conflict; in fact, it works very well with Roman's change. Once that change is approved, enabling FP4 will be much simpler, and enabling further types will be even easier. However, we don't know when it will be merged upstream, and it is currently a blocker on our side.

sartil · 2025-11-14T14:39:03Z

Do you have a testing PR in vpux-plugin that verifies CI when updating this submodule, to be linked here?
Also, could you please add the status of running check-mlir tests with these changes (see https://mlir.llvm.org/getting_started/TestingGuide/ on how to run the target)?

andrey-golubev · 2025-11-14T14:53:28Z

> @andrey-golubev Wanted to merge it in the fork first. Right now enabling PSS tests for FP4 and further work on quantization to f4 are blocked without the change in MLIR.

I would strongly prefer to do it the other way around since merging to upstream is usually a painful process. Quantization changes are also evidently painful: the dialect evolved quite a bit since LLVM 18 up to LLVM 21 (we're now at LLVM 20).

> Also, could you please add the status of running check-mlir tests with these changes (see https://mlir.llvm.org/getting_started/TestingGuide/ on how to run the target)?

I think this is now nice to do, but we also have automated tests running in CI! (See https://github.com/intel/npu-plugin-llvm/actions/runs/19365101483/job/55416928201?pr=175)

andrey-golubev

Looks good to me. And it also looks fairly non-controversial. Please try to bring it up to the upstream community first!

Generally, it seems like most of the checks would benefit from isa<FloatType> kind of check. We just need 1 place somewhere (e.g. in verifier?) where we explicitly check that types are f4 / f8.
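The suggestion above can be sketched outside of MLIR as follows. This is only a standalone model of the idea, not the code in this PR: `StorageKind` and all three functions are hypothetical stand-ins, and the specific f8 variants listed are assumptions chosen for illustration. Call sites use a broad "is this a float?" test (the role `isa<FloatType>` would play), while exactly one function knows which float formats are accepted as quantized storage.

```cpp
#include <cassert>

// Hypothetical stand-in for a storage type; not an MLIR class.
enum class StorageKind { SignedInt, UnsignedInt, F4E2M1, F8E4M3, F8E5M2, F16, F32 };

// Broad check used by most call sites (models an isa<FloatType>-style test).
constexpr bool isFloatStorage(StorageKind kind) {
    return kind != StorageKind::SignedInt && kind != StorageKind::UnsignedInt;
}

// The single place that lists the float formats allowed as quantized storage.
// The exact set here is an assumption for illustration.
constexpr bool isSupportedQuantFloatStorage(StorageKind kind) {
    return kind == StorageKind::F4E2M1 || kind == StorageKind::F8E4M3 ||
           kind == StorageKind::F8E5M2;
}

// Verifier-style entry point: integer storage is accepted as-is in this
// model; float storage must be one of the explicitly supported formats.
constexpr bool verifyStorage(StorageKind kind) {
    if (!isFloatStorage(kind))
        return true;
    return isSupportedQuantFloatStorage(kind);
}
```

With this split, supporting another narrow float format would mean touching only `isSupportedQuantFloatStorage`, while the generic float checks elsewhere stay unchanged.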

P.S.: Not approving yet - perhaps @sartil or @ZoranZomborat may want to take a look

mlir/lib/Dialect/Quant/IR/QuantTypes.cpp

mlir/lib/Dialect/Quant/IR/TypeParser.cpp

andrey-golubev

Discussed offline, I think we can merge it "as is".

Jacenty-And-Intel · 2025-11-24T11:01:51Z

Rebased. CI was run with the submodule update in PR-22333.

@ZoranZomborat @sartil Could you please review?

sartil

LGTM!

ZoranZomborat

Great extension of the current logic!

ZoranZomborat · 2025-11-25T15:50:36Z

@andrey-golubev @nikita-kud can you help with the merge here?

Jacenty-And-Intel requested a review from a team as a code owner November 14, 2025 12:50

Jacenty-And-Intel changed the title ~~Enable quantization for FP4 type~~ [EISW-191805] Enable quantization for FP4 type Nov 14, 2025

andrey-golubev requested review from andrey-golubev and nikita-kud November 19, 2025 08:53

andrey-golubev reviewed Nov 19, 2025

andrey-golubev approved these changes Nov 24, 2025

Enable quantization for FP4 type

708bba0

Jacenty-And-Intel force-pushed the f4e2m1_mlir branch from 218bcff to 708bba0 Compare November 24, 2025 10:54

sartil approved these changes Nov 24, 2025

Natan-GabrielTiutiuIntel approved these changes Nov 24, 2025

ZoranZomborat approved these changes Nov 25, 2025
