Replies: 1 comment
It's used for changes to the quantization formats themselves. See ggml-org/llama.cpp#1508, where the scaling factor in Q4 and Q8 changed from F32 to F16.
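To make that concrete, here is a minimal sketch of how a reader could use the field to reject files written with a block layout it cannot decode. This is not llama.cpp's actual loader: `check_quantization_version`, `SUPPORTED_QUANTIZATION_VERSION`, its value, and the plain-dict metadata are all made up for illustration. It also assumes that a file with only F32/F16 tensors does not need the key.

```python
# Hypothetical: the quantization-format revision this reader can decode.
SUPPORTED_QUANTIZATION_VERSION = 2

# Assumption: these tensor types are unquantized, so the version does not apply.
UNQUANTIZED_TYPES = {"F32", "F16"}


def check_quantization_version(metadata, tensor_types):
    """Raise ValueError if quantized tensors use a format revision we cannot decode."""
    if set(tensor_types) <= UNQUANTIZED_TYPES:
        return  # nothing is quantized, so there is nothing to check

    version = metadata.get("general.quantization_version")
    if version is None:
        raise ValueError(
            "quantized tensors present but general.quantization_version is missing"
        )
    if version != SUPPORTED_QUANTIZATION_VERSION:
        # Same type name (e.g. Q4_0), different on-disk block layout.
        raise ValueError(
            f"unsupported quantization version {version}, "
            f"expected {SUPPORTED_QUANTIZATION_VERSION}"
        )


# A Q4_0 file written with the revision we support passes silently.
check_quantization_version({"general.quantization_version": 2}, {"F32", "Q4_0"})
```

The point is that the type name alone (Q4_0, Q8_0, and so on) does not say which block layout was used, so the version has to be checked separately.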
While reading the required keys in the spec, I noticed that one of them is general.quantization_version, but I am not clear on exactly what this field is for. According to the docs, it is unrelated to the quantization scheme. In what case would this be used?