Description
Search before asking
- I have searched the Multimodal Maestro issues and found no similar bug report.
Bug
The default LoRA config used in maestro is
maestro/maestro/trainer/models/florence_2/checkpoints.py, lines 50 to 57 in cecc78f:
```python
config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    target_modules=["q_proj", "o_proj", "k_proj", "v_proj", "linear", "Conv2d", "lm_head", "fc2"],
    task_type="CAUSAL_LM",
)
```
The LoRA config used in the Florence-2 fine-tuning on custom dataset Roboflow notebook is the following:
```python
config = LoraConfig(
    r=8,
    lora_alpha=8,
    target_modules=["q_proj", "o_proj", "k_proj", "v_proj", "linear", "Conv2d", "lm_head", "fc2"],
    task_type="CAUSAL_LM",
    lora_dropout=0.05,
    bias="none",
    inference_mode=False,
    use_rslora=True,
    init_lora_weights="gaussian",
    revision=REVISION,
)
```

For the poker-cards-fmjio dataset, the default LoRA config of maestro results in a mAP50 of 0.20, while the Roboflow notebook config results in a mAP50 of 0.52. I experimentally found a config that reaches a mAP50 of 0.71. See the Minimal Reproducible Example below for details.
Environment
- multimodal-maestro = 1.0.0
- OS: Ubuntu 20.04
- Python: 3.10.15
Minimal Reproducible Example
I tried four variants of the LoRA config; the results are described below.
Configs
Maestro default
```python
config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "o_proj", "k_proj", "v_proj", "linear", "Conv2d", "lm_head", "fc2"],
    task_type="CAUSAL_LM",
    lora_dropout=0.05,
    bias="none",
)
```
Maestro default + Gaussian init
```python
config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "o_proj", "k_proj", "v_proj", "linear", "Conv2d", "lm_head", "fc2"],
    task_type="CAUSAL_LM",
    lora_dropout=0.05,
    bias="none",
    init_lora_weights="gaussian",
)
```
Roboflow notebook default
```python
config = LoraConfig(
    r=8,
    lora_alpha=8,
    target_modules=["q_proj", "o_proj", "k_proj", "v_proj", "linear", "Conv2d", "lm_head", "fc2"],
    task_type="CAUSAL_LM",
    lora_dropout=0.05,
    bias="none",
    inference_mode=False,
    use_rslora=True,
    init_lora_weights="gaussian",
)
```
Roboflow notebook default except lora_alpha=16
```python
config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "o_proj", "k_proj", "v_proj", "linear", "Conv2d", "lm_head", "fc2"],
    task_type="CAUSAL_LM",
    lora_dropout=0.05,
    bias="none",
    inference_mode=False,
    use_rslora=True,
    init_lora_weights="gaussian",
)
```
Metrics
I used the Roboflow notebook to run the pipeline for 10 epochs and computed the metrics with the new supervision evaluation API as follows:
```python
mean_average_precision = sv.metrics.MeanAveragePrecision().update(predictions, targets).compute()
map50_95 = mean_average_precision.map50_95
map50 = mean_average_precision.map50

p = sv.metrics.Precision().update(predictions, targets).compute()
precision_at_50 = p.precision_at_50

r = sv.metrics.Recall().update(predictions, targets).compute()
recall_at_50 = r.recall_at_50
```
Results
| Config | mAP50 | mAP50-95 | Precision50 | Recall50 |
|---|---|---|---|---|
| Maestro default | 0.20 | 0.18 | 0.21 | 0.14 |
| Maestro default + Gaussian init | 0.32 | 0.30 | 0.54 | 0.35 |
| Roboflow notebook default | 0.52 | 0.47 | 0.66 | 0.58 |
| Roboflow notebook default except lora_alpha=16 | 0.71 | 0.65 | 0.78 | 0.75 |
Conclusion
Using lora_alpha=16 in the Roboflow notebook default LoRA config results in much better performance with the same number of epochs.
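A plausible explanation for the gap (my assumption, not verified against the training runs): with `use_rslora=True`, PEFT scales the LoRA update by `lora_alpha / sqrt(r)` instead of the classic `lora_alpha / r`, so the same `lora_alpha` produces a larger effective update under rsLoRA. The effective scaling factors for the configs above:

```python
import math

r = 8
for alpha in (8, 16):
    standard = alpha / r            # classic LoRA scaling (alpha / r)
    rslora = alpha / math.sqrt(r)   # rank-stabilized LoRA scaling (alpha / sqrt(r))
    print(f"alpha={alpha}: standard={standard:.2f}, rslora={rslora:.2f}")
# alpha=8:  standard=1.00, rslora=2.83
# alpha=16: standard=2.00, rslora=5.66
```

So the best-performing variant (rsLoRA with alpha=16) applies an effective scaling of ~5.66 versus 2.00 for the maestro default, which may account for the faster convergence within 10 epochs.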
Questions
- Should maestro give users control over the LoRA config? Users could then experiment with the values and find the config that works best for their dataset. The config could be defined in a `toml`, `json`, or other file format, with the user passing the file path to the maestro CLI.
Additional
No response
Are you willing to submit a PR?
- Yes I'd like to help by submitting a PR!