
OOM for causal lm evaluation and missing logging #3203

@maximrepidgey

Description


Please check that this issue hasn't been reported before.

  • I searched previous Bug Reports and didn't find any similar reports.

Expected Behavior

Run details: 4x A100 80GB GPUs are used. I am fine-tuning a 30B model with DeepSpeed ZeRO-3 in bf16. When I tested with FSDP1, the run immediately went OOM. The dataset uses Alpaca-style prompts.
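For reference, the DeepSpeed setup is referenced from the axolotl config roughly like this (a sketch; zero3_bf16.json is the stock config shipped with axolotl, shown here as an assumption about my exact launch, since the config below omits the line):

deepspeed: deepspeed_configs/zero3_bf16.json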

Current behaviour

First, I am trying to run causal LM evaluation during training. The loss evaluation takes between 18GB and 30GB, but the causal LM evaluation that follows consumes all 80GB, and with a bigger eval batch size it raises an OOM error. Why is this happening?
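The only eval-side knobs I see for limiting that memory are roughly the ones below (a sketch with illustrative values, not a fix; the posted run below uses eval_batch_size: 4):

do_causal_lm_eval: true
eval_sample_packing: false
eval_batch_size: 1   # illustrative value; lowering it is the only obvious lever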

Second, some metrics are not reported when the evaluation is run with comet, ter, and chrf. For those the log only contains:
{'memory/max_mem_active(gib)': 77.69, 'memory/max_mem_allocated(gib)': 77.69, 'memory/device_mem_reserved(gib)': 78.43, 'epoch': 0}, while sacrebleu and perplexity report properly, e.g. {'eval_sacrebleu': 5.231321, 'memory/max_mem_active(gib)': 77.69, 'memory/max_mem_allocated(gib)': 77.69, 'memory/device_mem_reserved(gib)': 78.43, 'epoch': 0}.
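For that run the full metric list was enabled, roughly like this (it matches the commented-out line in the config below):

eval_causal_lm_metrics: ['sacrebleu', 'comet', 'ter', 'chrf', 'perplexity']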

Steps to reproduce

axolotl train script.yaml

Config yaml

base_model: TildeAI/TildeOpen-30b
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer

load_in_8bit: false
load_in_4bit: false
strict: false
bf16: auto
tf32: false
float32: false

# Adapter parameters
adapter: lora
lora_r: 32
lora_alpha: 32
lora_mlp_kernel: true
lora_qkv_kernel: true
lora_o_kernel: true
lora_dropout: 0.05 #0.1 if num_parameters < 2e10 else 0.05 (from qlora paper)
lora_target_modules:
  - q_proj
  - k_proj
  - v_proj
  - o_proj
  - gate_proj
  - up_proj
  - down_proj

# Dataset settings
datasets:
  - path: "./train_10K.jsonl"
    ds_type: json
    split: train
    type: alpaca


test_datasets:
  - path: "./val_2K.jsonl"
    ds_type: json
    split: train
    type: alpaca

shuffle_merged_datasets: true
sequence_len: 2048
sample_packing: false
pad_to_sequence_len: true
eval_sample_packing: false

# Saving and logging settings
logging_steps: 2
saves_per_epoch: 3
evals_per_epoch: 8
gc_steps: 40

# Training parameters
gradient_accumulation_steps: 2
micro_batch_size: 2
eval_batch_size: 4
num_epochs: 1
optimizer: adamw_torch_fused
lr_scheduler: cosine
learning_rate: 0.00002 # 2e-5
warmup_ratio: 0.05
weight_decay: 0.01
# Gradient clipping max norm
max_grad_norm: 1

do_causal_lm_eval: true
eval_causal_lm_metrics:
  - chrf
#eval_table_size: 1
#eval_causal_lm_metrics: ['sacrebleu', 'comet', 'ter', 'chrf', 'perplexity']

# Other parameters
gradient_checkpointing: true
flash_attention: true

train_on_inputs: false
group_by_length: false

Possible solution

No response

Which Operating Systems are you using?

  • Linux
  • macOS
  • Windows

Python Version

3.11

axolotl branch-commit

main

Acknowledgements

  • My issue title is concise, descriptive, and in title casing.
  • I have searched the existing issues to make sure this bug has not been reported yet.
  • I am using the latest version of axolotl.
  • I have provided enough information for the maintainers to reproduce and diagnose the issue.
