Open
Labels
bug: Something isn't working
Description
Please check that this issue hasn't been reported before.
- I searched previous Bug Reports and didn't find any similar reports.
Expected Behavior
DPO Training Fails with Tool Role Messages - KeyError: 'tool'
Bug Description
DPO training fails when the dataset contains role: "tool" messages, even though Llama 3.1 natively supports tool roles.
Error
KeyError: 'tool'
File "/axolotl/prompt_strategies/dpo/chat_template.py", line 57
"role": role_map[m[message_property_mappings["role"]]],
Reproduction
Data format:
{
  "messages": [
    {"role": "user", "content": "Search for AI info"},
    {"role": "assistant", "tool_calls": [{"id": "call_123", "function": {"name": "search"}}]},
    {"role": "tool", "name": "search", "tool_call_id": "call_123", "content": "Results..."}
  ],
  "chosen": {"role": "assistant", "content": "Based on results..."},
  "rejected": {"role": "assistant", "content": "I don't know."}
}
Config:
base_model: meta-llama/Llama-3.1-8B-Instruct
rl: dpo
chat_template: llama3
datasets:
  - path: data.jsonl
    type: chat_template
    field_messages: "messages"
    field_chosen: "chosen"
    field_rejected: "rejected"
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
      tool: ["tool"]
    roles_to_train: ["assistant"]
Command: axolotl preprocess config.yaml
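For anyone trying to reproduce this, a small script along these lines should generate the data.jsonl file referenced in the config. It is only a convenience sketch: the record is the one shown under "Data format", and the helper name write_jsonl is made up for this example.

```python
import json

# The single example record from the "Data format" section above.
record = {
    "messages": [
        {"role": "user", "content": "Search for AI info"},
        {
            "role": "assistant",
            "tool_calls": [{"id": "call_123", "function": {"name": "search"}}],
        },
        {
            "role": "tool",
            "name": "search",
            "tool_call_id": "call_123",
            "content": "Results...",
        },
    ],
    "chosen": {"role": "assistant", "content": "Based on results..."},
    "rejected": {"role": "assistant", "content": "I don't know."},
}


def write_jsonl(path, records):
    """Write one JSON object per line (hypothetical helper, not part of axolotl)."""
    with open(path, "w", encoding="utf-8") as fh:
        for rec in records:
            fh.write(json.dumps(rec) + "\n")


write_jsonl("data.jsonl", [record])
```

Running the script in the same directory as config.yaml, then the preprocess command, should be enough to hit the error.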
Expected Behavior
This should work because:
- Llama 3.1 supports tool roles natively
- SFT training works fine with the same data/config
- The llama3 chat template supports tools
Root Cause
The DPO chat template processor appears to be missing the tool role in role_map.
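To illustrate the suspected failure mode, here is a standalone sketch of a lookup shaped like the line in the traceback. Only role_map and message_property_mappings come from the traceback; build_role_map, map_roles, and the two role dictionaries are assumptions made for this example and are not taken from the axolotl source. Whether the DPO processor actually honors the roles setting from the config is not confirmed here.

```python
def build_role_map(roles):
    """Invert {target_role: [source_roles]} into {source_role: target_role}."""
    return {
        source: target
        for target, sources in roles.items()
        for source in sources
    }


message_property_mappings = {"role": "role", "content": "content"}

messages = [
    {"role": "user", "content": "Search for AI info"},
    {
        "role": "tool",
        "name": "search",
        "tool_call_id": "call_123",
        "content": "Results...",
    },
]


def map_roles(messages, role_map):
    # Same shape as the failing expression in the traceback.
    return [role_map[m[message_property_mappings["role"]]] for m in messages]


# Assumed role set without a tool entry (hypothetical).
roles_without_tool = {
    "user": ["user"],
    "assistant": ["assistant"],
    "system": ["system"],
}
# The role set from the config in this issue, including tool.
roles_with_tool = {**roles_without_tool, "tool": ["tool"]}

try:
    map_roles(messages, build_role_map(roles_without_tool))
except KeyError as err:
    print(f"KeyError: {err}")  # prints: KeyError: 'tool'

print(map_roles(messages, build_role_map(roles_with_tool)))  # ['user', 'tool']
```

If the real role_map is built this way, the error would suggest that the tool entry from the config never reaches it, or that a built-in role set without tool is used instead.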
Request
Add native tool role support to DPO training to match SFT capabilities.
Current behaviour
See the Bug Description and Error sections above.
Steps to reproduce
See the Reproduction section above.
Config yaml
Possible solution
No response
Which Operating Systems are you using?
- Linux
- macOS
- Windows
Python Version
3.10
axolotl branch-commit
main
Acknowledgements
- My issue title is concise, descriptive, and in title casing.
- I have searched the existing issues to make sure this bug has not been reported yet.
- I am using the latest version of axolotl.
- I have provided enough information for the maintainers to reproduce and diagnose the issue.