You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Let users specify role and content fields for each message
Let the tokenizer chat template turn the conversation messages into the input prompt
Let the tokenizer chat template turn the rejected and chosen field content into the right completion prompt
Then boom, easy training on long conversation sequences.
✔️ Solution
Replicate the changes in #1660 with minor tweaks for DPO.
❓ Alternatives
This can all be done outside of axolotol with the user defined fields but it's a bit messy and risks getting things wrong with the tokenizer chat template.
📝 Additional Context
I can do the implementation
Acknowledgements
My issue title is concise, descriptive, and in title casing.
I have searched the existing issues to make sure this feature has not been requested yet.
I have provided enough information for the maintainers to understand and evaluate this request.
The text was updated successfully, but these errors were encountered:
Also, I have a have a fork of this working I just haven't gotten around to unit testing it. I have manually inspected the tokenization and am satisfied.
🔖 Feature description
This is basically #1660 but for DPO datasets:
rejected
andchosen
field content into the right completion promptThen boom, easy training on long conversation sequences.
✔️ Solution
Replicate the changes in #1660 with minor tweaks for DPO.
❓ Alternatives
This can all be done outside of axolotol with the user defined fields but it's a bit messy and risks getting things wrong with the tokenizer chat template.
📝 Additional Context
I can do the implementation
Acknowledgements
The text was updated successfully, but these errors were encountered: