
Generalizing the chat_template prompt strategy #1660

Merged
merged 1 commit into axolotl-ai-cloud:main on May 28, 2024

Conversation

fozziethebeat (Contributor)

Description

The strategy now supports configuring several fields:

  • the data field holding message arrays
  • the role and content fields for each message
  • role mapping from source to target types

Additionally, this adds a sample llama3-8b instruct template using the chat template.
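
For illustration, here is a sketch of how these options could remap a ShareGPT-style dataset (from/value message fields, human/gpt roles). The dataset path and the exact shape of the roles mapping are assumptions for illustration, not taken from this PR:

datasets:
  - path: your/sharegpt_style_dataset    # hypothetical dataset path
    type: chat_template
    chat_template: llama3
    field_messages: conversations        # field holding the message arrays
    message_field_role: from             # per-message role field
    message_field_content: value         # per-message content field
    roles:                               # assumed shape for source-to-target role mapping
      user: ["human"]
      assistant: ["gpt"]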

Fixes #1654

Motivation and Context

#1654

How has this been tested?

Tested via

pytest --ignore tests/e2e

It was further tested by running

python -m axolotl.cli.preprocess examples/llama-3/instruct-lora-8b.yml

and manually inspecting the emitted sample tokens.
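
As a reproduction note (not part of the PR): axolotl's preprocess CLI also accepts a --debug flag that prints the tokenized samples, which is one way to perform this manual inspection, assuming the flag is available in your version:

python -m axolotl.cli.preprocess examples/llama-3/instruct-lora-8b.yml --debug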

Screenshots (if appropriate)

[Screenshot attached, dated 2024-05-27]

Types of changes

  • Generalizing input data configuration options

Social Handles (Optional)

fozziethebeat

winglian (Collaborator) left a comment


Very much needed. Thank you again!

Comment on lines +9 to +16
chat_template: llama3
datasets:
  - path: fozziethebeat/alpaca_messages_2k_test
    type: chat_template
    chat_template: llama3
    field_messages: messages
    message_field_role: role
    message_field_content: content

hammoudhasan commented on May 28, 2024

Do we really need to assign the chat_template twice, inside the dataset entry and outside? I'm testing this PR. Is there any difference between the two chat_template settings?

I feel that passing type: chat_template and the related field keys already specifies how to load the data; the value of chat_template should be the template used for tokenization during training.

winglian (Collaborator)

We already have some handling when using this with sharegpt and chatml, so I've updated that to handle it automatically for the general case here: #1664
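
Presumably, once #1664 lands, the dataset-level chat_template could be omitted and inherited from the top-level setting; a hedged sketch of that intent (the exact inheritance behavior depends on #1664):

chat_template: llama3
datasets:
  - path: fozziethebeat/alpaca_messages_2k_test
    type: chat_template
    # chat_template omitted here; assumed to be inherited from the top-level setting
    field_messages: messages
    message_field_role: role
    message_field_content: content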

winglian merged commit cc11c6b into axolotl-ai-cloud:main on May 28, 2024.
7 checks passed
fozziethebeat (Contributor, Author)

Great! Thanks for merging!

Closes: Generalize the chat_template prompt strategy with more configuration options (#1654)