-
-
Notifications
You must be signed in to change notification settings - Fork 780
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Keyword "assistant" Error with ShareGPT Datasets #752
Comments
I checked that linked dataset. It seems to be human/gpt. In case you want to swap keys, please check this. It swaps roles using a simple dictionary map. |
@NanoCode012 I said "some "from" values are "assistant" rather than "gpt"". You checked only some data from that dataset, not all. |
Oh I see! You can use this kind of role_map then. role_map = {"assistant": "gpt", "human": "human", "gpt": "gpt"} |
@NanoCode012 Thanks! I'm not sure I'm doing it the way you mentioned. I updated the role_map, but still get the error. Please see the two screenshots. |
I fixed this by updating SimpleShareGPTPromptTokenizingStrategy with the role_map you provided. |
🔖 Feature description
On ShareGPT datasets such as shibing624/sharegpt_gpt4, some "from" values are "assistant" rather than "gpt", so, the program raises the keyword error.
✔️ Solution
Maybe take both assistant and gpt as the machine's responses?
❓ Alternatives
No response
📝 Additional Context
No response
Acknowledgements
The text was updated successfully, but these errors were encountered: