Tokenization mismatch #75
I believe so. Here is my pretrained
There was a bug where HF Llama-3 wouldn't prepend the BOS token. Please check whether your model weights are up-to-date.
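As a minimal offline sanity check for the BOS bug mentioned above: in practice the ids would come from `tokenizer("...").input_ids` with a Hugging Face tokenizer, and the id `128000` is assumed here to be Llama-3's `<|begin_of_text|>`; both details are assumptions, not taken from this thread.

```python
BOS_TOKEN_ID = 128000  # assumed id of Llama-3's <|begin_of_text|> token

def starts_with_bos(input_ids, bos_token_id=BOS_TOKEN_ID):
    """Return True if an encoded sequence begins with the BOS token id."""
    return len(input_ids) > 0 and input_ids[0] == bos_token_id

# With the buggy tokenizer, encodings came back without the leading BOS:
print(starts_with_bos([128000, 9906, 1917]))  # True: BOS present
print(starts_with_bos([9906, 1917]))          # False: BOS missing
```

Running this check against a freshly loaded tokenizer is a quick way to tell whether the cached weights predate the fix.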
Still didn't fix it. I have deleted the cached weights and checked that the
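Deleting the cached weights, as described above, can be done by removing the model's snapshot directory. This sketch assumes the default Hugging Face cache layout (`~/.cache/huggingface/hub/models--<org>--<name>`, overridable via `HF_HOME`); the destructive call is left commented out.

```python
import os
import shutil

repo_id = "meta-llama/Meta-Llama-3-8B-Instruct"

# Assumption: default Hugging Face cache layout.
hf_home = os.environ.get("HF_HOME", os.path.expanduser("~/.cache/huggingface"))
snapshot_dir = os.path.join(hf_home, "hub", "models--" + repo_id.replace("/", "--"))

print(snapshot_dir)
# shutil.rmtree(snapshot_dir, ignore_errors=True)  # uncomment to purge and force a re-download
```

After purging, the next `from_pretrained` call re-downloads the model with the current config and tokenizer files.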
We noticed that Llama-3 changed
In theory, can I also use this model? https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat
I received an error while training again.
Meta-Llama-3-8B-Instruct gives the same error.
Try editing here rather than the configuration of the base model.
I changed it to this; training is OK.
merged sh
I edited the config as you said. The training was fine, but during inference I got
for every sample. The output tokens look like this:
@Gary2018X It seems unrelated to Bunny. Please try Googling it.
@swhoosh What about the loss curve?
I trained for only 20 steps just to test it out. The loss seemed fine when I trained full epochs yesterday, when I edited the model's config instead of Bunny's as you recommended, but those runs still had the same problem. FYI, I was able to get the expected result from
Maybe there is a huge gap between medical images/knowledge and regular images/knowledge.
Well, Phi-2 did actually work during our testing, and I was able to get Llama-3 to work before the recent config update. Can you try reproducing the finetuning result on your end to ensure the model is behaving correctly?
After applying this change, my problem was resolved and things worked normally.
@Gary2018X Are you able to run inference? My inference still produces the same result as
I have checked my Llama-3 version and I use the latest dev branch.
It may be related to your base model.
Although I can infer normally, the result is not as good as Qwen-1.8B yet.
@swhoosh @Gary2018X We will keep using
Closing the issue for now since there's no further discussion. Feel free to reopen it if there are any other questions.
I tried finetuning my model after stage 1. Apparently, there are tokenization mismatches and the loss is 0.
Do you have any idea what the problem might be?
Thanks!
sh finetune_full.sh
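A rough sketch of why a tokenization mismatch shows up as a loss of 0, assuming Bunny follows the common LLaVA-style preprocessing (an assumption, not confirmed in this thread): when the re-tokenized conversation length disagrees with the expected length, the sample's labels are all set to the ignore index, so no token contributes to the loss.

```python
IGNORE_INDEX = -100  # labels with this value are skipped by the cross-entropy loss

def mask_on_mismatch(labels, expected_len, actual_len):
    """Mask every label when tokenized lengths disagree, so the sample
    contributes no supervised tokens (which is why the loss prints as 0)."""
    if actual_len != expected_len:
        return [IGNORE_INDEX] * len(labels)
    return list(labels)

print(mask_on_mismatch([5, 6, 7], expected_len=3, actual_len=3))  # [5, 6, 7]
print(mask_on_mismatch([5, 6, 7], expected_len=3, actual_len=2))  # [-100, -100, -100]
```

This is why a missing BOS token (which shifts every length by one) can silently zero out the training signal for every sample rather than raising an error.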