
Hyperparameter #1

Closed
shrimonmuke0202 opened this issue Jun 20, 2023 · 7 comments

Labels
good first issue Good for newcomers

Comments

@shrimonmuke0202

Hi,
What are the hyperparameters used for chemical named entity recognition?

@ZJU-Fangyin
Collaborator

Hello,

Thank you for your interest in our work. The hyperparameters used for the chemical named entity recognition task can be found in Table 6 of the Appendix of our paper. Please note that these hyperparameters are provided for reference and have not been tuned for optimal performance, so they may not yield the best possible results.

Feel free to ask if you have further questions.

@shrimonmuke0202
Author

Thanks for your answer. I went through your paper but could not find the number of epochs for which the model is fine-tuned, nor where I should pass the number of steps in the fine-tuning training arguments. Also, for molecular property prediction, you replace the rings and branches in the SMILES; how do you do this in your preprocessing stage?

@ZJU-Fangyin
Collaborator

The number of epochs used for fine-tuning the model is 20, and the conversion from SMILES to SELFIES is performed using the package provided by the project found at https://github.com/aspuru-guzik-group/selfies.

You do not need to input a specific number of steps. The model should handle this automatically based on the number of epochs and the size of your dataset.
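As a sketch of how the step count falls out of the epoch count and dataset size (only epochs=20 comes from the reply above; the dataset and batch sizes are hypothetical):

```python
# Hypothetical numbers; only epochs = 20 comes from the maintainers' reply.
dataset_size = 10_000
batch_size = 32
epochs = 20

# Trainer-style APIs derive the optimizer step count like this:
steps_per_epoch = -(-dataset_size // batch_size)  # ceiling division: 313
total_steps = steps_per_epoch * epochs            # 313 * 20 = 6260
print(total_steps)
```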

@shrimonmuke0202
Author

Thanks,

I fine-tuned the model on my custom NER dataset using the same prompt template and instructions as the chemical entity recognition task in your work. However, when I attempted to generate output with your "generate.py" script, it produced the same output repeatedly. Could you please explain why this is happening?

@ZJU-Fangyin
Collaborator

Hi,

Your issue is fairly common when using LLaMA. You may be able to work around it by adjusting some generation hyperparameters, such as:

  • Setting num_beams=1 to avoid beam search.
  • Setting do_sample=True to enable sampling.
  • Lowering top_k or top_p based on your specific task.
  • Lowering temperature.
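As a sketch, the suggestions above map onto keyword arguments of Hugging Face's model.generate(); the specific values below are illustrative assumptions, not recommendations from the authors.

```python
# Illustrative sampling settings for transformers' model.generate();
# the numeric values are hypothetical and should be tuned per task.
generation_kwargs = {
    "num_beams": 1,      # disable beam search
    "do_sample": True,   # enable sampling instead of greedy/beam decoding
    "top_k": 40,         # hypothetical value
    "top_p": 0.9,        # hypothetical value
    "temperature": 0.7,  # hypothetical value; lower -> less repetition variance
}

# Usage (sketch): output_ids = model.generate(input_ids, **generation_kwargs)
print(generation_kwargs)
```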

In addition, you can refer to the original alpaca-lora project for more insights or implementation details.

@shrimonmuke0202
Author

Thanks,
I have fine-tuned LLaMA using your released molecular property prediction dataset. After fine-tuning, when I attempted to predict molecular properties for a new molecule, the model consistently produced the same values for the highest occupied molecular orbital (HOMO), the lowest unoccupied molecular orbital (LUMO), and the HOMO-LUMO gap.

@ZJU-Fangyin
Collaborator

  1. Did you adjust the hyperparameters as suggested in our previous discussions? Maybe you can refer to this issue.
  2. Have you tried the molecules in your training set? If the problem persists, it could indicate that the pre-training process has not converged properly or encountered errors.
  3. I would recommend that you use our provided checkpoint directly, or continue fine-tuning based on our checkpoint.

@zxlzr zxlzr closed this as completed Jul 3, 2023
@zxlzr zxlzr added the good first issue Good for newcomers label Jul 3, 2023