Penalty rewards #52

Merged
merged 40 commits into from
Nov 2, 2023

p-ferreira
Contributor

@p-ferreira p-ferreira commented Oct 27, 2023

This PR proposes a penalty mechanism with:

- Penalty rewards: a new family of function models that act as penalty functions, each according to its own definition. This PR also introduces a task-criteria schema that enables criteria to be assigned to a given task through prompt definition and code validation.
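
To make the task-criteria idea concrete, here is a minimal sketch of how a criterion could both add text to a task prompt and validate completions in code. It is not the code in this PR: `LengthUnit`, `MatchLengthSketch`, `TaskSketch`, the naive splitting rules, and the relative-distance penalty are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class LengthUnit(Enum):
    # Hypothetical enum; the four units mirror those listed in this PR.
    CHARACTERS = "characters"
    WORDS = "words"
    SENTENCES = "sentences"
    PARAGRAPHS = "paragraphs"


def count_units(text: str, unit: LengthUnit) -> int:
    """Count how many `unit`s the text contains (naive splitting rules)."""
    if unit is LengthUnit.CHARACTERS:
        return len(text)
    if unit is LengthUnit.WORDS:
        return len(text.split())
    if unit is LengthUnit.SENTENCES:
        return len([s for s in re.split(r"[.!?]+", text) if s.strip()])
    # Paragraphs: blocks separated by at least one blank line.
    return len([p for p in re.split(r"\n\s*\n", text) if p.strip()])


@dataclass
class MatchLengthSketch:
    """Illustrative criterion: the completion should have `target` units."""
    target: int  # assumed to be > 0
    unit: LengthUnit

    def compose_text(self) -> str:
        # Text appended to the task prompt so the criterion is visible.
        return f"Your response must have {self.target} {self.unit.value}."

    def evaluate(self, completions: List[str]) -> List[float]:
        # Assumed penalty: relative distance from the target, capped at 1.
        return [
            min(abs(count_units(c, self.unit) - self.target) / self.target, 1.0)
            for c in completions
        ]


@dataclass
class TaskSketch:
    """Illustrative task: a base prompt plus any number of criteria."""
    base_text: str
    criteria: List[MatchLengthSketch] = field(default_factory=list)

    def compose_prompt(self) -> str:
        return "\n".join([self.base_text] + [c.compose_text() for c in self.criteria])
```

Keeping the prompt text and the validation logic on the same object means a new criterion can be attached to any task without touching the task itself.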

Tasks done so far:

  • add generic task and criteria schema that can be easily expanded
  • add SummaryTask, QuestionGenerationTask and QuestionAnswerTask
  • add MatchLengthCriteria with support for characters, words, sentences and paragraphs
  • adapt current workflow to include task validation
  • adapt current workflow to not concatenate previous answers and questions
  • add penalty functions
  • incorporate legacy task validator reward model functionality into new KeywordMatch penalty function
  • add SentenceMatch penalty function
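
As a rough illustration of the penalty functions in the list above, the sketch below flags completions that contain unwanted keywords and then scales the rewards down. The function names, the binary 0/1 penalty, and the multiplicative combination with the reward are assumptions; the PR text does not specify how penalties are applied.

```python
from typing import List


def keyword_match_penalty(completions: List[str], keywords: List[str]) -> List[float]:
    """Return 1.0 if a completion contains any keyword (case-insensitive), else 0.0."""
    lowered = [k.lower() for k in keywords]
    return [
        1.0 if any(k in c.lower() for k in lowered) else 0.0
        for c in completions
    ]


def apply_penalties(rewards: List[float], penalties: List[float]) -> List[float]:
    """Scale each reward by (1 - penalty), so a full penalty zeroes the reward.

    Assumption: penalties lie in [0, 1] and are applied multiplicatively.
    """
    return [r * (1.0 - p) for r, p in zip(rewards, penalties)]
```

A sentence-level variant could follow the same shape, comparing whole sentences of the completion against the prompt instead of searching for keywords.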

TODO (Once current design is approved):

  • reorganize and properly document the code generated
  • add unit tests for implemented functionality

Wandb run with penalty functions implemented:

@Unkownman086

With the new prompting style, responses receive much worse ratings from the RLHF model.
I tried with the same context and measured that average GPT-3.5 and Vicuna responses get a much worse rating from the RLHF model: about 30% worse when normalized.

I would think the RLHF model doesn't do well when the question is at the start of the prompt.

@p-ferreira p-ferreira changed the title Penalty rewards [WIP] Penalty rewards Nov 1, 2023
@p-ferreira p-ferreira marked this pull request as ready for review November 1, 2023 18:34
neurons/validators/validator.py (resolved)
prompting/validators/tasks.py (outdated, resolved)
prompting/validators/tasks.py (outdated, resolved)
prompting/validators/reward/keyword_penalty.py (outdated, resolved)
prompting/validators/reward/keyword_penalty.py (outdated, resolved)
prompting/validators/penalty/sentence_match.py (outdated, resolved)
prompting/validators/penalty/sentence_match.py (outdated, resolved)
prompting/validators/penalty/penalty.py (outdated, resolved)
prompting/validators/penalty/penalty.py (outdated, resolved)
Contributor

@steffencruz steffencruz left a comment

Please make penalty_scale_factor in prompting/validators/criteria.py a Gaussian, as I showed in the plot.
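
The plot is not reproduced in this thread, so the following is only one possible reading of a Gaussian scale factor: the penalty is 0 when the measured value hits the target and rises smoothly toward 1 as the deviation grows. The function name and the `sigma` parameter are hypothetical and are not taken from criteria.py.

```python
import math


def gaussian_penalty(value: float, target: float, sigma: float) -> float:
    """Penalty between 0 and 1: 0 at the target, approaching 1 far from it.

    Assumed form: 1 - exp(-(value - target)^2 / (2 * sigma^2)).
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (value - target) / sigma
    return 1.0 - math.exp(-0.5 * z * z)
```

Compared with a linear scale, this shape is forgiving for small deviations and saturates for large ones; `sigma` sets how quickly the penalty ramps up.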

prompting/validators/criteria.py (outdated, resolved)
@p-ferreira p-ferreira merged commit 3f21ca0 into staging Nov 2, 2023
4 checks passed
@p-ferreira p-ferreira mentioned this pull request Nov 2, 2023

3 participants