Add support for analyzing evaluators with custom cross-annotations #281
Firstly, thanks a lot for creating and sharing the AlpacaEval package! I am finding it very useful (and well documented).
This PR fixes a small bug that occurs when analyzing evaluators on a custom (new) cross-annotation dataset. I found that the `main.analyze_evaluators` function does not support this use case yet. In particular, the `alpaca_eval.analyze.Analyzer` class assumes that the default cross-annotation dataset is being used when computing the correlations. As the `generator` column is not present in that dataset, it is extracted/matched from the main annotation dataset (referring to this line). This matching fails if you use a different cross-annotation dataset. Thus, I updated the if-statement to run only if no `generator` column is present. If the `generator` column is present in the custom cross-annotation dataset (as below), the matching code is skipped and no error is thrown.

Let me know if this fix would be helpful to add.
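To illustrate the guard described above, here is a minimal, self-contained sketch in plain Python. It is not the actual `Analyzer` code: the helper name `ensure_generator`, the row layout (lists of dicts), and the matching key (`instruction`, `output_2`) are all assumptions made for illustration only.

```python
def ensure_generator(cross_rows, main_rows, key=("instruction", "output_2")):
    """Return cross-annotation rows that all carry a 'generator' field.

    Hypothetical helper: names and matching key are assumptions,
    not taken from the alpaca_eval source.
    """
    # Guard: a custom dataset that already names its generators is left untouched.
    if cross_rows and all("generator" in row for row in cross_rows):
        return cross_rows

    # Otherwise fall back to matching against the main annotation dataset.
    lookup = {tuple(r[k] for k in key): r["generator"] for r in main_rows}
    filled = []
    for row in cross_rows:
        match = tuple(row[k] for k in key)
        if match not in lookup:
            # This is the failure mode a custom dataset would hit without the guard.
            raise KeyError(f"no main annotation matches {match!r}")
        filled.append({**row, "generator": lookup[match]})
    return filled


main = [{"instruction": "Say hi", "output_2": "Hi!", "generator": "model_a"}]
custom = [{"instruction": "New prompt", "output_2": "New answer", "generator": "my_model"}]
default_like = [{"instruction": "Say hi", "output_2": "Hi!"}]

print(ensure_generator(custom, main)[0]["generator"])        # my_model
print(ensure_generator(default_like, main)[0]["generator"])  # model_a
```

Without the guard, the `custom` rows would go through the lookup and raise, since their instructions do not appear in the main annotations.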
Reproducing use-case
Simple code to test the default annotator on a custom cross-annotation dataset:
It can be run with the following (toy) `test_custom_crossannotations.json` file: