[Feature] Add reasonbench dataset #577
Conversation
Please provide the prompt for the generation setting (for chat models).
Also, please use pre-commit to lint the code.
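For reference, the usual pre-commit workflow (assuming the `.pre-commit-config.yaml` that OpenCompass ships at the repo root) is:

```bash
pip install pre-commit
pre-commit install          # run the hooks automatically on each commit
pre-commit run --all-files  # lint the whole tree once
```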
In the new commits, I provide prompts and configs to support generative inference, and merge datasets that are in the same category. After ensuring that OpenCompass is installed correctly and the datasets are prepared, we can evaluate the performance of the LLaMA-7b model on the ReasonBench datasets in a generative way using the following command:
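A plausible form of that command, assuming the generative configs are collected under the name `reasonbench_gen` (inferred by analogy with `reasonbench_ppl` in the Use cases below, not confirmed in the thread), is:

```bash
python run.py --models hf_llama_7b --datasets reasonbench_gen
```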
LGTM
* [Feature] Add reasonbench dataset
* add configs for supporting generative inference & merge datasets in the same category
* modify config filename to prompt version
* fix codes to meet pre-commit requirements
* lint the code to meet pre-commit requirements
* Align Load_data Sourcecode Briefly
* fix bugs
* reduce code redundancy
Motivation
Add a benchmark specifically designed to evaluate the reasoning ability of LLMs.
Modification
Add preprocessing methods and configs for several datasets, aimed at evaluating the abilities of inductive reasoning, deductive reasoning, abductive reasoning, causal reasoning, symbolic reasoning, and commonsense reasoning. A sketch of what such a preprocessing method looks like follows.
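For context, an OpenCompass dataset is typically a `BaseDataset` subclass registered with `LOAD_DATASET`, whose static `load` method returns a Hugging Face `Dataset`. Below is a minimal sketch of such a preprocessing method, assuming JSONL inputs; the class name and the `prompt`/`answer` field names are illustrative assumptions, not the PR's actual schema:

```python
import json

from datasets import Dataset

from opencompass.datasets.base import BaseDataset
from opencompass.registry import LOAD_DATASET


@LOAD_DATASET.register_module()
class ReasonBenchDatasetSketch(BaseDataset):
    """Illustrative loader: reads a JSONL file where each line holds one
    question and its gold label. Field names are assumptions."""

    @staticmethod
    def load(path: str) -> Dataset:
        rows = []
        with open(path, 'r', encoding='utf-8') as f:
            for line in f:
                item = json.loads(line)
                rows.append({
                    'prompt': item['prompt'],  # assumed field name
                    'answer': item['answer'],  # assumed field name
                })
        return Dataset.from_list(rows)
```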
BC-breaking (Optional)
N/A.
Use cases (Optional)
After ensuring that OpenCompass is installed correctly according to the previous steps and the datasets are prepared, you can evaluate the performance of the LLaMA-7b model on the ReasonBench datasets using the following command:
python run.py --models hf_llama_7b --datasets reasonbench_ppl
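Here the `_ppl` suffix selects the perplexity-based configs; the generative configs added in this PR would be selected with the corresponding `_gen` name (assumed to be `reasonbench_gen`, as noted above).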