Adding support for Arabic benchmarks : AlGhafa benchmarking suite #95

Merged: 29 commits into huggingface:main on Mar 27, 2024

alielfilali01
Contributor

This PR adds the AlGhafa benchmarking suite, which consists of 11 datasets presented in this paper and hosted in this repo on the Hub.

@clefourrier
Member

Do you want us to wait for Alghafa 2 to merge this?

@alielfilali01
Contributor Author

Do you want us to wait for Alghafa 2 to merge this?

Yes please @clefourrier, I will take some time before Saturday to add the new version of the benchmark.

@clefourrier
Member

No hurry, take your time!

@alielfilali01 alielfilali01 marked this pull request as draft March 6, 2024 20:54
@alielfilali01 alielfilali01 marked this pull request as ready for review March 8, 2024 18:35
@alielfilali01
Contributor Author

Hello @clefourrier, I believe this PR is ready to be merged.

alielfilali01 and others added 12 commits March 11, 2024 23:49
Add Support for the AlGhafa benchmarking suite
Adding support to the AlGhafa benchmarking suite
remove translated from AlGhafa
This file now contains all the arabic tasks including tasks not present in OALL_tasks.txt
Add support for ALGHAFA TRANSLATED  tasks
Add support to AlGhafa Translated benchmark suite (11 subsets)
minor fixes flagged by the pre-commit hook
forgot to remove 
`community|Alghafa:multiple_choice_copa_translated_task|5|1`
& `community|Alghafa:multiple_choice_openbookqa_translated_task|5|1` from ALGHAFA NATIVE
forgot to remove 
`community|Alghafa:multiple_choice_copa_translated_task|5|1`
& `community|Alghafa:multiple_choice_openbookqa_translated_task|5|1` from ALGHAFA NATIVE
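The task strings in the commit messages above follow a pipe-separated pattern. As a minimal sketch, they could be split into fields as shown below. The field meanings (suite, task name, number of few-shot examples, and a 0/1 flag for reducing few-shots when the prompt is too long) are an assumption on my part and are not stated in this thread; `TaskSpec` and `parse_task_spec` are hypothetical helper names, not part of lighteval.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskSpec:
    """Hypothetical container for one pipe-separated task string."""
    suite: str
    task: str
    num_fewshot: int
    truncate_fewshot: bool


def parse_task_spec(spec: str) -> TaskSpec:
    """Split 'suite|task|num_fewshot|flag' into typed fields.

    The interpretation of the four fields is an assumption, not taken
    from the pull request itself.
    """
    parts = spec.split("|")
    if len(parts) != 4:
        raise ValueError(f"expected 4 pipe-separated fields, got {len(parts)}: {spec!r}")
    suite, task, fewshot, flag = parts
    if flag not in ("0", "1"):
        raise ValueError(f"last field must be 0 or 1, got {flag!r}")
    return TaskSpec(suite, task, int(fewshot), flag == "1")


spec = parse_task_spec("community|Alghafa:multiple_choice_copa_translated_task|5|1")
print(spec.suite, spec.task, spec.num_fewshot, spec.truncate_fewshot)
```

Note that the task name keeps its `Alghafa:` prefix here, which is the casing the later review comments ask to make lowercase.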
Member

@clefourrier clefourrier left a comment

LGTM, but you need to homogenize your naming:

  • Prompt names such as boolq_function will be unclear in the long term. For such functions, you could use either boolq_prompt_arabic or just boolq_arabic. (You need to specify the language since there is already a default boolq prompt function.)
  • You also need to homogenize Alghafa, which appears with several different casings, and fit it to Python-style casing. For the prompt function, I'd use alghafa_prompt or alghafa; for the class, CustomAlGhafaTask; and for the task name here, I'd keep it lowercase:
    [CustomAlGhafaTask(name=f"alghafa:{subset}", hf_subset=subset) for subset in ALGHAFA_SUBSETS]
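To make the requested naming concrete, here is a small self-contained sketch of the three conventions from the review: a CamelCase class, a snake_case prompt function name, and lowercase task names. It does not import lighteval; `TaskConfigStandIn` is a hypothetical stand-in for `LightevalTaskConfig`, its fields are assumptions, and the two subset names are borrowed from the commit messages purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class TaskConfigStandIn:
    """Hypothetical stand-in for lighteval's LightevalTaskConfig (fields assumed)."""
    name: str
    hf_subset: str
    prompt_function: str = "alghafa_prompt"  # snake_case prompt function name


class CustomAlGhafaTask(TaskConfigStandIn):
    """CamelCase class name, as requested in the review."""


# Illustrative subset names only; the real list lives in community_tasks/arabic_evals.py.
ALGHAFA_SUBSETS = [
    "multiple_choice_copa_translated_task",
    "multiple_choice_openbookqa_translated_task",
]

# Lowercase "alghafa:" prefix for every task name.
ALGHAFA_TASKS = [
    CustomAlGhafaTask(name=f"alghafa:{subset}", hf_subset=subset)
    for subset in ALGHAFA_SUBSETS
]

for task in ALGHAFA_TASKS:
    print(task.name, task.prompt_function)
```

With a single casing throughout, a task is always addressed as `alghafa:<subset>`, so there is no need to remember which variant a given file used.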

Review threads (all outdated and resolved):
community_tasks/arabic_evals.py (4 threads)
auto_commit_fixes.sh (1 thread)
alielfilali01 and others added 8 commits March 12, 2024 10:00
homogenize naming according to the following comments :

####
Prompt names such as boolq_function will be unclear long term. For such functions, you could either use boolq_prompt_arabic or just boolq_arabic. (You need to specify the language since there is already a boolq prompt function by default.)

You also need to homogenize Alghafa, which exists with several different casings, and fit it to Python style casing. For the prompt function, I'd keep it as alghafa_prompt or alghafa, for the class, CustomAlGhafaTask, and here for the name I'd keep it lower case
[CustomAlGhafaTask(name=f"alghafa:{subset}", hf_subset=subset) for subset in ALGHAFA_SUBSETS]
####
homogenize AlGhafa naming : `Alghafa` to `alghafa`
homogenize AlGhafa naming : `Alghafa` to `alghafa`
Co-authored-by: Clémentine Fourrier <22726840+clefourrier@users.noreply.github.com>
Co-authored-by: Clémentine Fourrier <22726840+clefourrier@users.noreply.github.com>
Co-authored-by: Clémentine Fourrier <22726840+clefourrier@users.noreply.github.com>
Co-authored-by: Clémentine Fourrier <22726840+clefourrier@users.noreply.github.com>
Member

@clefourrier clefourrier left a comment

Hi. This needs a few more changes; I tried to make the requests clearer.
I also added comments about task-level instructions that I had missed previously.

alielfilali01 and others added 4 commits March 14, 2024 13:22
use the standard camel casing for classes:

(remove) class CustomALGHAFATask(LightevalTaskConfig):

(add) class CustomAlGhafaTask(LightevalTaskConfig):

Co-authored-by: Clémentine Fourrier <22726840+clefourrier@users.noreply.github.com>
Fixes based on Clementine's comments
@alielfilali01
Contributor Author

@clefourrier I hope this addresses your comments. Please feel free to ping me if I missed anything (I have a tendency to forget 😅).
Again, thanks a lot for your efforts 🤗

@clefourrier
Member

Looks better, thank you!
Do you have some reference models and scores against which I could check the implementation?
Or did you check it, and against which models? :)

@alielfilali01
Contributor Author

Looks better thank you! Do you have some reference models and scores against which I could check the implementation? Or did you check it, and against which models? :)

Yes @clefourrier, I tested gpt2 using --max_samples=1 and everything was fine. I believe Hamza is working on testing bigger models and pushing the results to the Hub for further inspection. I'll update you as soon as I hear back from Hamza.

@clefourrier
Member

Sounds good, feel free to ping me whenever :)

@clefourrier clefourrier self-requested a review March 27, 2024 06:20
Member

@clefourrier clefourrier left a comment

LGTM, thanks for the edits and tests!

@clefourrier clefourrier merged commit ef631cf into huggingface:main Mar 27, 2024
2 checks passed
@thevexx

thevexx commented Apr 12, 2024

The AlGhafa eval dataset is no longer available on Hugging Face. Are there any alternatives?

@alielfilali01
Contributor Author

AlGhafa eval dataset is no longer available on Huggingface, any alternatives ?

Hi there, can you please provide more context? I have checked the eval code and it seems to work fine.

@thevexx

thevexx commented Apr 13, 2024

Hi there, Can you plz provide more context ? I have checked the eval code and it seems it works fine

Hi, yesterday the datasets disappeared from the OALL Hugging Face account. Now I can see them again, thanks.

@alielfilali01
Contributor Author

Hi there, Can you plz provide more context ? I have checked the eval code and it seems it works fine

Hi, yesterday the datasets disappeared from the OALL Huggingface account, now i can see them, thanks

Ooh, I see. I had to make the datasets private for about 20 minutes yesterday because I was testing something. What a coincidence that you checked at the same time 😅
Sorry for the inconvenience 🤗

3 participants