Alternate hosting for hedwig-data #61
Comments
I can check data directly into that repo for you. I think ~5 GB is fine... Point me to what you want checked in.
I just added the pre-trained BERT weights to that repo, but it's pretty slow. For instance, running
Does this use hgf? Why not do the same as here? https://huggingface.co/castorini
It does use hgf. I guess this is already in the pipeline (#56), but I was looking for something more immediate.
Now that you've checked it in,
Right now I am looking to eliminate these extra steps:
I was trying to get hedwig running and then realized that I need gensim to run
Fixed in #62.
The dataset repo https://git.uwaterloo.ca/jimmylin/hedwig-data isn't ideal: it requires the user to extract the embeddings and process them with a Python script. Is there an alternate way to host ~5 GB of data that would make it easier for others to replicate our results out of the box?