[MRG] Load FastText models with specified encoding #1189

Merged 2 commits on Mar 7, 2017

jayantj (Contributor) commented Mar 7, 2017

Related to discussion in #1176

Allows the user to specify a custom encoding when loading a FastText model. The FastText model file does not contain any encoding information, and in practice FastText can create files with non-utf8 encodings.

If FastText later decides to enforce utf8 (as mentioned in its readme), we may want to remove this parameter.

Also updates the header in the fasttext wrapper file.
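
To illustrate why the encoding has to come from the caller, here is a minimal, self-contained sketch. It is not gensim's actual loading code: the helper name `read_word` is hypothetical, and it assumes vocabulary words are stored as NUL-terminated byte strings with no encoding marker, so the reader must be told how to decode them.

```python
import io


def read_word(stream, encoding="utf8"):
    """Read one NUL-terminated word from a binary stream and decode it.

    Hypothetical helper for illustration only. The byte layout
    (raw word bytes followed by a single 0x00) is an assumption.
    """
    raw = bytearray()
    while True:
        byte = stream.read(1)
        if byte in (b"", b"\x00"):  # end of stream or end of word
            break
        raw.extend(byte)
    return bytes(raw).decode(encoding)


# A word written with a non-utf8 encoding, as the description says can happen.
payload = "café".encode("latin-1") + b"\x00"

# Decoding succeeds only when the caller supplies the matching encoding.
word = read_word(io.BytesIO(payload), encoding="latin-1")

# With the default utf8 the same bytes are not valid input.
try:
    read_word(io.BytesIO(payload))
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True
```

Nothing in the bytes themselves says which decoding is right, which is why the choice is exposed as a parameter rather than detected automatically.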

tmylk merged commit 65e5bc3 into piskvorky:develop on Mar 7, 2017

tmylk (Contributor) commented Mar 7, 2017

Awesome, even with non-utf8 test!

pranaydeeps pushed a commit to pranaydeeps/gensim that referenced this pull request Mar 21, 2017
* fixes fasttext wrapper file header

* allows user specified encoding for loading fasttext models, corresponding tests