Esperanto Support #62

Open
Fierthraix opened this issue Feb 25, 2020 · 0 comments


Hello, I would like to add support for the Esperanto language, as several downstream projects I use depend on lunr-languages. Esperanto is a constructed language created in 1887 by Dr. L. L. Zamenhof, and has an estimated 2 million speakers worldwide.

Fortunately, due to the extreme regularity of the language (its core grammar has only 16 rules), implementing this should be a lot easier than for other languages.

Advice Needed:

I don't normally work with JavaScript, so I was wondering if anyone involved with the project can help me out with a few things:

  • Does the stop-words function run before the stemmer? It would greatly reduce the burden if stop-words were filtered out before they reach the stemmer. Otherwise, I will basically have to reimplement the stop-words list inside the stemmer, as most of the stop-words are prepositions and other grammatical words that have irregular endings.

  • Many other languages have very complicated hundred-line stemmer functions, but in Esperanto, once you filter out the special grammatical words, every word ends with one of: -is, -as, -os, -us, -u, -e, -en, -a, -an, -aj, -ajn, -o, -on, -oj, or -ojn. With that said, my stemmer function can be as simple as returning the string with the ending cut off (this always results in a valid word root). I wasn't sure if I needed to use the SnowballFunction or not.
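
To make the two points above concrete, here is a minimal sketch of the idea in plain JavaScript. It assumes stop-words are removed before stemming, and it does not use the lunr or lunr-languages API at all: `ENDINGS`, `STOP_WORDS`, `stemEsperanto`, and `processTokens` are hypothetical names chosen for illustration. The endings are taken from the list above (any ending not listed there, such as the infinitive -i, is not handled), and the stop-word set is a tiny sample rather than a real list.

```javascript
// Hypothetical sketch, not the lunr-languages API.
// Endings from the list above, longest first, so that "-ojn" is tried
// before "-on" and "-ajn" before "-an".
const ENDINGS = [
  "ojn", "ajn",
  "oj", "aj", "on", "an", "en", "is", "as", "os", "us",
  "o", "a", "e", "u",
];

// Illustrative sample only; a real stop-word list would be much longer.
const STOP_WORDS = new Set(["la", "kaj", "en", "de", "al", "sed"]);

// Strip the first matching grammatical ending and return the root.
function stemEsperanto(word) {
  const lower = word.toLowerCase();
  for (const ending of ENDINGS) {
    // Keep at least one character of root.
    if (lower.length > ending.length && lower.endsWith(ending)) {
      return lower.slice(0, -ending.length);
    }
  }
  return lower; // no known ending: leave the token unchanged
}

// Assumed order: stop-word filtering first, then stemming.
function processTokens(tokens) {
  return tokens
    .filter((token) => !STOP_WORDS.has(token.toLowerCase()))
    .map(stemEsperanto);
}

console.log(processTokens(["la", "hundoj", "kuras"]));
```

Because the stop-words are dropped first, short grammatical words such as "la" never reach `stemEsperanto`, which is the reason the stemmer can stay this small under the stated assumption.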

I'm currently working on Esperanto support on my fork, if anyone has any advice or wants to point out any obvious JS flaws I missed.
