Strange behaviour with word "general" #377
Comments
The same occurs with "introduction" ... it works until the "T" in "introducTion". Am I missing a basic concept?
@tknuth When you say you changed "general" to "generol", is this in the document, or in the query? The reason for this comes down to the interaction between wildcards and stemming - judging from a little test I ran, the stemmer removes the

Let's back up a bit and talk about things at a higher level - what are you trying to use lunr to do? It might be possible with some tweaking to the index builder, or it could be that lunr isn't a good fit for your use case. Let's have a discussion to see if lunr works for you!
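The wildcard/stemming interaction mentioned above can be illustrated with a toy model. This is not lunr's code: `toyStem` is a hypothetical stand-in for a real stemmer, and the assumption that it turns "general" into "gener" is inferred from the behaviour reported in this issue. The sketch also assumes that a query containing a wildcard is matched as a raw prefix without being stemmed.

```javascript
// Toy model only: this is NOT lunr's implementation.
// toyStem is a hypothetical stand-in for a real stemmer; it strips a
// trailing "al" so that "general" is stored as "gener" (an assumption
// consistent with the behaviour reported in this issue).
function toyStem(word) {
  return word.endsWith('al') ? word.slice(0, -2) : word;
}

// The index stores stemmed tokens, not the original words.
const indexedTokens = ['general', 'remarks'].map(toyStem);

// A trailing-wildcard query is matched as a raw prefix; the query text
// is assumed not to be stemmed when it contains a wildcard.
function prefixSearch(prefix) {
  return indexedTokens.filter((token) => token.startsWith(prefix));
}

console.log(prefixSearch('gener'));   // matches the stored token "gener"
console.log(prefixSearch('genera'));  // no match: "gener" does not start with "genera"
console.log(prefixSearch('general')); // no match for the same reason
```

Under these assumptions, typing past the stem ("genera", "general") asks for a prefix that is longer than anything stored in the index, which would explain why the results disappear.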
@hoelzro Thank you so much for your effort! So it was just a lack of understanding of how lunr works internally. I removed the stemmer:
Now lunr works as expected, which makes me happy. However, there is obviously value in what the stemmer does, so ideally I would like to match words exactly in addition to lunr's standard behaviour. In other words, I could create two lunr search indices, one with a stemmer and one without, and then merge the result lists into one. Would that have negative side effects besides reduced performance?

Regarding your higher-level question: I would like to use lunr for search on a web page, and the problem arose when I wanted to allow the user to search for parts of the table of contents as well. I know that some of my users like copying and pasting terms, so it is crucial that searching for complete words works. That's why the issue came up. To sum up:
Thank you so much for your time and effort. I appreciate that!
@tknuth I think your approach of having two separate indices would work, but as you point out, you could run into performance issues. Another potential issue is the ranking of search results - I'm not exactly sure how that would shake out when merging result lists. @olivernn and I were having a discussion about this issue on one of my projects using lunr here - I plan on experimenting with different approaches in the near future. I'll follow up here if I feel like I find a good solution!
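The two-index idea discussed above could be sketched roughly as follows. This is only one possible merge strategy, not something proposed in the thread: results are assumed to have the `{ ref, score }` shape that lunr's `search` returns, the sample data is made up, and keeping the higher score per `ref` is an arbitrary choice. Scores from two different indices are not necessarily comparable, which is exactly the ranking concern raised above.

```javascript
// Merge two result lists shaped like lunr search results ({ ref, score }).
// When a ref appears in both lists, keep the entry with the higher score.
// Caveat: scores from two different indices are not necessarily comparable.
function mergeResults(stemmedResults, exactResults) {
  const best = new Map();
  for (const result of [...stemmedResults, ...exactResults]) {
    const seen = best.get(result.ref);
    if (seen === undefined || result.score > seen.score) {
      best.set(result.ref, result);
    }
  }
  // Highest score first.
  return [...best.values()].sort((a, b) => b.score - a.score);
}

// Hypothetical sample data standing in for the output of two searches.
const fromStemmed = [{ ref: 'a', score: 0.4 }, { ref: 'b', score: 0.2 }];
const fromExact = [{ ref: 'b', score: 0.9 }];

const merged = mergeResults(fromStemmed, fromExact);
console.log(merged.map((result) => result.ref));
```

A different policy, such as always listing exact matches first, would only change the comparison inside the loop and the final sort.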
I've seen this error as well, and the solution that is often mentioned is to remove the stemmer.
This will result in the search terms being stemmed, so the query will look for "bill" as well as billed*. If you indexed "billing", it will work for "billing", "bill", "billed", etc.
I am going a little crazy because I do not understand the following behaviour. I have a couple of objects like these:
When I search for "general", I get results only as long as I type up to "gener". Typing "genera" or "general" removes all matching results.
The word "general" is not part of the stopwordfilter.js, as far as I understand. So what is causing this? Changing it to "generol" works just fine. It seems to be this specific phrase, but I could not find any hint in the repo.