Odd bug in results sorting #74
Comments
Firstly, thanks for a great bug report. Having a test case like the one you have put together makes it so much easier to diagnose the issue. I'll try to describe what is happening here; hopefully it makes sense! When you search, lunr uses TF-IDF to rank how similar a document and a search term are. The IDF part of this, inverse document frequency, penalises tokens that are common in the corpus (the total collection of documents). In the case of your index the token There are measures in place to try to ensure that exact matches get a score boost, but the boost isn't large enough in your use case. This is an issue that has cropped up before, and I think in your case it may be amplified by the small size of the documents. As for a solution, at the moment I'm not sure. There are a couple of issues like this that have prompted me to take a closer look at how scoring/ranking of search results is calculated. I don't have anything definitive yet, but these are problems I'd like to solve. A potential work-around for you is to disable the IDF calculation. This can be achieved fairly simply (though via
I'll definitely keep this in mind though for upcoming releases, perhaps a simpler way to disable IDF checking, I'll have a think. |
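To make the IDF effect described above concrete, here is a minimal sketch in plain JavaScript. It does not use lunr, and the corpus, the `idf` helper and the `1 + log(N / df)` formula are assumptions chosen for illustration (a textbook form of IDF, not necessarily what lunr computes internally). It only shows why a token that appears in many documents ends up with a lower weight than rarer tokens sharing the same prefix.

```javascript
// Toy corpus loosely modelled on the food names in this issue (made-up data).
const docs = [
  "bread",
  "bread white",
  "bread wheat",
  "bread rye",
  "breadfruit raw",
  "seafood breader",
];

// Document frequency: in how many documents does each token appear?
const documentFrequency = new Map();
for (const doc of docs) {
  for (const token of new Set(doc.split(" "))) {
    documentFrequency.set(token, (documentFrequency.get(token) || 0) + 1);
  }
}

// Textbook-style IDF (an assumption; lunr's exact formula may differ).
function idf(token) {
  const df = documentFrequency.get(token) || 0;
  return df === 0 ? 0 : 1 + Math.log(docs.length / df);
}

// Tokens that a prefix search for "bread" would expand to.
const expanded = [...documentFrequency.keys()].filter((t) =>
  t.startsWith("bread")
);

// Highest IDF first: the common token "bread" sinks to the bottom.
const weights = expanded
  .map((token) => ({ token, idf: idf(token) }))
  .sort((a, b) => b.idf - a.idf);

console.log(weights);
```

In this toy data "bread" occurs in four of six documents while "breader" and "breadfruit" occur in one each, so the rarer tokens carry more weight unless an exact-match boost outweighs the difference.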
+1 for highly educational discourse |
Thanks for this! Perhaps there might be a way to intelligently guess after indexing the data what the shape of the data is and enable or disable the IDF calculation accordingly as well as creating some kind of option to turn it on and off. I’ll have a try with your monkeyfreedompatch when I get back to the office tomorrow :) Very impressed with Lunr so far! |
The latest version of Lunr (v2) no longer automatically inserts wildcards at the end of queries. A search for "bread" will not return any results for "seafood breader" or "breadfruit". Wildcards are still supported, but must be explicit. To re-create the behaviour in this issue you would have to search for "bread*". So, it only took me 37 months to fix this issue, not bad! |
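The difference between an exact term and an explicit trailing wildcard can be sketched in a few lines. The `matchTokens` helper and the token list below are hypothetical and are not part of lunr; the sketch also ignores stemming and scoring, and only illustrates which indexed tokens each style of query would consider.

```javascript
// Hypothetical helper, not lunr's implementation: expand one query term
// against a list of indexed tokens. A trailing "*" means prefix match;
// anything else must match a token exactly.
function matchTokens(term, tokens) {
  if (term.endsWith("*")) {
    const prefix = term.slice(0, -1);
    return tokens.filter((token) => token.startsWith(prefix));
  }
  return tokens.filter((token) => token === term);
}

// Made-up tokens taken from the item names mentioned in this issue.
const indexedTokens = ["bread", "breader", "breadfruit", "seafood"];

const exactMatches = matchTokens("bread", indexedTokens);
const wildcardMatches = matchTokens("bread*", indexedTokens);

console.log(exactMatches);
console.log(wildcardMatches);
```

With the exact term only "bread" is considered, so "seafood breader" and "breadfruit" can no longer outrank it; the old behaviour needs the explicit "bread*" form.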
Hi,
I've got about 9000 food items in an array. I'm wanting to use lunr to match results and order them. So far so good.
Having tried it in Node, I'm getting an error. I thought I'd try it using the front end and I get the same error. Namely, searching for "bread" brings back "seafood breader" first, then "breadfruit" and then finally "bread". I'd expect "bread" to be first...
I've uploaded my test case including the data here: http://03sq.net/lunr-test/ as I'm not sure if I've done something obviously wrong or if this is a bug or how to debug it :)