Trimmer is missing from search pipelines #532

Open
dhdaines opened this issue Jul 5, 2024 · 2 comments

dhdaines commented Jul 5, 2024

Discovered in lunr.py (yeraydiazdiaz/lunr.py#151), but the same issue (and a similar workaround) exists in lunr.js. As the example below shows, it is actually a pretty serious problem:

const lunr = require("lunr");
const index = lunr(function() {
    this.field("title");
    this.field("body");
    this.add({
        title: "To be or not to be?",
        body: "That is the question!",
    });
});
// Should print something, but doesn't!
console.log(index.search("What is the question?"));
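
A toy model may help show why the query above finds nothing. This is only a sketch under stated assumptions: `tokenize`, `trim`, `buildIndex`, `search` and the tiny `STOP_WORDS` set are hypothetical stand-ins, not lunr.js internals, and stemming is left out. The point is just that a term indexed as `question` cannot match a query term that still carries its `?`.

```javascript
// Toy model only: these helpers are hypothetical stand-ins, not lunr.js internals.

// Small, illustrative subset of English stop words.
const STOP_WORDS = new Set(["what", "that", "is", "the"]);

// Split on whitespace and hyphens, lowercasing each token.
function tokenize(text) {
  return text.toLowerCase().split(/[\s-]+/).filter((t) => t.length > 0);
}

// Strip non-word characters from both ends of a token.
function trim(token) {
  return token.replace(/^\W+/, "").replace(/\W+$/, "");
}

// Index side: trim, then drop stop words.
function buildIndex(text) {
  return new Set(tokenize(text).map(trim).filter((t) => !STOP_WORDS.has(t)));
}

// Search side: apply the given per-token steps, then look each term up.
function search(index, query, steps) {
  return tokenize(query)
    .map((t) => steps.reduce((acc, step) => step(acc), t))
    .filter((t) => index.has(t));
}

const index = buildIndex("That is the question!");
const untrimmed = search(index, "What is the question?", []);
const trimmed = search(index, "What is the question?", [trim]);

console.log(untrimmed); // []
console.log(trimmed); // [ 'question' ]
```

In this model the only indexed term is `question`, so the untrimmed query term `question?` is the one that should have matched and did not.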

dhdaines commented Jul 5, 2024

To help anyone who runs into this problem (a new release of lunr.js appears unlikely), the workaround is simple, though not as clear as it is in Python: add the trimmer to the search pipeline, ahead of the stemmer.

const lunr = require("lunr");
const index = lunr(function() {
    this.use(function(builder) {
        // Run the trimmer ahead of the stemmer on query terms too.
        builder.searchPipeline.before(lunr.stemmer, lunr.trimmer);
    });
    this.field("title");
    this.field("body");
    this.add({
        title: "To be or not to be?",
        body: "That is the question!",
    });
});
console.log(index.search("What is the question?"));

dhdaines changed the title from "Trimmer and stop word filter are missing from search pipelines" to "Trimmer is missing from search pipelines" on Jul 6, 2024

dhdaines commented Jul 6, 2024

Edited the above because adding the stop word filter to the search pipeline isn't actually useful (which is probably why the code is the way it is): stop words are dropped at indexing time, so if the terms aren't in the index they simply won't be found.
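
The reasoning above can be sketched with a toy lookup. Everything here is an illustrative assumption rather than lunr.js code: `indexedTerms` stands for what is left of "That is the question!" after indexing, `queryTerms` for the already trimmed query, and `STOP_WORDS` is a small made-up subset.

```javascript
// Toy model only: hypothetical names, not lunr.js internals.
const STOP_WORDS = new Set(["what", "that", "is", "the"]);

// Terms left in the index once stop words were dropped at indexing time.
const indexedTerms = new Set(["question"]);

// Query terms after trimming (stemming omitted for brevity).
const queryTerms = ["what", "is", "the", "question"];

// Lookup without a stop word filter on the search side.
const hitsWithoutFilter = queryTerms.filter((t) => indexedTerms.has(t));

// Lookup with a stop word filter on the search side.
const hitsWithFilter = queryTerms
  .filter((t) => !STOP_WORDS.has(t))
  .filter((t) => indexedTerms.has(t));

console.log(hitsWithoutFilter); // [ 'question' ]
console.log(hitsWithFilter); // [ 'question' ]
```

Both lookups return the same hits, because stop words in the query have nothing in the index to match either way.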
