
lunr-languages/lunr.fr.js fails to find common words like "équipement" #71

Open
DavidBruant opened this issue Mar 23, 2021 · 3 comments


@DavidBruant

Test case:

import lunr from "lunr"
import stemmerSupport from 'lunr-languages/lunr.stemmer.support.js'
import lunrfr from 'lunr-languages/lunr.fr.js'

stemmerSupport(lunr)
lunrfr(lunr)

const docs = [
    {
        text : "équipement, barrage",
        id: '1'
    },
    {
        text : "rivière",
        id: '2'
    }
]

const index = lunr(function () {
    this.field('text')
    this.ref('id')

    for(const doc of docs){
        this.add(doc)
    }
})

console.log('résultats pour "équipement"', index.search('équipement'))
console.log('résultats pour "barrage"', index.search('barrage'))
console.log('résultats pour "rivière"', index.search('rivière'))

All three console.log calls should return a result, but the first one returns none.

@DavidBruant

I haven't taken the time to confirm it, but I believe this is related to #68.

@DavidBruant

The workaround I have found is to strip all accents from both the texts I index and the string I search for, using this function:

function removeAccents(str){
    return str.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
}

It's inconvenient, but it works until lunr-languages/lunr.fr.js is fixed.
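
Applied to the test case above, the workaround could look roughly like this. It is only a sketch, not lunr-languages code: `foldDocs` and `foldedSearch` are hypothetical helper names, and `index` is assumed to be any object with a lunr-style `search(query)` method.

```javascript
// Strip combining diacritical marks after canonical decomposition (NFD).
function removeAccents(str) {
    return str.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
}

// Hypothetical helper: return copies of the documents with one field folded,
// leaving the original documents untouched.
function foldDocs(docs, field) {
    return docs.map(doc => ({ ...doc, [field]: removeAccents(doc[field]) }));
}

// Hypothetical helper: fold the query the same way before searching,
// so that indexed terms and query terms stay comparable.
function foldedSearch(index, query) {
    return index.search(removeAccents(query));
}

const docs = [
    { text: "équipement, barrage", id: "1" },
    { text: "rivière", id: "2" }
];

// These folded copies are what would be passed to this.add() in the builder.
const foldedDocs = foldDocs(docs, "text");
console.log(foldedDocs.map(d => d.text)); // "equipement, barrage" and "riviere"
```

Characters without a canonical decomposition, such as "œ", pass through unchanged, so this only covers accents expressed as combining marks.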

@dhdaines

dhdaines commented Jul 6, 2024

In general, the language support plugins don't do accent folding (mapping accented characters to unaccented ones), which might be by design. You can do it separately with https://www.npmjs.com/package/lunr-folding (quick but possibly buggy) or by adding your own pipeline function using unidecode (more complete): https://github.com/dhdaines/lunr.py/blob/fix_skip_docs/docs/languages.md#folding-to-ascii
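
As a rough illustration of the "own pipeline function" route, here is a minimal sketch. It assumes lunr 2.x, where pipeline functions receive a token with an `update` method and `lunr.Pipeline` offers `registerFunction` and `before`. It reuses the NFD-based `removeAccents` from the workaround rather than unidecode, and `makeFoldingPlugin`, `indexAnchor`, and `searchAnchor` are hypothetical names. The `lunr` object and the anchor steps are passed in by the caller, because which trimmer and stemmer sit in the pipeline depends on whether a language plugin has been applied.

```javascript
// Fold a string to its unaccented form (same approach as the workaround above).
function removeAccents(str) {
    return str.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
}

// Pipeline function: token.update() replaces the token's string
// with the value returned by the callback and returns the token.
function foldAccents(token) {
    return token.update(s => removeAccents(s));
}

// Hypothetical plugin factory (not part of lunr or lunr-languages).
// `indexAnchor` and `searchAnchor` are existing pipeline steps that folding
// should run before, so index terms and query terms are folded alike.
function makeFoldingPlugin(lunr, indexAnchor, searchAnchor) {
    return function (builder) {
        // Registering the function lets lunr serialise an index that uses it.
        lunr.Pipeline.registerFunction(foldAccents, "foldAccents");
        builder.pipeline.before(indexAnchor, foldAccents);
        builder.searchPipeline.before(searchAnchor, foldAccents);
    };
}
```

With lunr's default pipeline this might be applied inside the builder function as `this.use(makeFoldingPlugin(lunr, lunr.trimmer, lunr.stemmer))`. If a language plugin has replaced the pipeline, the anchors would presumably need to be that plugin's own trimmer and stemmer, and folding before a language-specific stemmer may change how some words are stemmed.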
