Problems ordering special characters (AÂÃâã) #355

Open
carlamartinsab opened this issue Apr 26, 2016 · 12 comments

@carlamartinsab

In Portuguese there are words that start with Á (Água or água) and Â (Ângulo or ângulo). iQvoc does not recognize that both words start with A and should appear at the beginning of the alphabetically ordered view, together with the other words starting with A. Instead, these words appear at the end, after Z.
Maybe this also happens in German, which has words with umlauts ("¨").
How can I solve this?

Thank you
Carla.
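
The grouping requested above (Á and Â listed under A) can be sketched in plain Ruby. This is only an illustration under stated assumptions: `strip_accents` and `index_letter` are hypothetical helper names, not iQvoc functions, and the sample labels are the words mentioned in this comment plus one label from later in the thread.

```ruby
# Hypothetical helper (not part of iQvoc): decompose each character (NFD)
# and drop the combining marks, so "Á" becomes "A" and "â" becomes "a".
def strip_accents(text)
  text.unicode_normalize(:nfd).gsub(/\p{Mn}/, "")
end

# Hypothetical helper: the index letter a label would be listed under.
# Returns "" for an empty label instead of raising.
def index_letter(label)
  strip_accents(label)[0].to_s.upcase
end

labels = ["Água", "água", "Ângulo", "ângulo", "Balanceamento"]

# Group the labels by their accent-free initial letter.
by_letter = labels.group_by { |label| index_letter(label) }

p by_letter
# => {"A"=>["Água", "água", "Ângulo", "ângulo"], "B"=>["Balanceamento"]}
```

Whether accented initials belong under their base letter is a language convention; it matches the Portuguese expectation described here, but other languages may want separate entries.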

@mjansing
Contributor

Hi @carlamartinsab,

Yes, this looks like a bug. I tested this behaviour with a German thesaurus. The alphabetical concept listing also shows links for concepts starting with German umlauts (e.g. Ä, Ö), but these listings are always empty, even when concepts starting with such a character exist.

One more thing: can you test iQvoc's search for concepts starting with one of your special characters? In my case the search couldn't find any such concepts, even though some exist.

@mjansing mjansing added the bug label Apr 27, 2016
@carlamartinsab
Author

Hi @mjansing

Exactly. When searching for words starting with "A", the words "Água" and "Ângulo" are not returned. The same occurs when searching for "Á": it returns neither "A" nor "á".
This behaviour is not acceptable for Portuguese-speaking users, as many terms carry accents.

Thanks
Carla

@mjansing
Contributor

@carlamartinsab thank you for investigating. That's also a problem in German. I'll try to fix this issue soon.

@carlamartinsab
Author

@mjansing OK! Thank you in advance!

@mjansing
Contributor

mjansing commented Apr 29, 2016

Should be fixed with 426ce2c. Feel free to reopen if there are any issues.

We'll release a new iqvoc version to rubygems soon.

@carlamartinsab
Author

We've applied the fix available for the issue ##, but it didn't work as we expected.
Apparently the ordering does not change, and words starting with "Á", "Â" or "Ó" still appear at the end of the alphabetical list. And when searching for these words, I can no longer find them...

Below are some quick examples to illustrate what is happening.
Concepts' preferred labels:
Amostra de área
Área de contato
Área de influência
Balanceamento

#1
Searched Term: "Área"
Current Results: "Amostra de área"
Expected Results: "Amostra de área", "Área de contato", "Área de influência"

#2
Searched Term: "área"
Current Results: "Amostra de área"
Expected Results: "Amostra de área", "Área de contato", "Área de influência"

#3
Searched Term: "area"
Current Results:
Expected Results: "Amostra de área", "Área de contato", "Área de influência"
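
The expected results of the three searches above can be reproduced with a small plain-Ruby sketch that compares accent-free, lowercased strings on both sides. It is not iQvoc's search code; `fold` and `search` are hypothetical names, and substring matching is an assumption based on "Amostra de área" being an expected hit for "Área".

```ruby
# Hypothetical helper: remove diacritics and case so that "Área", "área"
# and "area" all compare as "area".
def fold(text)
  text.unicode_normalize(:nfd).gsub(/\p{Mn}/, "").downcase
end

# Hypothetical search: keep every label whose folded form contains the
# folded search term (substring match is an assumption).
def search(labels, term)
  needle = fold(term)
  labels.select { |label| fold(label).include?(needle) }
end

LABELS = [
  "Amostra de área",
  "Área de contato",
  "Área de influência",
  "Balanceamento"
].freeze

# All three searched terms from the examples give the same three labels.
["Área", "área", "area"].each do |term|
  puts "#{term}: #{search(LABELS, term).inspect}"
end
```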

We are also facing some problems with the sorting mechanism.

Current result:
Amostra de área
Balanceamento
Área de contato
Área de influência

Expected result:
Amostra de área
Área de contato
Área de influência
Balanceamento
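
Both orderings above can be reproduced in plain Ruby with the same four labels. A default string sort compares raw bytes, which places "Á" after "Z"; sorting by an accent-free key gives the expected order. This is a sketch only: where iQvoc actually sorts (Ruby or the database) is not stated in this thread, and `sort_key` is a hypothetical helper.

```ruby
# Hypothetical helper: accent-free, lowercased form used only for ordering.
def sort_key(text)
  text.unicode_normalize(:nfd).gsub(/\p{Mn}/, "").downcase
end

labels = [
  "Balanceamento",
  "Área de influência",
  "Amostra de área",
  "Área de contato"
]

# Default sort compares bytes, so "Á" (encoded above ASCII) lands after "B".
byte_order = labels.sort

# Sort by the folded key; the original label breaks ties deterministically.
folded_order = labels.sort_by { |label| [sort_key(label), label] }

puts byte_order    # matches the "Current result" list
puts folded_order  # matches the "Expected result" list
```

A locale-aware collation in the database or an ICU-based library would be the more complete solution; stripping accents is only the simplest approximation.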

Thanks for your help

@mjansing mjansing reopened this May 9, 2016
@mjansing
Contributor

mjansing commented May 9, 2016

I'll take a look at it.

@rbvictor

rbvictor commented Jun 22, 2016

Hi @mjansing,
I was looking at the issue #351 and an idea occurred to me. Would it be possible to ignore not only the case but also the accents (special characters) during search, by removing them from both sides of the comparison in self.by_query_value(query) in base.rb?

Something like
where(["UNACCENT(LOWER(#{table_name}.value)) LIKE ?", I18n.transliterate(query.mb_chars.downcase.to_s)])
?

The problem is that the UNACCENT function is specific to PostgreSQL. I don't know how to adapt it for other databases. Still, do you think something like this would be possible?
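
One database-independent way to approach the question above is to do the accent stripping in Ruby instead of SQL: store a normalized copy of each label when it is saved, and compare the normalized query against that copy. The sketch below is plain Ruby and only builds the condition; the column name `value_normalized`, the table name `labels` and both method names are assumptions for illustration, not part of iQvoc.

```ruby
# Hypothetical helper: accent-free, lowercased text. The same function
# would be applied to stored labels (at save time) and to the query.
def normalize_for_search(text)
  text.unicode_normalize(:nfd).gsub(/\p{Mn}/, "").downcase
end

# Builds an ActiveRecord-style condition array against a hypothetical
# precomputed column. Like the snippet above, it passes the query through
# without adding LIKE wildcards.
def accent_insensitive_condition(table_name, query)
  ["#{table_name}.value_normalized LIKE ?", normalize_for_search(query)]
end

# What would be stored in the hypothetical column for a label:
p normalize_for_search("Área de influência")

# Condition for a query that already carries a trailing wildcard:
p accent_insensitive_condition("labels", "Água%")
```

The trade-off is an extra column that must be kept in sync with the label value, in exchange for not depending on a database-specific function.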

@mjansing
Contributor

Hmm, I'm not sure I fully understand this issue. Which "view" is affected (hierarchical concepts or alphabetical concepts)?

With the current version of iQvoc and some sample data, it looks like this:

hierarchical concepts

[Screenshot: hierarchical concepts view, 2016-06-23]

alphabetical concepts

[Screenshot: alphabetical concepts view, 2016-06-23]

@carlamartinsab
Author

Hello @mjansing,

@rbvictor seems to be right.
When searching, if we have two terms like "Água Tratada" and "Agua não Tratada" and I search for "Agua", the first term is not returned. I would see only "Agua não Tratada", because "A" and "Á" are treated as different letters of the alphabet.

So, if the system simply ignored the accents (and of course the case), that would seem to solve the problem.

Thank you

@rbvictor

I think there are two issues in one:

  1. Search functionality does not ignore accents in Latin characters:
    For example, when we search for the correct word "água", we get the expected results, but when we search for "agua" without the acute accent, there are no results. I think "água" and "agua" should be treated the same way and return the same results, as in most search tools. The same thing may happen in German with characters like "ä", "ö", "ü", etc.
  2. Alphabetical sort does not ignore accents in Latin characters either:
    Currently I am using the version before fix 426ce2c from this issue. Labels beginning with these special characters appear at the end of the list. This problem occurs on every page that has this sorting functionality:
  • alphabetical concepts
  • hierarchical concepts
  • order in search results
  • etc.

Do you think it would be better to address these two issues separately?

@carlamartinsab
Author

I agree with that. And (1), the search functionality, is the more critical one.
