Fix for some Unicode characters in citation keys #6938

Merged: koppor merged 4 commits into JabRef:master on Sep 26, 2020

Conversation

Sponsor Member

@k3KAW8Pnf7mkmdSMPHz27 k3KAW8Pnf7mkmdSMPHz27 commented Sep 23, 2020

Fixes #6583. Some Unicode characters can be encoded in multiple ways, and the mapping that StringUtil#replaceSpecialCharacters relies on does not contain every encoding. The proposed solution normalizes the input to NFC (canonical composition) before the lookup so that these characters can be found in the mapping.
More information on Unicode normalization is available in the Java API documentation.
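To illustrate the idea, here is a minimal sketch and not the code from this PR. `NfcLookupSketch`, `CHAR_MAP`, and `replaceSpecial` are hypothetical names, and the two map entries are made up for the example; only `java.text.Normalizer` is the real JDK API. The sketch assumes the mapping is keyed by precomposed characters, so a decomposed input only matches after NFC normalization.

```java
import java.text.Normalizer;
import java.util.Map;

class NfcLookupSketch {
    // Hypothetical stand-in for a replacement map keyed by precomposed characters.
    private static final Map<String, String> CHAR_MAP = Map.of(
            "\u00E4", "ae",  // a with diaeresis, precomposed
            "\u00E9", "e");  // e with acute, precomposed

    // Normalize to NFC first, then replace each code point found in the map.
    static String replaceSpecial(String input) {
        String nfc = Normalizer.normalize(input, Normalizer.Form.NFC);
        StringBuilder out = new StringBuilder();
        nfc.codePoints().forEach(cp -> {
            String ch = new String(Character.toChars(cp));
            out.append(CHAR_MAP.getOrDefault(ch, ch));
        });
        return out.toString();
    }

    public static void main(String[] args) {
        String decomposed = "Ja\u0308ger";  // 'a' followed by a combining diaeresis
        String precomposed = "J\u00E4ger";  // single precomposed code point
        System.out.println(decomposed.equals(precomposed)); // false: different code point sequences
        System.out.println(replaceSpecial(decomposed));     // Jaeger
        System.out.println(replaceSpecial(precomposed));    // Jaeger
    }
}
```

Without the `Normalizer.normalize` call, the decomposed input would pass through the lookup unchanged, because neither `a` nor the combining mark alone is a key in the map.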

My subjective opinion is that most people expect Unicode to work the way NFC does, i.e., if characters look the same, they are likely equivalent. Hence, anyone debugging issues in the UNICODE_CHAR_MAP will expect NFC.
A more holistic approach would likely start with compatibility equivalence, which would require larger changes, and there do not seem to be any bugs/issues that require these larger changes.
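The difference between canonical equivalence (NFC) and compatibility equivalence (NFKC) can be seen in a small sketch. `EquivalenceSketch` and its helper methods are hypothetical names chosen for illustration; the ligature example is not taken from the linked issue.

```java
import java.text.Normalizer;

class EquivalenceSketch {
    // Canonical composition: only merges sequences that are canonically equivalent.
    static String nfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    // Compatibility composition: additionally folds compatibility characters such as ligatures.
    static String nfkc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        String ligature = "\uFB01"; // LATIN SMALL LIGATURE FI
        System.out.println(nfc(ligature).equals("fi"));  // false: NFC leaves the ligature as it is
        System.out.println(nfkc(ligature).equals("fi")); // true: NFKC folds it to 'f' + 'i'
    }
}
```

NFKC changes more of the input than NFC does, which is one reason a switch to compatibility equivalence would be the larger change.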

  • Change in CHANGELOG.md described (if applicable)
  • Tests created for changes (if applicable)
  • Manually tested changed features in running JabRef (always required)
  • Screenshots added in PR description (for UI changes)
  • Checked documentation: Is the information available and up to date? If not, created an issue at https://github.com/JabRef/user-documentation/issues or, even better, submitted a pull request to the documentation repository.

@k3KAW8Pnf7mkmdSMPHz27 k3KAW8Pnf7mkmdSMPHz27 changed the title from "[WIP] Fix for issue 6583" to "[WIP] Fix for some Unicode characters in citation keys" Sep 24, 2020
@k3KAW8Pnf7mkmdSMPHz27 k3KAW8Pnf7mkmdSMPHz27 changed the title from "[WIP] Fix for some Unicode characters in citation keys" to "Fix for some Unicode characters in citation keys" Sep 24, 2020
@k3KAW8Pnf7mkmdSMPHz27 k3KAW8Pnf7mkmdSMPHz27 marked this pull request as ready for review September 24, 2020 15:41
Member

@calixtus calixtus left a comment

Thanks for your work here, one question remaining...

Member

@Siedlerchr Siedlerchr left a comment

Thanks, I was not aware of the Normalizer/Unicode methods. Never heard of them before.

@Siedlerchr Siedlerchr added the "status: ready-for-review" label (Pull Requests that are ready to be reviewed by the maintainers) Sep 25, 2020
@k3KAW8Pnf7mkmdSMPHz27
Sponsor Member Author

@Siedlerchr in my opinion it is a mess best avoided if possible X)
I believe NFC is how most people expect Unicode to work (which is why I am using it here). I'll add some more details to the top part of the PR in case someone needs to patch this patch later.

Member

@koppor koppor left a comment

Thank you for the work! LGTM.

@koppor koppor merged commit 47edbbd into JabRef:master Sep 26, 2020
Siedlerchr added a commit that referenced this pull request Sep 26, 2020
* upstream/master: (55 commits)
  Rename menus citation style in preview style (#6899)
  Fix for some Unicode characters in citation keys (#6938)
  Add missing authors
  Fix a fetcher test for the ShortDOIService (#6945)
  Fixes Shared Database: Changes filtering in CoarseChangeFilter to attribute property (#6868)
  Changed default value of "search and store files relative to bibtex file" to true (#6928)
  Replace comment by just a failure (#6943)
  Fix: in entry types editor selected field is not removed after first click  (#6941)
  Fix remove actions for entry types in the editor (#6933)
  Always use Java 15 (#6929)
  Update DevDocs: workaround for issues with local openjfx maven libraries (#6931)
  Fixes bugs in the `regex` cite key pattern modifier (#6893)
  Add missing author
  Readability for citation key patterns (#6706)
  Add new author
  Reset to master and add default case to switch (#6847)
  Bump mockito-core from 3.5.10 to 3.5.11 (#6924)
  Bump byte-buddy-parent from 1.10.14 to 1.10.15 (#6923)
  Bump org.beryx.jlink from 2.21.4 to 2.22.0 (#6925)
  Bump xmpbox from 2.0.20 to 2.0.21 (#6926)
  ...

# Conflicts:
#	src/main/java/org/jabref/logic/util/DelayTaskThrottler.java
Successfully merging this pull request may close these issues.

Bibtex citekey has non-ASCII letters