Was there a change introduced in 0.2.12? #65

Closed
zhiburt opened this issue Jul 13, 2024 · 3 comments

Comments

zhiburt commented Jul 13, 2024

Hi there,
Thanks for such a valuable crate.

I recently noticed this change and wonder whether it's intended.
Sorry if I missed some docs that would explain it.
But could you help me a bit?

So there's a simple test which works differently on 0.2.11 and 0.2.12|0.2.13.

let text = "\u{200d}\u{fe0f}";
// on 0.2.11
assert_eq!(unicode_width::UnicodeWidthStr::width(text), 1);
// on 0.2.12
assert_eq!(unicode_width::UnicodeWidthStr::width(text), 2);

As you might have already guessed, the difference is in the character width.

But what's interesting is that if we treat it as a char, the result is identical across versions.

assert_eq!(unicode_width::UnicodeWidthChar::width('♀'), Some(1));
assert_eq!(unicode_width::UnicodeWidthStr::width("♀"), 1);

Take care.
Have a great weekend.

Member

Manishearth commented Jul 13, 2024

This is intentional, to bring it more in line with the specification. In this case it has to do with emoji rendering.

Please continue to expect such changes as we tweak the algorithm. This is a best-effort character width algorithm and should not be relied on for some precise result: there is no way to get One True Answer for terminal width of a string without knowing the fonts and rendering stack used. All we can do is estimate.

cc @Jules-Bertholet on the str vs char difference

@Jules-Bertholet
Contributor

> cc @Jules-Bertholet on the str vs char difference

A one-character string always has the same width as that single character; there is a unit test that verifies this. The '\u{fe0f}' is a variation selector that modifies the preceding character, turning it into an emoji and changing its width from the default. The full width rules, with links to the relevant Unicode standards, are in the rustdoc.
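
To illustrate why a string's width can differ from the sum of its characters' widths, here is a minimal std-only sketch. It is a toy model, not the crate's actual algorithm: the names `toy_char_width`, `toy_can_be_emoji`, and `toy_str_width` are hypothetical, and the set of emoji-capable characters is hand-picked for the example. The real rules are the ones described in the rustdoc.

```rust
/// Toy per-character width (hypothetical; the real crate uses generated Unicode tables).
fn toy_char_width(c: char) -> usize {
    match c {
        // The zero-width joiner and variation selector 16 take no columns on their own.
        '\u{200d}' | '\u{fe0f}' => 0,
        // Assumption for this sketch: everything else is one column wide.
        _ => 1,
    }
}

/// Hand-picked characters that default to text presentation but can be shown as emoji.
fn toy_can_be_emoji(c: char) -> bool {
    matches!(c, '\u{2640}' | '\u{2642}' | '\u{2764}')
}

/// Toy string width: sums per-character widths, but adds one extra column
/// when an emoji-capable character is immediately followed by U+FE0F.
fn toy_str_width(s: &str) -> usize {
    let mut total = 0;
    let mut prev: Option<char> = None;
    for c in s.chars() {
        let upgrades_prev = c == '\u{fe0f}' && prev.map_or(false, toy_can_be_emoji);
        if upgrades_prev {
            // The selector widens the preceding character from 1 column to 2.
            total += 1;
        } else {
            total += toy_char_width(c);
        }
        prev = Some(c);
    }
    total
}
```

In this model a one-character string such as "\u{2640}" has the same width as the char itself, while "\u{2640}\u{fe0f}" is wider than the sum of its per-character widths, because the selector changes how the preceding character is counted.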

Author

zhiburt commented Jul 13, 2024

Got you

Thanks once again.
Take care
