When I type in Chinese characters, the browser cannot count #37

Open
agreadname opened this issue Oct 21, 2019 · 3 comments

@agreadname

This is such a fun and useful tool, and I want to share it with more friends, so I hope you can resolve this issue.

@maebert
Owner

maebert commented Oct 23, 2019

Thanks for bringing this up!

To be honest I'm woefully ignorant about CJK input methods, but this is a great opportunity to learn. I'll look into it when I have a bit of time!

@maebert maebert self-assigned this Oct 23, 2019
@maebert maebert added the bug label Oct 23, 2019
@MosakujiHokuto

As a Chinese IME user, I would like to provide some insight into the cause of this bug.

First, some knowledge about how CJK input works:

CJK languages have thousands of commonly used characters, but our keyboards have only about 100 keys. Thus, we use something called an IME (Input Method Editor), which is software responsible for mapping a stream of keystrokes into a stream of characters.

Basically, it works like this:

  1. When an IME is enabled, it intercepts the keystrokes made by the user and stores them in a buffer. The original keystrokes may or may not reach the destination input control, depending on the IME used.

  2. The IME finds all strings that the keystrokes can be mapped to and displays them in a candidate list for the user to choose from, since most of the time more than one string matches the input.

  3. Once the desired candidate has been chosen, the IME submits it to the destination input control all at once.

Be aware that we usually input on a string-by-string basis, rather than char-by-char, which is different from alphabet-based language users.
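
The three steps above can be sketched as a toy model. This is only an illustration of the flow, not how any real IME is implemented; the `ToyIme` class and the `CANDIDATES` dictionary are hypothetical names invented for the example.

```typescript
// Hypothetical mapping from buffered keystrokes to candidate strings.
const CANDIDATES: Record<string, string[]> = {
  nihao: ["你好", "拟好"],
  zhongwen: ["中文"],
};

class ToyIme {
  private buffer = "";   // keystrokes intercepted so far
  public committed = ""; // what the input control actually receives

  // Step 1: intercept a keystroke; nothing reaches the input control yet.
  keystroke(key: string): void {
    this.buffer += key;
  }

  // Step 2: list every string the buffered keystrokes can map to.
  candidates(): string[] {
    return CANDIDATES[this.buffer] || [];
  }

  // Step 3: submit the chosen candidate all at once and clear the buffer.
  commit(index: number): void {
    const choice = this.candidates()[index];
    if (choice === undefined) return;
    this.committed += choice;
    this.buffer = "";
  }
}

const ime = new ToyIme();
"nihao".split("").forEach((key) => ime.keystroke(key));
console.log(ime.committed); // still empty: five keystrokes, no text yet
ime.commit(0);
console.log(ime.committed); // "你好" arrives in a single step
```

In this model the input control sees nothing during the five keystrokes and then receives two characters at once, which is the string-by-string behaviour described above.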

So the problem is that this application resets its timer in the onKeyDown callback, which is unfortunately very unreliable for CJK editing (since keystrokes are handled by the IME rather than by the input control).

I think resetting the timer in the onChange callback would improve the situation a little, but I don't think it is the ideal solution, since more problems arise:

  • I don't know why the original code handles the timer reset in the onKeyDown callback, so this solution may be infeasible.

  • For some IMEs (for example, those without a preedit feature), the onChange event fires only when the result is submitted, and the interval between submissions may be longer than 5s even when editing continuously, since the user may be used to submitting a long sentence all at once, and finding the desired candidate may take a long time.
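
To make the second point concrete, here is a small simulation of a timer that is reset only by one kind of event. It is a sketch under assumptions: the `survives` function, the event log format, and the timings are made up for illustration, and the 5000 ms limit simply follows the "5s" mentioned above.

```typescript
// Hypothetical event log entry: [time in ms since the session started, event type].
type LoggedEvent = [number, "keydown" | "change"];

// Returns true if no gap between resetting events (or between the start of
// the session and the first resetting event) exceeds the limit.
function survives(log: LoggedEvent[], resetOn: string, limitMs: number): boolean {
  let last = 0;
  for (const [time, type] of log) {
    if (type !== resetOn) continue;
    if (time - last > limitMs) return false;
    last = time;
  }
  return true;
}

// An IME without preedit: the user types for 8 s, then submits one sentence,
// so the input control sees a single change event and no key events.
const imeSession: LoggedEvent[] = [[8000, "change"]];
console.log(survives(imeSession, "change", 5000)); // false
```

Even though the user was typing the whole time, the only event the timer could react to arrived after the limit had already passed.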

So I don't really have an idea now. I don't know how to capture keystrokes when there is an IME active, and I'm not even sure such a method even exists. Maybe the best we can do now is to add some configurable workarounds for CJK users?

@whuter

whuter commented Jun 6, 2020

@AraragiHokuto We could use the compositionupdate event, which can capture the inputs in IME.
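
A minimal sketch of that idea, assuming the timer exposes some reset callback: the standard DOM events `compositionstart`, `compositionupdate`, and `compositionend` are listened to alongside `keydown` and `input`. The `Listenable` interface and the `resetOnActivity` helper are hypothetical names, and whether `compositionupdate` fires for every keystroke depends on the IME and the browser.

```typescript
// Minimal shape of an event target, so the sketch does not depend on DOM typings.
interface Listenable {
  addEventListener(type: string, listener: () => void): void;
}

// Event types that indicate the user is still writing, with or without an IME.
const ACTIVITY_EVENTS = [
  "keydown",
  "input",
  "compositionstart",
  "compositionupdate",
  "compositionend",
];

// Hypothetical helper: call `reset` whenever any activity event fires.
function resetOnActivity(target: Listenable, reset: () => void): void {
  for (const type of ACTIVITY_EVENTS) {
    target.addEventListener(type, reset);
  }
}
```

In a browser, the `target` would be the text area element and `reset` would be whatever function restarts the app's countdown.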
Another issue is that counting words by whitespace is unsuitable for Chinese text, and I would like to fix that.
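
One possible approach, sketched here under assumptions: count each Han character as one word and each whitespace-separated run of other characters as one word. That convention, the `countWords` name, and the restriction to the basic CJK Unified Ideographs block are choices made for this example, not something the app does today. Full-width punctuation is not treated specially, so a standalone punctuation mark would be counted as a word.

```typescript
// Matches characters in the basic CJK Unified Ideographs block only.
const HAN = /[\u4e00-\u9fff]/g;

function countWords(text: string): number {
  // Every Han character counts as one word.
  const hanCount = (text.match(HAN) || []).length;
  // Replace Han characters with spaces, then count the remaining runs.
  const rest = text.replace(HAN, " ").trim();
  const otherCount = rest === "" ? 0 : rest.split(/\s+/).length;
  return hanCount + otherCount;
}

console.log(countWords("hello world"));     // 2
console.log(countWords("你好世界"));         // 4
console.log(countWords("我爱 TypeScript")); // 3
```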
