Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix CLI utf8 issue #1080

Merged
merged 1 commit into from
Dec 6, 2022
Merged

Fix CLI utf8 issue #1080

merged 1 commit into from
Dec 6, 2022

Conversation

aziz-mu
Copy link
Contributor

@aziz-mu aziz-mu commented Nov 29, 2022

This PR fixes issue #1042, and refactors linenoise.cpp for better utf8 support in the future.

In detail:

  • Added a chars tracker to linenoisestate. len now keeps track of the number of single-byte characters in the buffer, while chars keeps track of the actual number of utf8 characters.
  • Added a function getBufCharPos, which converts from the cursor's position to the data position in the buffer.
  • linenoiseEditInsert still takes in one byte at a time, but it doesn't write it until a full utf-8 character is typed, to prevent it from typing invalid utf-8 characters.

Copy link
Collaborator

@acquamarin acquamarin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cursor is not in the correct position after i type some utf-8 characters.
You can try copy this utf8 string to the shell, and the cursor doesn't appear in the correct position

tools/shell/embedded_shell.cpp Outdated Show resolved Hide resolved
tools/shell/embedded_shell.cpp Outdated Show resolved Hide resolved
Copy link
Collaborator

@acquamarin acquamarin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got seg fault for this utf-8 string:

z473chen@ubuntu:/data0/kuzu/build/release/tools/shell$ ./kuzu_shell  -i dsad
Opened the database at path: dsad
Enter ":help" for usage hints.
kuzu> 大苏打Segmentation fault (core dumped)

Copy link
Collaborator

@acquamarin acquamarin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is still a bug when the user types a very long utf-8 string in the shell:

z473chen@ubuntu:/data0/kuzu/build/release/tools/shell$ ./kuzu_shell  -i 12451
Opened the database at path: 12451
Enter ":help" for usage hints.
kuzu> 顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶
kuzu> 顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶
kuzu> 顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶顶

The text appears in multiple lines rather than a single line

@aziz-mu aziz-mu merged commit b739582 into master Dec 6, 2022
@aziz-mu aziz-mu deleted the utf8-fix branch December 6, 2022 17:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants