crash on showing specially crafted string #489

alexanderkjall · 2020-08-20T08:13:06Z

First: is this a bug report? A suggestion? Or asking for help?

bug report

Problem description

My fuzzing found an panic inside cursive itself when displaying a string supplied from an untrusted source:

thread '<unnamed>' panicked at 'byte index 1 is not a char boundary; it is inside '\u{85}' (bytes 0..2) of `
-`', /home/capitol/projects/cursive/cursive-core/src/utils/lines/spans/segment.rs:24:24

Can be replicated with with program:

use cursive::views::{Dialog, TextView};
use std::str;

fn main() {
    let d: Vec<u8> = vec![194, 133, 45, 127, 29, 127, 127];
    let str = str::from_utf8(&d);
    if let Ok(s) = str {
        // Creates the cursive root - required for every application.
        let mut siv = cursive::default();

        // Creates a dialog with a single "Quit" button
        siv.add_layer(Dialog::around(TextView::new(s))
            .title("Cursive")
            .button("Quit", |s| s.quit()));

        siv.run();
    } else {
        println!("not valid utf8");
    }
}

Environment

Operating system: linux (ubuntu 20.04)
Backend used: ncurses (the default one) but looks like it applies to all of them
Current locale (run locale in a terminal).

$ locale
LANG=en_US.UTF-8
LANGUAGE=
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC=nb_NO.UTF-8
LC_TIME=nb_NO.UTF-8
LC_COLLATE="en_US.UTF-8"
LC_MONETARY=nb_NO.UTF-8
LC_MESSAGES="en_US.UTF-8"
LC_PAPER=nb_NO.UTF-8
LC_NAME=nb_NO.UTF-8
LC_ADDRESS=nb_NO.UTF-8
LC_TELEPHONE=nb_NO.UTF-8
LC_MEASUREMENT=nb_NO.UTF-8
LC_IDENTIFICATION=nb_NO.UTF-8
LC_ALL=

Cursive version (from crates.io, from git, ...): replicated in git an 0.15

The text was updated successfully, but these errors were encountered:

gyscos · 2020-08-20T16:26:32Z

Hi, and thank you for the detailed report! Seems like we're doing something wrong there, will investigate.

gyscos · 2020-08-26T22:45:45Z

Ah yeah we totally assume that any line-breaking character is 1-byte in utf-8... smart...

I pushed a tentative fix (and added this as a text case). Though now I see a new problem: ncurses behaves differently from other backends like termion/crossterm when attempting to print control characters like \u{7f} or \u{1d}:

Ncurses tries to print something:

Crossterm just ignores it:

UnicodeWidth returns a 0-width for these control characters, so the correct behaviour would be not to print them.

But rather than patching ncurses to not print them, it may be easier to sanitize the user input and remove these characters from any provided string?

On the other hand, there is a longer-term plan to bring something like agavrilov/cursive_buffered_backend in cursive itself, keeping an internal character grid before flushing to the backend. If we had that, it would be easy to ignore any zero-width grapheme.

alexanderkjall · 2020-08-27T06:05:59Z

I can't really think of any use case where you would want to print those kind of control characters "raw", maybe if want to build a hex editor, but even then you need to sanitize/replace unprintable characters with something else.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

crash on showing specially crafted string #489

crash on showing specially crafted string #489

alexanderkjall commented Aug 20, 2020

gyscos commented Aug 20, 2020

gyscos commented Aug 26, 2020

alexanderkjall commented Aug 27, 2020

crash on showing specially crafted string #489

crash on showing specially crafted string #489

Comments

alexanderkjall commented Aug 20, 2020

Problem description

Environment

gyscos commented Aug 20, 2020

gyscos commented Aug 26, 2020

alexanderkjall commented Aug 27, 2020