Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix ascii_only? flag in strio_write #77

Merged
merged 1 commit into from
Jan 18, 2024
Merged

Conversation

tompng
Copy link
Member

@tompng tompng commented Jan 15, 2024

Fixes https://bugs.ruby-lang.org/issues/20185
Followup of #79

Looks like rb_str_resize is changed in some ruby version

rb_str_resize(string, shorter) // clear ENC_CODERANGE in some case
rb_str_resize(string, longer) // does not clear ENC_CODERANGE anymore
// rb_str_resize in string.c
if (slen > len && ENC_CODERANGE(str) != ENC_CODERANGE_7BIT) {
  ENC_CODERANGE_CLEAR(str);
}

I think this change is based on an assumption that appending null bytes will not change flag ascii_only?.

strio_extend will make the string longer if needed, and update the flags correctly for appending null bytes.
Before memmove, we need to rb_str_modify because updated flags are not updated for memmove.

@kou
Copy link
Member

kou commented Jan 16, 2024

Wow.
ruby/ruby@b0b9f72 may be related.

test/stringio/test_stringio.rb Outdated Show resolved Hide resolved
test/stringio/test_stringio.rb Outdated Show resolved Hide resolved
s = StringIO.new('abcd')
s.seek(pos)
s.write("\xff\xff\xff")
refute(s.string.ascii_only?)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
refute(s.string.ascii_only?)
assert_operator(s.string, :ascii_only?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed to refute_predicate(s.string, :ascii_only?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, sorry.
Could you use assert_not_predicate instead? I don't want to use refute_* as much as possible because assert_not_ is straightforward for non-native English speaker.

@kou
Copy link
Member

kou commented Jan 17, 2024

Oh, @nobu worked on this: #79

@tompng tompng changed the title Fix ascii_only? flag in strio_write Fix ascii_only? valid_encoding? flag in strio_write Jan 17, 2024
@tompng tompng changed the title Fix ascii_only? valid_encoding? flag in strio_write Fix ascii_only? flag in strio_write Jan 17, 2024
@tompng
Copy link
Member Author

tompng commented Jan 17, 2024

I add a test case

    s = StringIO.new("\u{3042}")
    s.rewind
    assert_not_predicate(s.string, :ascii_only?)
    s.write('aaaa')
    assert_predicate(s.string, :ascii_only?)

that fails in master (after #79)

Writing string in arbitary position(when ptr->pos != olen) can change coderange in many variation, and I think rb_str_modify is always needed.

Other variation:
writing "\x00".force_encoding('utf-16be') twice can make string invalid to valid
s.write "あ"; s.rewind; s.write('a'); s.string #=> "a\x81\x82" will break encoding.

@kou kou merged commit b31a538 into ruby:master Jan 18, 2024
28 checks passed
@kou
Copy link
Member

kou commented Jan 18, 2024

Thanks!

matzbot pushed a commit to ruby/ruby that referenced this pull request Jan 18, 2024
(ruby/stringio#77)

Followup of #79

`rb_str_resize()` was changed by b0b9f72  .

```c
rb_str_resize(string, shorter) // clear ENC_CODERANGE in some case
rb_str_resize(string, longer) // does not clear ENC_CODERANGE anymore
```

```c
// rb_str_resize in string.c
if (slen > len && ENC_CODERANGE(str) != ENC_CODERANGE_7BIT) {
  ENC_CODERANGE_CLEAR(str);
}
```

I think this change is based on an assumption that appending null bytes
will not change flag `ascii_only?`.

`strio_extend()` will make the string longer if needed, and update the
flags correctly for appending null bytes.
Before `memmove()`, we need to `rb_str_modify()` because updated flags are not
updated for `memmove()`.

ruby/stringio@b31a538576
@tompng tompng deleted the write_ascii_fix branch January 18, 2024 05:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants