Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add HTTP#force_response_body_encoding for forcing response body encoding #17

Merged

Conversation

jeremyevans
Copy link
Contributor

This allows for the ability to opt-in to a method to force the
encoding of response bodies. By setting the accessor to a String
or Encoding instance, it will force the specified encoding.
Setting the value of true will try to detect the encoding of the
response body, either using the Content-Type header (assuming it
specifies charset) or by scanning for a tag in the document
that specifies the encoding. The default is false in which case
no forcing of encoding will be done (same as before the patch).

Implements [Feature #2567]
Implements [Feature #15517]

Co-authored-by: Yui Naruse naruse@ruby-lang.org

This allows for the ability to opt-in to a method to force the
encoding of response bodies.  By setting the accessor to a String
or Encoding instance, it will force the specified encoding.
Setting the value of true will try to detect the encoding of the
response body, either using the Content-Type header (assuming it
specifies charset) or by scanning for a <meta> tag in the document
that specifies the encoding.  The default is false in which case
no forcing of encoding will be done (same as before the patch).

Implements [Feature #2567]
Implements [Feature #15517]

Co-authored-by: Yui Naruse <naruse@ruby-lang.org>
# :nodoc:
def sniff_encoding(str, encoding=nil)
# the encoding sniffing algorithm
# http://www.w3.org/TR/html5/parsing.html#determining-the-character-encoding
Copy link
Member

@nurse nurse Apr 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# http://www.w3.org/TR/html5/parsing.html#determining-the-character-encoding
# https://html.spec.whatwg.org/multipage/parsing.html#encoding-sniffing-algorithm

Algorithm also should be updated if needed

# the response body encoding. If String or Encoding, uses the specified
# encoding.
attr_accessor :force_response_body_encoding

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name force_encoding is intended to indicate that the method is a dirty, low-level operation. On the other hand, I think this method is highly recommended. The name should reflect the difference.

Also maybe the given String should be verified when it was set instead of the time when it calls force_encoding.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added commits to fix both of these issues. I can update NEWS for ruby after this is merged.

@nurse
Copy link
Member

nurse commented Apr 1, 2022

Generally looks good. Note that this also should update NEWS.

@jeremyevans jeremyevans requested a review from nurse April 7, 2022 22:28
@jeremyevans jeremyevans merged commit 6233e6b into ruby:master Apr 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants