Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Encoding of responses is not handled correctly #170

Open
ngtban opened this issue Feb 17, 2022 · 0 comments
Open

Encoding of responses is not handled correctly #170

ngtban opened this issue Feb 17, 2022 · 0 comments

Comments

@ngtban
Copy link

ngtban commented Feb 17, 2022

The response returned from Amazon MWS is encoded in ISO-8859-1, according to their documentation. However the gem currently does not take this into account and is preserving the encoding of the responses from excon. This will cause an issue later, either when parsing or when persisting the parsed data.

In one case where the chosen parser is ox and the response body happens to contain characters that are valid in ISO-8859-1 but make the strings in the parsed data invalid UTF-8 strings then this will cause an issue when the parsed data is persisted.

As an example of the case above, if we retrieve order data from Amazon and the customer's name happens to be [0x4C, 0xE2, 0x6D].pack('c*') or Lâm in ISO-8859-1, and if we save the data in a JSON attribute in the database, then an Encoding::UndefinedConversionError would be raised, as the encoder expects two more valid bytes after 0xE2.

In the case where the chosen parser is rexml, a REXML::ParseException would be raised when the parser is initiated.

To resolve the issue I am currently monkey patching your gem so that I can pass a middleware forcing the encoding of responses from excon, but I really hope that you patch this issue in the gem. Specifically, I added an encoding attribute to the prolog of the response xml so that ox can detect the encoding.

I also want to note that:

  1. The authors of the python counterpart of this gem are aware of the issue and have already handled it by forcing the encoding of the responses
  2. The author of excon is aware that the gem does not properly handle the encoding of the responses, but closed the issue where the problem is raised. His justification is that users can resolve the issue on their ends by using middlewares.

Here is a gist containing an example to illustrate the bug

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant