Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jenkins runs fail on invalid XML: "Invalid byte 2 of 4-byte UTF-8 sequence" #16

Closed
apjanke opened this issue Apr 4, 2016 · 10 comments
Closed
Assignees
Labels
bug Reproducible Homebrew/brew bug

Comments

@apjanke
Copy link
Contributor

apjanke commented Apr 4, 2016

Some Jenkins build runs will fail completely without output like this. No test results or artifacts will be available.

Failed to read test report file /Users/brew/Jenkins/prs/version/yosemite/brew-test-bot.xml
org.dom4j.DocumentException: Invalid byte 2 of 4-byte UTF-8 sequence. Nested exception: Invalid byte 2 of 4-byte UTF-8 sequence.
    at org.dom4j.io.SAXReader.read(SAXReader.java:484)
    at org.dom4j.io.SAXReader.read(SAXReader.java:264)
    at hudson.tasks.junit.SuiteResult.parse(SuiteResult.java:123)
    at hudson.tasks.junit.TestResult.parse(TestResult.java:286)
    at hudson.tasks.junit.TestResult.parsePossiblyEmpty(TestResult.java:232)
    at hudson.tasks.junit.TestResult.parse(TestResult.java:167)

The test output is all included in an XML file, which includes the output of some of the steps. That output, from commands we call, may not be valid XML CDATA, or even valid UTF-8, since we're calling external commands.

The test-bot command attempts to filter out invalid characters but it's apparently not being thorough enough. We need more robust filtering or escaping.

I think the approach in Homebrew/legacy-homebrew#50092 was pretty solid, but needed different syntax, like so.

output = step.output.gsub(/[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\u{10000}-\u{10FFFF}]/, "")

And it might be better to replace filtered-out characters with a Unicode "replacement character" placeholder instead of just eliding. Then there'd still be some indication that there was other output there.

References

@apjanke
Copy link
Contributor Author

apjanke commented Apr 4, 2016

Looks like there are a couple other issues besides the pattern for valid XML characters.

The output from the test steps is buffered one byte at a time, because the IO streams for the IO.pipe are encoded as BINARY.

      begin
        pid = fork do
          read.close
          $stdout.reopen(write)
          $stderr.reopen(write)
          write.close
          working_dir.cd { exec(*@command) }
        end
        write.close
        while buf = read.read(1)
          if verbose
            print buf
            $stdout.flush
          end
          @output << buf
        end
      ensure
        read.close
      end

Since buf is a BINARY aka ASCII-8BIT stream, I think that means each individual byte will be coerced to a UTF-8 character, not code unit, when it's appended to @output, which is initialized with @output = "" and thus has UTF-8 encoding. So multibyte UTF-8 characters will be transcoded wrong. The output should be buffered as binary and only converted when fully collected (or converted in chunks with support for "remainders").

Then later on, long output is truncated. At this point, it's a UTF-8 string, so slice() takes offsets in character counts. But the bounds are calculated in terms of bytes, so they may be off, and it's possible that output with lots of multibyte characters could cause the offsets to fall outside the bounds of the string, erroneously producing nil instead of a substring. Or if there are lots of multibyte characters, the resulting truncated string could still be significantly longer than 1 MB, which could be a problem if that 1 MB is a hard limit.

            if output.bytesize > BYTES_IN_1_MEGABYTE
              output = "truncated output to 1MB:\n" \
                + output.slice(-BYTES_IN_1_MEGABYTE, BYTES_IN_1_MEGABYTE)
            end

Working on fixing those up as part of this.

@apjanke
Copy link
Contributor Author

apjanke commented Apr 4, 2016

I think I have a fix for this – https://github.com/apjanke/brew/commit/eb8db209c3a79145be49191978ef2328e602abee. But I don't know how to write a unit test for it, since test-bot is already a testing scaffold, and (I think) is dependent on the state of your local git repos.

Seems to work when I run it locally on my 10.9 and 10.11 boxes. And I found why the issue might be occurring for this case: one of the file names in the akka distribution has spaces and an oddball multibyte character in its name: akka-2.4.2/lib/akka/i have sp+�ces.jar. Looks like it's there to exercise encoding and escaping issues just like this.

When I look at the diff in the output between the current code and my patch, it shows up as different. But I'm having trouble pulling out exactly the difference, because the text editors and stuff I'm working with are probably doing their own invalid-character-replacement stuff.

4784c4784
<   inflating: akka-2.4.2/lib/akka/i have sp+�ces.jar
---
>   inflating: akka-2.4.2/lib/akka/i have sp+�ces.jar

Next up: finding a simple command-line XML validator that I can pass the "before" and "after" brew-test-bot.xml output to, to see if validation is fixed. Preferably one using the same Java XML libraries that Jenkins does. Seems like I ought to know one, but my XML-fu all involves developing big Java programs.

@apjanke
Copy link
Contributor Author

apjanke commented Apr 4, 2016

Oh, ha ha: it's just xmllint, included with OS X.

It likes my changes.

[~/tmp/test-bot]
$ xmllint --noout brew-test-bot-orig.xml
brew-test-bot-orig.xml:4784: parser error : Input is not proper UTF-8, indicate encoding !
Bytes: 0xF1 0x63 0x65 0x73
  inflating: akka-2.4.2/lib/akka/i have sp+�ces.jar
                                           ^
[✘ ~/tmp/test-bot]
$ xmllint --noout brew-test-bot-after.xml
[~/tmp/test-bot]
$

@MikeMcQuaid
Copy link
Member

@apjanke Nice work on this. When you're happy with it I think it's fine to just push to master for the test-bot.rb to run and just keep an eye on it for a few jobs.

@apjanke
Copy link
Contributor Author

apjanke commented Apr 4, 2016

Thank you!

If @xu-cheng is awake, would like to hear his feedback before pushing, since he knows test-bot.

@xu-cheng
Copy link
Member

xu-cheng commented Apr 4, 2016

I think you can push it. Looks good to me.

@apjanke
Copy link
Contributor Author

apjanke commented Apr 4, 2016

Thanks! I'm about to go to bed. I'll push it in the morning when I'll be around to monitor the test jobs for a while.

@apjanke
Copy link
Contributor Author

apjanke commented Apr 5, 2016

Pushed it. I'll be watching the test-bot for the next few hours, and running some additional builds to exercise it.

@UniqMartin
Copy link
Contributor

👏 Thanks for not only tackling this issue, but also for documenting the process so nicely! Reading this issue was very instructive and insightful.

@apjanke apjanke added the bug Reproducible Homebrew/brew bug label Apr 5, 2016
@apjanke apjanke self-assigned this Apr 5, 2016
@apjanke
Copy link
Contributor Author

apjanke commented Apr 5, 2016

I pushed the revised version. Since then, some Jenkins runs have been failing with "output not available", but they're Jenkins-level errors saying "GitHub is down for maintenance", so I'm pretty sure it's a coincidence. Will keep an eye on it.

@Homebrew Homebrew locked and limited conversation to collaborators May 3, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Reproducible Homebrew/brew bug
Projects
None yet
Development

No branches or pull requests

4 participants