Nokogiri::XML::SAX::PushParser incompatibility with JRuby in Nokogiri 1.5.0 #615

Closed
tucker250 opened this issue Feb 13, 2012 · 5 comments


@tucker250

There is a bug in Nokogiri 1.5.0's Nokogiri::XML::SAX::PushParser that raises the exception below under JRuby 1.6.6, 1.6.5, and 1.6.4. It does not occur under vanilla Ruby, nor under JRuby with Nokogiri 1.4.7. Here is a sample script that reproduces the problem: https://gist.github.com/1819696

ArrayList.java:547:in `RangeCheck': java.lang.IndexOutOfBoundsException: Index: 16, Size: 16
    from ArrayList.java:322:in `get'
    from PushInputStream.java:372:in `put'
    from PushInputStream.java:94:in `write'
    from PushInputStream.java:105:in `writeAndWaitForRead'
    from XmlSaxPushParser.java:124:in `native_write'
    from XmlSaxPushParser$i$2$0$native_write.gen:65535:in `call'
    from CachingCallSite.java:332:in `cacheAndCall'
    from CachingCallSite.java:203:in `call'
    from FCallTwoArgNode.java:38:in `interpret'
    from NewlineNode.java:104:in `interpret'
    from ASTInterpreter.java:75:in `INTERPRET_METHOD'
    from InterpretedMethod.java:190:in `call'
    from DefaultMethod.java:199:in `call'
    from AliasMethod.java:61:in `call'
    from CachingCallSite.java:312:in `cacheAndCall'
    from CachingCallSite.java:169:in `call'
    from ShiftLeftCallSite.java:24:in `call'
    from swipely/tmp/nokogiri-broken.rb:7:in `__file__'
    from swipely/tmp/nokogiri-broken.rb:-1:in `load'
    from Ruby.java:695:in `runScript'
    from Ruby.java:688:in `runScript'
    from Ruby.java:595:in `runNormally'
    from Ruby.java:444:in `runFromMain'
    from Main.java:344:in `doRunFromMain'
    from Main.java:256:in `internalRun'
    from Main.java:222:in `run'
    from Main.java:206:in `run'
    from Main.java:186:in `main'

One example of user impact is that this breaks the implementation of AWS S3 access in the latest version of Fog (v1.1.2).

@yokolet
Member

yokolet commented Feb 20, 2012

Thanks for reporting. It seems the SAX push parser has many bugs, not only this one. I'll fix it.

@nirvdrum

To add a bit more context, this is still an issue on JRuby 1.6.7 with Nokogiri 1.5.3rc2. It appears the parser isn't creating as many Blocks as there are Segments. I'll try to take a crack at it, but I'm having difficulty building the gem at the moment (working that out on the user list).

@nirvdrum

Oh, and I'm trying to work up a test case, but I'm seeing this with fog, which makes pretty heavy use of the SAX parser. I just can't expose my AWS keys for the example.

@yokolet
Member

yokolet commented Apr 11, 2012

I pushed the change yesterday.

I rewrote the push parser extensively and stopped using PushInputStream. I no longer get the exception.

However, the gist never returns. This is because the pure-Java version uses a thread that keeps waiting for upcoming input. The thread stops when the finish method is called or when an exception is raised. So this gist, https://gist.github.com/2360551 , stops the running thread and exits.

I'm thinking of adding a timeout feature, but not now.
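
To illustrate the behavior described above, here is a minimal sketch of a push parser backed by a worker thread that waits for input until it is told to finish. It is a toy model using only Ruby's standard `Thread` and `Queue`, not Nokogiri's actual implementation; `ToyPushParser`, `DONE`, and `running?` are hypothetical names made up for this example.

```ruby
# Toy model of a push parser backed by a worker thread.
# Hypothetical class for illustration only; not Nokogiri's implementation.
class ToyPushParser
  DONE = Object.new # sentinel that tells the worker to stop

  attr_reader :chunks

  def initialize
    @queue  = Queue.new
    @chunks = []
    # The worker blocks on Queue#pop until a chunk or the sentinel arrives.
    @worker = Thread.new do
      loop do
        chunk = @queue.pop
        break if chunk.equal?(DONE)
        @chunks << chunk
      end
    end
  end

  # Feed one chunk of input; returns self so calls can be chained.
  def <<(chunk)
    @queue << chunk
    self
  end

  # Signal end of input and wait for the worker thread to exit.
  def finish
    @queue << DONE
    @worker.join
    self
  end

  # True while the worker thread has not yet exited.
  def running?
    @worker.alive?
  end
end

parser = ToyPushParser.new
parser << "<root>" << "</root>"
still_running = parser.running? # the worker is still waiting for more input
parser.finish                   # without this call the worker never stops by itself
puts parser.chunks.join
```

Until `finish` is called, the worker stays alive waiting for the next chunk. In this Ruby sketch the interpreter would still kill the worker when the main thread exits; the hang described above presumably arises because a waiting Java thread can keep the JVM from exiting, which is an assumption about the pure-Java implementation rather than something this sketch demonstrates.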

Does this behavior work for you?

@yokolet
Member

yokolet commented Jun 13, 2012

I'm going to close this issue since the new push parser doesn't raise the exception for the given example.

If there is still an issue, please reopen it.

@yokolet yokolet closed this as completed Jun 13, 2012