Pipeline SRL Verb Fails Weirdly #656

Closed
qiangning opened this issue May 30, 2018 · 22 comments · Fixed by #694

@qiangning
Member

I'm trying to use pipeline to process some raw text and output a TextAnnotation with verb SRL. This is kind of urgent since I'm deploying my package as an online demo and the demo paper is due on 6/1. Originally I was using curator, but the curator failed frequently (not always). Then I switched to pipeline, but my pipeline always fails. Any suggestions to work around this would be great. @danyaljj @mssammon @HornHehhf All I need is verb SRL.

I checked out the latest version of cogcomp-nlp. Here's the main function I use:

public static void main(String[] args) throws Exception{
        String text = "Helicopters patrol the temporary no-fly zone around New Jersey's MetLife Stadium Sunday, with F-16s based in Atlantic City ready to be scrambled if an unauthorized aircraft does enter the restricted airspace.";
        ResourceManager userConfig = new ResourceManager("pipeline/config/pipeline-config.properties");
        AnnotatorService pipeline = PipelineFactory.buildPipeline(userConfig);
        TextAnnotation ta = pipeline.createAnnotatedTextAnnotation( "", "", text );
        System.out.println();
    }

I got this error:

Connected to the target VM, address: '127.0.0.1:39575', transport: 'socket'
14:11:02 INFO  DepAnnotator:66 - Loading struct-perceptron-auto-20iter.model into temp file: tmp345673.model
14:11:03 INFO  SLModel:88 - Load trained Models.....
14:11:05 INFO  SLModel:97 - Load Model complete!
14:11:05 INFO  LabeledChuLiuEdmondsDecoder:72 - Loading cached PoS-to-dep dictionary from deprels.dict
14:11:06 ERROR BasicAnnotatorService:403 - The annotator for view SRL_VERB failed. Skipping the view . . . 
14:11:06 ERROR BasicAnnotatorService:403 - The annotator for view DEPENDENCY failed. Skipping the view . . . 
edu.illinois.cs.cogcomp.annotation.AnnotatorException: View 'NER_CONLL' cannot be provided by this AnnotatorService.
	at edu.illinois.cs.cogcomp.annotation.BasicAnnotatorService.addView(BasicAnnotatorService.java:308)
	at edu.illinois.cs.cogcomp.annotation.BasicAnnotatorService.addView(BasicAnnotatorService.java:313)
	at edu.illinois.cs.cogcomp.annotation.BasicAnnotatorService.addViewsAndCache(BasicAnnotatorService.java:400)
	at edu.illinois.cs.cogcomp.annotation.BasicAnnotatorService.createAnnotatedTextAnnotation(BasicAnnotatorService.java:378)
	at edu.illinois.cs.cogcomp.annotation.BasicAnnotatorService.createAnnotatedTextAnnotation(BasicAnnotatorService.java:193)
	at edu.illinois.cs.cogcomp.pipeline.main.test.main(test.java:12)
edu.illinois.cs.cogcomp.annotation.AnnotatorException: View 'SHALLOW_PARSE' cannot be provided by this AnnotatorService.
	at edu.illinois.cs.cogcomp.annotation.BasicAnnotatorService.addView(BasicAnnotatorService.java:308)
	at edu.illinois.cs.cogcomp.annotation.BasicAnnotatorService.addView(BasicAnnotatorService.java:313)
	at edu.illinois.cs.cogcomp.annotation.BasicAnnotatorService.addViewsAndCache(BasicAnnotatorService.java:400)
	at edu.illinois.cs.cogcomp.annotation.BasicAnnotatorService.createAnnotatedTextAnnotation(BasicAnnotatorService.java:378)
	at edu.illinois.cs.cogcomp.annotation.BasicAnnotatorService.createAnnotatedTextAnnotation(BasicAnnotatorService.java:193)
	at edu.illinois.cs.cogcomp.pipeline.main.test.main(test.java:12)

Disconnected from the target VM, address: '127.0.0.1:39575', transport: 'socket'

Process finished with exit code 0

@danyaljj
Member

You didn't say what the content of pipeline-config.properties is.

@qiangning
Member Author

It doesn't matter what I put in pipeline-config.properties. I first used the default pipeline-config.properties, in which every option is true, and then also the following pipeline-config.properties:

cacheDirectory  annotation-cache-test
throwExceptionIfNotCached   false
usePos	true
useDep	true
useLemma	true
useShallowParse	false
useNerConll	false
useNerOntonotes	false
useSrlVerb	true
useSrlNom	false
useCommaSrl	false
useQuantifier	false
useStanfordParse	false
useStanfordDep	false
useMention false
useRelation false

@qiangning
Member Author

qiangning commented May 30, 2018

FYI, when I use curator to add verb SRL for the same sentence, I also got an error:

public static void main(String[] args) throws Exception{
        String text = "Helicopters patrol the temporary no-fly zone around New Jersey's MetLife Stadium Sunday, with F-16s based in Atlantic City ready to be scrambled if an unauthorized aircraft does enter the restricted airspace.";
        AnnotatorService annotator = CuratorFactory.buildCuratorClient();
        TextAnnotation ta = annotator.createBasicTextAnnotation("","", text);
        annotator.addView(ta, ViewNames.SRL_VERB);
        System.out.println();
    }

Connected to the target VM, address: '127.0.0.1:38537', transport: 'socket'
Disconnected from the target VM, address: '127.0.0.1:38537', transport: 'socket'
Exception in thread "main" edu.illinois.cs.cogcomp.annotation.AnnotatorException
	at edu.illinois.cs.cogcomp.curator.CuratorAnnotator.addView(CuratorAnnotator.java:55)
	at edu.illinois.cs.cogcomp.annotation.Annotator.lazyAddView(Annotator.java:203)
	at edu.illinois.cs.cogcomp.annotation.Annotator.getView(Annotator.java:167)
	at edu.illinois.cs.cogcomp.core.datastructures.textannotation.TextAnnotation.addView(TextAnnotation.java:109)
	at edu.illinois.cs.cogcomp.curator.CuratorAnnotatorService.addView(CuratorAnnotatorService.java:257)
	at edu.illinois.cs.cogcomp.curator.CuratorAnnotatorService.addView(CuratorAnnotatorService.java:255)
	at edu.illinois.cs.cogcomp.pipeline.main.test.main(test.java:17)

But curator can sometimes successfully process other sentences (e.g., "I like you." succeeds, but "He likes you." fails). I don't know yet what determines whether curator fails or succeeds.

@danyaljj
Member

Try dropping the config file:

PipelineFactory.buildPipeline();

@qiangning
Member Author

qiangning commented May 30, 2018

String text = "He likes you.";
AnnotatorService pipeline = PipelineFactory.buildPipeline();
TextAnnotation ta = pipeline.createAnnotatedTextAnnotation( "", "", text );

The above gives me only TOKENS and SENTENCE views.

I'm now trying this: AnnotatorService pipeline = PipelineFactory.buildPipeline(ViewNames.POS,ViewNames.LEMMA,ViewNames.SHALLOW_PARSE,ViewNames.NER_CONLL,ViewNames.SRL_VERB); Stay tuned for what comes out.

@danyaljj
Member

PipelineFactory.buildPipeline(ViewNames.SRL_VERB) should suffice on its own.

@qiangning
Member Author

It keeps asking me for new views. PipelineFactory.buildPipeline(ViewNames.SRL_VERB) says no POS, PipelineFactory.buildPipeline(ViewNames.POS,ViewNames.SRL_VERB) says no LEMMA... I added views according to the errors and ended up with AnnotatorService pipeline = PipelineFactory.buildPipeline(ViewNames.POS,ViewNames.LEMMA,ViewNames.PARSE_STANFORD,ViewNames.SHALLOW_PARSE,ViewNames.NER_CONLL,ViewNames.SRL_VERB);

Then I got this error: Exception in thread "main" java.lang.OutOfMemoryError: Java heap space

I have been using -Xmx20g.

@schen149
Contributor

I ran into similar problems a month ago. I think VerbSRL also depends on the PARSE_STANFORD view, so maybe try setting useStanfordParse to true in pipeline-config.properties?

Here are the required views for SemanticRoleLabeler (VerbSRL) class: [POS, NER_CONLL, LEMMA, SHALLOW_PARSE, PARSE_STANFORD]
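To avoid discovering prerequisites one error at a time, the list above can be expanded programmatically before building the pipeline. The following is only a sketch: `ViewPrerequisites`, `REQUIRED`, and `expand` are hypothetical names and not part of cogcomp-nlp, and the only prerequisite data it contains is the SRL_VERB list quoted above. Prerequisites of the other views are assumed to be unknown and are left empty.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper (not a cogcomp-nlp class): expands a target view into the views to request. */
class ViewPrerequisites {
    // Prerequisites as reported in this thread. Views without an entry are
    // treated as having no known prerequisites (an assumption of this sketch).
    static final Map<String, List<String>> REQUIRED = new LinkedHashMap<>();
    static {
        REQUIRED.put("SRL_VERB",
                Arrays.asList("POS", "NER_CONLL", "LEMMA", "SHALLOW_PARSE", "PARSE_STANFORD"));
    }

    /** Returns the prerequisites first and the target last, without duplicates. */
    static List<String> expand(String target) {
        Set<String> ordered = new LinkedHashSet<>();
        visit(target, ordered, new LinkedHashSet<>());
        return new ArrayList<>(ordered);
    }

    // Depth-first traversal: a view is added only after all of its prerequisites.
    private static void visit(String view, Set<String> ordered, Set<String> inProgress) {
        if (ordered.contains(view)) {
            return;
        }
        if (!inProgress.add(view)) {
            throw new IllegalStateException("Cyclic prerequisite: " + view);
        }
        for (String dep : REQUIRED.getOrDefault(view, Collections.<String>emptyList())) {
            visit(dep, ordered, inProgress);
        }
        inProgress.remove(view);
        ordered.add(view);
    }

    public static void main(String[] args) {
        System.out.println(expand("SRL_VERB"));
    }
}
```

The resulting names correspond to the ViewNames constants that are passed to PipelineFactory.buildPipeline(...) elsewhere in this thread.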

@qiangning
Member Author

@schen149 Exactly. You see I'm already using AnnotatorService pipeline = PipelineFactory.buildPipeline(ViewNames.POS,ViewNames.LEMMA,ViewNames.PARSE_STANFORD,ViewNames.SHALLOW_PARSE,ViewNames.NER_CONLL,ViewNames.SRL_VERB);

However, do you know why I'm getting a java.lang.OutOfMemoryError: Java heap space error even though I'm using -Xmx20g?

@danyaljj
Member

20g is not enough, unfortunately. You probably need around 25g.

@qiangning
Member Author

FYI, the curator failed because of this error: ServiceUnavailableException(reason:chunk unavailable:java.net.ConnectException: Connection refused (Connection refused))
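A "Connection refused" like this can be confirmed independently of the curator client with a plain TCP connection attempt. This is a minimal sketch using only the JDK: `PortProbe` and `isReachable` are hypothetical names, and the host and port must be supplied by the caller, since the thread does not say where the chunker service listens.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper: checks whether a TCP port accepts connections. */
class PortProbe {
    /** Returns true if a TCP connection to host:port succeeds within the timeout. */
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Includes java.net.ConnectException ("Connection refused") and timeouts.
            return false;
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: PortProbe <host> <port>");
            return;
        }
        // The host and port come from the caller; none are assumed here.
        boolean up = isReachable(args[0], Integer.parseInt(args[1]), 2000);
        System.out.println(args[0] + ":" + args[1] + (up ? " is reachable" : " is NOT reachable"));
    }
}
```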

@qiangning
Member Author

28g is still not enough...

@cogcomp-dev

cogcomp-dev commented May 30, 2018 via email

@qiangning
Member Author

qiangning commented May 30, 2018

@mssammon You know what, you saved my ass. Thanks very much! It works now. Now I see
[image]

Does that mean chunker is back? What would I see if chunker was offline?

@cogcomp-dev

processid would be empty, and the whole line would be red

@qiangning
Member Author

Thanks.

Still, please keep this issue open for a while. Let me try it on a server to see if increasing the memory size really solves this problem.

@benman1

benman1 commented Aug 3, 2018

I have the same problem. Even if I assign lots of memory, it's failing. I haven't managed to get it to start so far. I tried Xmx2M to Xmx64G. I get something like this every time:

OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x0000000715300000, 1147142144, 0) failed; error='Not enough space' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 1147142144 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /home/ben/prototyping/cogcomp-test/hs_err_pid9501.log

@mssammon
Contributor

mssammon commented Aug 3, 2018

@qiangning, @benman1 Are you launching from the command line? If so, are you running on a shared server? How much memory is available on the host server?

@qiangning
Member Author

I was running from intellij on my local desktop, which has 32G of memory.

@mssammon
Contributor

mssammon commented Aug 3, 2018

I'm not sure what mechanism Intellij uses for managing memory, but it is possible that intellij itself has only a limited amount of memory to allocate that puts a ceiling on what you can specify for child processes that it launches. You could try modifying the memory available to Intellij.
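One way to check whether the requested -Xmx value actually reached the launched process is to ask the JVM itself. This is a small sketch using only standard JDK calls; the class name `HeapCheck` is made up for illustration.

```java
import java.lang.management.ManagementFactory;

/** Prints the heap ceiling and JVM arguments that the running process actually received. */
class HeapCheck {
    /** Maximum amount of heap the JVM will attempt to use, in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap (MiB): " + maxHeapMiB());
        // JVM options as seen by this process, e.g. whether -Xmx was passed through by the IDE.
        System.out.println("JVM arguments: "
                + ManagementFactory.getRuntimeMXBean().getInputArguments());
    }
}
```

Running this with the same run configuration as the pipeline test shows whether the heap limit seen by the process matches the one that was requested.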

@mssammon
Contributor

mssammon commented Aug 3, 2018

@qiangning qiangning self-assigned this Aug 11, 2018
@qiangning
Member Author

I ran this on bronte, via the command line instead of intellij, and it works. Details are below.

The test code I use is this:

public class test {
    public static void main(String[] args) throws Exception{
        String text = "John likes you.";
        ResourceManager userConfig = new ResourceManager("config/pipeline-config.properties");
        AnnotatorService pipeline = PipelineFactory.buildPipeline(userConfig);
        TextAnnotation ta = pipeline.createAnnotatedTextAnnotation( "", "", text );
        System.out.println(ta.getAvailableViews());
    }
}

The config file is:

cacheDirectory  annotation-cache-test
throwExceptionIfNotCached   false
usePos	true
useDep	true
useLemma	true
useShallowParse	true
useNerConll	true
useNerOntonotes	false
useSrlVerb	true
useSrlNom	false
useCommaSrl	false
useQuantifier	false
useStanfordParse	true
useStanfordDep	false
useMention false
useRelation false

The output is:

[PARSE_STANFORD, SRL_VERB, NER_CONLL, LEMMA, TOKENS, SENTENCE, BROWN_CLUSTERS_1000, DEPENDENCY, CLAUSES_STANFORD, POS, SHALLOW_PARSE, DEPENDENCY_HEADFINDER:PARSE_STANFORD]

which looks good to me.

The problem happens when I'm not using our servers, which often have more than 100G of memory. When I created this issue, I was using my local desktop (linux; 32G memory), and it failed with the error message java.lang.OutOfMemoryError: Java heap space.

Now I tried my mac (16G memory), and it failed with the error message java.lang.OutOfMemoryError: GC overhead limit exceeded, which means the garbage collector is spending most of its time trying to free memory while reclaiming very little.

I think the conclusion is that memory consumption is so high that computers without a very large amount of memory cannot handle it. I monitored VIRT (virtual memory) while running the program on bronte and found that its peak value is about 36G.
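VIRT covers more than the Java heap, so a heap-only figure from inside the JVM can complement it. Below is a minimal sketch using the standard java.lang.management API; `PeakHeap` is a hypothetical class name, the allocation loop is only a placeholder for the real pipeline call, and summing per-pool peaks gives an upper bound because the pools need not peak at the same moment.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

/** Reports an upper bound on the peak heap usage of the current JVM. */
class PeakHeap {
    /** Sum of the peak usage of all heap memory pools, in bytes. */
    static long peakHeapBytes() {
        long total = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip metaspace, code cache, and other non-heap pools
            }
            MemoryUsage peak = pool.getPeakUsage();
            if (peak != null) { // null if the pool is no longer valid
                total += peak.getUsed();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Placeholder workload; the pipeline call would go here instead.
        byte[][] blocks = new byte[64][];
        for (int i = 0; i < blocks.length; i++) {
            blocks[i] = new byte[1 << 20];
        }
        System.out.println("Peak heap (MiB): " + peakHeapBytes() / (1024L * 1024L));
    }
}
```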

I'd suggest closing this issue. However, we should

  • Make a special note in the relevant readme files about this high memory consumption.
  • Investigate (in a separate issue) why SRL consumes so much memory. For example, I have been wondering why NER is needed in SRL. It seems to me that NER is taking a lot of memory.

qiangning added a commit to qiangning/illinois-cogcomp-nlp that referenced this issue Sep 24, 2018
cogcomp-dev pushed a commit that referenced this issue Sep 27, 2018
readme update for memory usage of Verb SRL in pipeline (close #656); ChunkerTrain bug fixed and model updated (close #685)