kenlm State length mis-match #31

Open
ttpro1995 opened this issue Dec 14, 2017 · 3 comments

Comments

@ttpro1995

I replaced the bundled KenLM with the newest version (cloned from their GitHub) and built it with gradle compileKenLM.

When I run step 2 with KenLM, .online.stdout shows the following:


Done loading phrase table: /data/20171214/config/dev.tables/phrase-table.gz (mem used: 71 MiB time: 0.253 s)
Longest foreign phrase: 5
Loading extended Moses Lexical Reordering Table: dev.tables/lo-hier.msd2-bidirectional-fe.gz
Done loading reordering table: dev.tables/lo-hier.msd2-bidirectional-fe.gz (mem used: 71 MiB time: 0.137s)
Hierarchical reordering model:
Distinguish between left and right discontinuous: true
Use containment orientation: false
Forward orientation: hierarchical
Backward orientation: hierarchical
Non-NPLM /data/trained_model/kenlm/20171124/20171124_lm_train_data.bin
[ERROR] 2017-12-14 16:46:32.184 [pool-2-thread-1] KenLMState - State length mis-match: 1 vs. 205
[ERROR] 2017-12-14 16:46:32.184 [pool-2-thread-3] KenLMState - State length mis-match: 1 vs. 205
[ERROR] 2017-12-14 16:46:32.184 [pool-2-thread-4] KenLMState - State length mis-match: 1 vs. 205
[ERROR] 2017-12-14 16:46:32.184 [pool-2-thread-2] KenLMState - State length mis-match: 1 vs. 205
java.lang.RuntimeException: Bad state length returned from KenLM query
	at edu.stanford.nlp.mt.lm.KenLMState.<init>(KenLMState.java:39)
	at edu.stanford.nlp.mt.lm.KenLanguageModel.score(KenLanguageModel.java:167)
	at edu.stanford.nlp.mt.decoder.feat.base.NGramLanguageModelFeaturizer.ruleFeaturize(NGramLanguageModelFeaturizer.java:162)
	at edu.stanford.nlp.mt.decoder.feat.FeatureExtractor.ruleFeaturize(FeatureExtractor.java:196)
	at edu.stanford.nlp.mt.tm.ConcreteRule.<init>(ConcreteRule.java:91)
	at edu.stanford.nlp.mt.tm.AbstractPhraseGenerator.getRules(AbstractPhraseGenerator.java:60)
	at edu.stanford.nlp.mt.tm.CombinedTranslationModel.getRules(CombinedTranslationModel.java:201)
	at edu.stanford.nlp.mt.decoder.AbstractBeamInferer.getRules(AbstractBeamInferer.java:115)
	at edu.stanford.nlp.mt.decoder.CubePruningDecoder.decode(CubePruningDecoder.java:130)
	at edu.stanford.nlp.mt.decoder.AbstractBeamInferer.nbest(AbstractBeamInferer.java:193)
	at edu.stanford.nlp.mt.decoder.AbstractBeamInferer.nbest(AbstractBeamInferer.java:95)
	at edu.stanford.nlp.mt.Phrasal.decode(Phrasal.java:1425)
	at edu.stanford.nlp.mt.tune.OnlineTuner$GradientProcessor.process(OnlineTuner.java:493)
	at edu.stanford.nlp.mt.tune.OnlineTuner$GradientProcessor.process(OnlineTuner.java:450)
	at edu.stanford.nlp.util.concurrent.MulticoreWrapper$CallableJob.call(MulticoreWrapper.java:255)
	at edu.stanford.nlp.util.concurrent.MulticoreWrapper$CallableJob.call(MulticoreWrapper.java:236)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
[The identical exception and stack trace appear three more times in the log.]

@ttpro1995
Author

ttpro1995 commented Dec 14, 2017

When I run with the Java loader, it seems to work fine:

Done loading phrase table: /data/20171214/config/zing.tables/phrase-table.gz (mem used: 71 MiB time: 0.190 s)
Longest foreign phrase: 5
Loading extended Moses Lexical Reordering Table: dev.tables/lo-hier.msd2-bidirectional-fe.gz
Done loading reordering table: dev.tables/lo-hier.msd2-bidirectional-fe.gz (mem used: 71 MiB time: 0.120s)
Hierarchical reordering model:
Distinguish between left and right discontinuous: true
Use containment orientation: false
Forward orientation: hierarchical
Backward orientation: hierarchical
Reading 262144 1-grams...
Reading 8388608 2-grams...
Reading 67108864 3-grams...

@pdhung3012

Hello. What do you mean by the Java loader? Do you still run phrasal.sh?

@ttpro1995
Author

ttpro1995 commented Mar 9, 2019

You pick the loader in the .ini file.
Look at the example here:
https://github.com/stanfordnlp/phrasal/blob/be69585b62d75d2bf1092bd534a4c5602ea34b63/example/example.ini (lines 14-17)

# The 'kenlm:' enables the KenLM loader. Remove the
# prefix for the standard Java ARPA loader.
[lmodel-file]
kenlm:/home/me/phrasal.fr-en/4gm.bin

So, if you want the Java loader, remove the "kenlm:" prefix.
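
To make the prefix rule concrete, here is a minimal Python sketch of how an [lmodel-file] value could be read. It is not Phrasal's actual code: the names parse_lm_spec and use_java_loader and the loader labels "kenlm" and "java-arpa" are made up for illustration. The only thing taken from example.ini is that a leading "kenlm:" selects the KenLM loader and no prefix selects the standard Java ARPA loader.

```python
from typing import NamedTuple

# Prefix taken from the example.ini comment quoted above.
KENLM_PREFIX = "kenlm:"


class LMSpec(NamedTuple):
    loader: str  # "kenlm" or "java-arpa" (labels invented for this sketch)
    path: str    # file path with any loader prefix stripped


def parse_lm_spec(value: str) -> LMSpec:
    """Split an [lmodel-file] value into (loader, path) by its prefix."""
    value = value.strip()
    if value.startswith(KENLM_PREFIX):
        return LMSpec("kenlm", value[len(KENLM_PREFIX):])
    return LMSpec("java-arpa", value)


def use_java_loader(value: str) -> str:
    """Return the value with any 'kenlm:' prefix removed."""
    return parse_lm_spec(value).path


if __name__ == "__main__":
    spec = "kenlm:/home/me/phrasal.fr-en/4gm.bin"
    print(parse_lm_spec(spec))
    print(use_java_loader(spec))
```

The sketch only covers the prefix handling. Whether a given model file is in a format the chosen loader can read is a separate question that it does not check.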
