Stanford NNDep parser: java.lang.ArrayIndexOutOfBoundsException


Question

After training a model, I'm trying to parse the test treebank. Unfortunately, this error keeps popping up:

Loading depparse model file: nndep.model.txt.gz ...
###################
#Transitions: 77
#Labels: 38
ROOTLABEL: root
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 25
        at edu.stanford.nlp.parser.nndep.Classifier.preCompute(Classifier.java:663)
        at edu.stanford.nlp.parser.nndep.Classifier.preCompute(Classifier.java:637)
        at edu.stanford.nlp.parser.nndep.DependencyParser.initialize(DependencyParser.java:1151)
        at edu.stanford.nlp.parser.nndep.DependencyParser.loadModelFile(DependencyParser.java:589)
        at edu.stanford.nlp.parser.nndep.DependencyParser.loadModelFile(DependencyParser.java:493)
        at edu.stanford.nlp.parser.nndep.DependencyParser.main(DependencyParser.java:1245)

If the pre-trained English model that ships with the NLP package is used, the error does not appear, so maybe something is wrong with the trained model? There were no errors during training, however. 500 iterations were done (the default 20000 takes over 15 hours on my 2.33 GHz Core 2 Duo CPU with 4 GB RAM; is that amount of time normal, by the way?). The train, dev and test sets are UD 1.2; the word embeddings used are these. It seems that this error happens when a non-English treebank is used for training (tried Swedish and Polish UD; the -tlp option is not set, using UniversalEnglish).

Recommended answer

Answering my own question, with a hint from a comment by @Jon Gauthier. It turns out that the -embeddingSize flag is also needed at the parsing stage if it was used during training (that is, if a value other than the default 50 was used). The documentation never says so, and in fact only mentions the flag in connection with the training phase. The error message in the question does cryptically hint at the origin of the error, though: it displays "25", which was the dimensionality of the word embeddings used.
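
To see why the exception would report exactly 25, here is a toy sketch of the mismatch. It is not the actual Classifier.preCompute code: the class EmbeddingSizeMismatch and the method sumEmbeddings are hypothetical stand-ins, and the assumption is simply that some loop walks each embedding row up to the configured embeddingSize (default 50) while the loaded rows only hold the 25 values the model was trained with.

```java
import java.util.Arrays;

// Toy illustration only; names are hypothetical and not part of Stanford CoreNLP.
class EmbeddingSizeMismatch {

    // Stand-in for a precomputation step: reads the first `embeddingSize`
    // values of every embedding row, trusting the configured size.
    static double sumEmbeddings(double[][] embeddings, int embeddingSize) {
        double total = 0.0;
        for (double[] row : embeddings) {
            for (int j = 0; j < embeddingSize; j++) {
                total += row[j]; // fails at j == row.length if the size is too large
            }
        }
        return total;
    }

    public static void main(String[] args) {
        int trainedSize = 25;  // dimensionality used during training
        int defaultSize = 50;  // default mentioned in the answer above

        // Three embedding rows of 25 values each, all set to 1.0.
        double[][] embeddings = new double[3][trainedSize];
        for (double[] row : embeddings) {
            Arrays.fill(row, 1.0);
        }

        try {
            // Parsing without -embeddingSize: the default 50 is assumed.
            sumEmbeddings(embeddings, defaultSize);
        } catch (ArrayIndexOutOfBoundsException e) {
            // The first invalid index is 25, the true row length.
            System.out.println("Mismatch: " + e.getMessage());
        }

        // Passing the size used at training time works.
        System.out.println("Match: " + sumEmbeddings(embeddings, trainedSize));
    }
}
```

Under that assumption, the first out-of-range index is the true row length, which is consistent with the "25" in the stack trace. The exact wording of the exception message varies between Java versions.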
