Stanford parser java error
Question
I am working on NLP research, and I would like to use the Stanford parser to extract noun phrases from text. The parser version I am using is 3.4.1. This is the sample code I used:
package stanfordparser;

import java.io.StringReader;
import java.util.Collection;
import java.util.List;

import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.ling.HasWord;
import edu.stanford.nlp.ling.Sentence;
import edu.stanford.nlp.parser.lexparser.LexicalizedParser;
import edu.stanford.nlp.process.CoreLabelTokenFactory;
import edu.stanford.nlp.process.DocumentPreprocessor;
import edu.stanford.nlp.process.PTBTokenizer;
import edu.stanford.nlp.process.Tokenizer;
import edu.stanford.nlp.process.TokenizerFactory;
import edu.stanford.nlp.trees.*;

class ParserDemo {

  public static void main(String[] args) {
    LexicalizedParser lp = LexicalizedParser.loadModel(
        "edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz");
    if (args.length > 0) {
      demoDP(lp, args[0]);
    } else {
      demoAPI(lp);
    }
  }

  public static void demoDP(LexicalizedParser lp, String filename) {
    TreebankLanguagePack tlp = new PennTreebankLanguagePack();
    GrammaticalStructureFactory gsf = tlp.grammaticalStructureFactory();
    for (List<HasWord> sentence : new DocumentPreprocessor(filename)) {
      Tree parse = lp.apply(sentence);
      parse.pennPrint();
      System.out.println();
      GrammaticalStructure gs = gsf.newGrammaticalStructure(parse);
      Collection<TypedDependency> tdl = gs.typedDependenciesCCprocessed();
      System.out.println(tdl);
      System.out.println();
    }
  }

  public static void demoAPI(LexicalizedParser lp) {
    // This option shows parsing a list of correctly tokenized words
    String[] sent = { "This", "is", "an", "easy", "sentence", "." };
    List<CoreLabel> rawWords = Sentence.toCoreLabelList(sent);
    Tree parse = lp.apply(rawWords);
    parse.pennPrint();
    System.out.println();

    // This option shows loading and using an explicit tokenizer
    String sent2 = "This is another sentence.";
    TokenizerFactory<CoreLabel> tokenizerFactory =
        PTBTokenizer.factory(new CoreLabelTokenFactory(), "");
    Tokenizer<CoreLabel> tok =
        tokenizerFactory.getTokenizer(new StringReader(sent2));
    List<CoreLabel> rawWords2 = tok.tokenize();
    parse = lp.apply(rawWords2);

    TreebankLanguagePack tlp = new PennTreebankLanguagePack();
    GrammaticalStructureFactory gsf = tlp.grammaticalStructureFactory();
    GrammaticalStructure gs = gsf.newGrammaticalStructure(parse);
    List<TypedDependency> tdl = gs.typedDependenciesCCprocessed();
    System.out.println(tdl);
    System.out.println();

    // You can also use a TreePrint object to print trees and dependencies
    TreePrint tp = new TreePrint("penn,typedDependenciesCollapsed");
    tp.printTree(parse);
  }

  private ParserDemo() {} // static methods only
}
But when I run this code I get the following error:
java.io.IOException: Unable to resolve "edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz" as either class path, filename or URL
at edu.stanford.nlp.io.IOUtils.getInputStreamFromURLOrClasspathOrFileSystem(IOUtils.java:446)
at edu.stanford.nlp.io.IOUtils.readStreamFromString(IOUtils.java:380)
at edu.stanford.nlp.parser.lexparser.LexicalizedParser.getParserFromSerializedFile(LexicalizedParser.java:628)
at edu.stanford.nlp.parser.lexparser.LexicalizedParser.getParserFromFile(LexicalizedParser.java:423)
at edu.stanford.nlp.parser.lexparser.LexicalizedParser.loadModel(LexicalizedParser.java:182)
at edu.stanford.nlp.parser.lexparser.LexicalizedParser.loadModel(LexicalizedParser.java:161)
at stanfordparser.ParserDemo.main(ParserDemo.java:29)
I think the problem is in the loading of the model file. Could anyone help me solve it? Thanks.
UPDATE (1): I have already included the CoreNLP models jar.
UPDATE (2): I am using NetBeans.
Answer
Yes, you do not have the CoreNLP models jar. You can download it from here: http://nlp.stanford.edu/software/corenlp.shtml#Download
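One quick way to confirm this diagnosis is to ask the JVM directly whether the model resource is visible on the classpath, since the stack trace shows `loadModel` failing exactly at that lookup. This is a self-contained sketch using only the JDK (the class name is illustrative; the resource path is the one from the error message):

```java
import java.io.InputStream;

// Checks whether a resource path can be resolved on the JVM classpath.
// If this prints "NOT on classpath", the models jar is missing from the
// project's runtime classpath, which matches the IOException above.
public class ModelOnClasspath {

    public static boolean isOnClasspath(String resource) {
        InputStream in = ModelOnClasspath.class.getClassLoader()
                .getResourceAsStream(resource);
        return in != null;
    }

    public static void main(String[] args) {
        String model = "edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz";
        System.out.println(isOnClasspath(model)
                ? "model found on classpath"
                : "model NOT on classpath -- add the stanford-corenlp models jar");
    }
}
```

If this reports the model as missing even though you added the models jar in NetBeans, check that the jar was added to the *run* classpath of the project, not just the compile-time libraries.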
Alternatively, you can do the following:
- Create a Maven project. (It is easy in Eclipse.)
- In the pom.xml file, add these dependencies:
<dependency>
    <groupId>edu.stanford.nlp</groupId>
    <artifactId>stanford-corenlp</artifactId>
    <version>3.5.0</version>
</dependency>
<dependency>
    <groupId>edu.stanford.nlp</groupId>
    <artifactId>stanford-corenlp</artifactId>
    <version>3.5.0</version>
    <classifier>models</classifier>
</dependency>
Run maven clean, maven update, and maven install. The model files will be installed in your .m2 folder automatically.
I hope you know Maven. If not, please post a comment / question and we will answer.
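Since the original goal was extracting noun phrases: once the model loads, the parser produces a Penn-bracketed tree (what `parse.pennPrint()` prints). As a rough illustration of what "extract noun phrases" means on that output, here is a minimal self-contained walker over a bracketed string that collects the word yield of every NP subtree. It deliberately uses no Stanford classes so it runs standalone; in a real pipeline you would instead traverse the `Tree` object with the Stanford API rather than re-parse printed output, and the class and method names here are made up for the example:

```java
import java.util.ArrayList;
import java.util.List;

// Minimal Penn-bracketed-tree walker: collects the yield (the words)
// of every subtree labeled "NP". Illustrative sketch only; not the
// Stanford API. Not thread-safe (uses a static cursor).
public class NpExtractor {
    private static int pos; // cursor into the bracketed string

    public static List<String> nounPhrases(String penn) {
        pos = 0;
        List<String> out = new ArrayList<>();
        walk(penn, out);
        return out;
    }

    // Parses one "(LABEL child child ...)" node, returns its word yield,
    // and records the yield whenever LABEL is "NP".
    private static String walk(String s, List<String> out) {
        pos++; // skip '('
        String label = readToken(s);
        StringBuilder yield = new StringBuilder();
        while (s.charAt(pos) != ')') {
            if (s.charAt(pos) == ' ') { pos++; continue; }
            String part = (s.charAt(pos) == '(') ? walk(s, out) : readToken(s);
            if (yield.length() > 0) yield.append(' ');
            yield.append(part);
        }
        pos++; // skip ')'
        if (label.equals("NP")) out.add(yield.toString());
        return yield.toString();
    }

    private static String readToken(String s) {
        int start = pos;
        while (pos < s.length() && " ()".indexOf(s.charAt(pos)) < 0) pos++;
        return s.substring(start, pos);
    }

    public static void main(String[] args) {
        String penn = "(ROOT (S (NP (DT This)) (VP (VBZ is)"
                + " (NP (DT an) (JJ easy) (NN sentence)))))";
        System.out.println(nounPhrases(penn)); // the two NP yields
    }
}
```

Nested NPs are each reported separately, which is usually what you want for phrase extraction; filter by length or label afterwards if not.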