Using stanford parser to parse Chinese


Question


Here is my code, mostly from the demo. The program runs without errors, but the result is very wrong: it did not split the words. Thank you.

import java.io.StringReader;
import java.util.List;

import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.parser.lexparser.LexicalizedParser;
import edu.stanford.nlp.process.CoreLabelTokenFactory;
import edu.stanford.nlp.process.PTBTokenizer;
import edu.stanford.nlp.process.Tokenizer;
import edu.stanford.nlp.process.TokenizerFactory;
import edu.stanford.nlp.trees.*;

public static void main(String[] args) {
    LexicalizedParser lp = LexicalizedParser.loadModel(
        "edu/stanford/nlp/models/lexparser/xinhuaFactored.ser.gz");
    demoAPI(lp);
}

public static void demoAPI(LexicalizedParser lp) {
    // This option shows loading and using an explicit tokenizer
    String sent2 = "我爱你";
    TokenizerFactory<CoreLabel> tokenizerFactory =
        PTBTokenizer.factory(new CoreLabelTokenFactory(), "");
    Tokenizer<CoreLabel> tok =
        tokenizerFactory.getTokenizer(new StringReader(sent2));
    List<CoreLabel> rawWords2 = tok.tokenize();

    Tree parse = lp.apply(rawWords2);

    TreebankLanguagePack tlp = new PennTreebankLanguagePack();
    GrammaticalStructureFactory gsf = tlp.grammaticalStructureFactory();
    GrammaticalStructure gs = gsf.newGrammaticalStructure(parse);
    List<TypedDependency> tdl = gs.typedDependenciesCCprocessed();
    System.out.println(tdl);
    System.out.println();

    // You can also use a TreePrint object to print trees and dependencies
    TreePrint tp = new TreePrint("penn,typedDependenciesCollapsed");
    tp.printTree(parse);
}

Answer


Did you make sure to segment the words? For example, try running it again with "我 爱 你." as the sentence. I believe the parser will segment automatically when run from the command line, but I'm not sure what it does from within Java.
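A minimal sketch of the fix suggested above: pre-segment the sentence by whitespace before handing the words to the parser, since PTBTokenizer will not insert word boundaries into unsegmented Chinese. The splitting part below is plain Java; the comment about wrapping the tokens for lp.apply(...) names an API (edu.stanford.nlp.ling.Sentence.toCoreLabelList) that I believe ships with the Stanford distribution, but that is an assumption and may vary by version.

```java
import java.util.Arrays;
import java.util.List;

public class SegmentDemo {
    public static void main(String[] args) {
        // What the question passed in: no word boundaries, so the
        // tokenizer sees one opaque string.
        String unsegmented = "我爱你";

        // What the parser needs: words separated by spaces.
        String segmented = "我 爱 你";

        // Split on whitespace to recover the word list.
        List<String> words = Arrays.asList(segmented.split("\\s+"));
        System.out.println(words);  // [我, 爱, 你]

        // These strings can then be wrapped as CoreLabels and passed to
        // lp.apply(...), e.g. (API name assumed from the standard
        // Stanford NLP distribution):
        //   List<CoreLabel> rawWords =
        //       Sentence.toCoreLabelList(words.toArray(new String[0]));
        //   Tree parse = lp.apply(rawWords);
    }
}
```

If the input cannot be pre-segmented by hand, the usual route is to run it through a Chinese word segmenter first (Stanford distributes one separately) and feed its output to the parser.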

