CoreNLP Stanford Dependency Format


Problem Description

Bills on ports and immigration were submitted by Senator Brownback, Republican of Kansas

From the above sentence, I am looking to obtain the following typed dependencies:

nsubjpass(submitted, Bills)
auxpass(submitted, were)
agent(submitted, Brownback)
nn(Brownback, Senator)
appos(Brownback, Republican)
prep_of(Republican, Kansas)
prep_on(Bills, ports)
conj_and(ports, immigration)
prep_on(Bills, immigration)

Using the code below, I have only been able to obtain the following dependencies (this is what the code prints):

root(ROOT-0, submitted-7)
nmod:on(Bills-1, ports-3)
nmod:on(Bills-1, immigration-5)
case(ports-3, on-2)
cc(ports-3, and-4)
conj:and(ports-3, immigration-5)
nsubjpass(submitted-7, Bills-1)
auxpass(submitted-7, were-6)
nmod:agent(submitted-7, Brownback-10)
case(Brownback-10, by-8)
compound(Brownback-10, Senator-9)
punct(Brownback-10, ,-11)
appos(Brownback-10, Republican-12)
nmod:of(Republican-12, Kansas-14)
case(Kansas-14, of-13)

Question - How do I achieve the desired output above?

Code

// Annotation keys used below come from edu.stanford.nlp.ling.CoreAnnotations (SentencesAnnotation)
// and edu.stanford.nlp.semgraph.SemanticGraphCoreAnnotations (EnhancedPlusPlusDependenciesAnnotation).
public void processTestCoreNLP() {
    String text = "Bills on ports and immigration were submitted " +
            "by Senator Brownback, Republican of Kansas";

    Annotation annotation = new Annotation(text);
    Properties properties = PropertiesUtils.asProperties(
            "annotators", "tokenize,ssplit,pos,lemma,depparse"
    );

    AnnotationPipeline pipeline = new StanfordCoreNLP(properties);

    pipeline.annotate(annotation);

    for (CoreMap sentence : annotation.get(SentencesAnnotation.class)) {
        // Enhanced++ Universal Dependencies graph from the neural dependency parser
        SemanticGraph sg = sentence.get(EnhancedPlusPlusDependenciesAnnotation.class);
        Collection<TypedDependency> dependencies = sg.typedDependencies();
        for (TypedDependency td : dependencies) {
            System.out.println(td);
        }
    }
}

Answer

If you want to get the CCprocessed and collapsed Stanford Dependencies (SD) for a sentence through the NN dependency parser, you'll have to set a property to circumvent a small bug in CoreNLP.

However, please note that we are no longer maintaining the Stanford Dependencies code and unless you have really good reasons to use SD, we'd recommend using Universal Dependencies for any new projects. Take a look at the Universal Dependencies (UD) documentation and Schuster and Manning (2016) for more information on the UD representation.
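
Either way, it may help to see how the enhanced++ UD labels in the question's output line up with the SD labels in the desired output. The sketch below is purely illustrative: the pairs are read off the two listings above, Java 9+ Map.of is used for brevity, and the field name UD_TO_SD is just a placeholder.

// Illustrative only: UD labels from the actual output mapped to the corresponding
// SD labels from the desired output. Requires: import java.util.Map;
static final Map<String, String> UD_TO_SD = Map.of(
    "nmod:on",    "prep_on",
    "nmod:of",    "prep_of",
    "nmod:agent", "agent",
    "compound",   "nn",
    "conj:and",   "conj_and");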

To obtain the CCprocessed and collapsed SD representation, set the depparse.language property as follows:

public void processTestCoreNLP() {
  String text = "Bills on ports and immigration were submitted " +
        "by Senator Brownback, Republican of Kansas";

  Annotation annotation = new Annotation(text);
  Properties properties = PropertiesUtils.asProperties(
        "annotators", "tokenize,ssplit,pos,lemma,depparse");

  // Works around the CoreNLP bug mentioned above so the SD converter is used.
  properties.setProperty("depparse.language", "English");

  AnnotationPipeline pipeline = new StanfordCoreNLP(properties);

  pipeline.annotate(annotation);

  for (CoreMap sentence : annotation.get(SentencesAnnotation.class)) {
    // Collapsed, CCprocessed Stanford Dependencies
    // (edu.stanford.nlp.semgraph.SemanticGraphCoreAnnotations.CollapsedCCProcessedDependenciesAnnotation)
    SemanticGraph sg = sentence.get(CollapsedCCProcessedDependenciesAnnotation.class);
    Collection<TypedDependency> dependencies = sg.typedDependencies();
    for (TypedDependency td : dependencies) {
      System.out.println(td);
    }
  }
}
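
The desired output at the top of the question lists bare words, while typedDependencies() prints indexed tokens such as submitted-7 and Kansas-14. If you want the index-free form, a minimal sketch is below; it assumes a recent CoreNLP version in which TypedDependency.gov() and dep() return IndexedWord, and it simply skips the artificial root relation:

// Sketch: print "reln(governor, dependent)" without token indices, matching the
// format of the desired output. Assumes gov()/dep() return IndexedWord.
for (TypedDependency td : dependencies) {
  if (td.gov().index() == 0) {   // root(ROOT-0, ...): governor is the fake ROOT node
    continue;
  }
  System.out.println(td.reln() + "(" + td.gov().word() + ", " + td.dep().word() + ")");
}

With the collapsed, CCprocessed graph from the code above, this should print lines like prep_of(Republican, Kansas) instead of nmod:of(Republican-12, Kansas-14).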
