Get certain nodes out of a Parse Tree
Question
I am working on a project involving anaphora resolution via Hobbs' algorithm. I have parsed my text using the Stanford Parser, and now I would like to manipulate the nodes in order to implement my algorithm.
At the moment, I don't understand how to:
- Access a node based on its POS tag (e.g. I need to start with a pronoun: how do I get all pronouns?).
- Use visitors. I'm a bit of a Java noob, but in C++ I needed to implement a Visitor functor and then work on its hooks. I could not find much on the Stanford Parser's Tree structure, though. Is that jgrapht? If it is, could you give me some pointers to code snippets?
Recommended answer
@dhg's answer works fine, but here are two other options that might also be useful to know about:
- The Tree class implements Iterable<Tree>. You can iterate through all the nodes of a Tree, or, strictly, the subtrees headed by each node, in a pre-order traversal, with:
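    // Needs edu.stanford.nlp.trees.Tree, java.util.List and java.util.ArrayList; t is the parsed Tree.
    List<Tree> pronouns = new ArrayList<>();
    for (Tree subtree : t) {
        // Keep each subtree whose root label is the personal-pronoun tag PRP.
        if (subtree.label().value().equals("PRP")) {
            pronouns.add(subtree);
        }
    }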
- You can also get just the nodes that satisfy some (potentially quite complex) pattern by using tregex, which behaves rather like java.util.regex by allowing pattern matches over trees. You would have something like:
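    // Needs TregexPattern and TregexMatcher from edu.stanford.nlp.trees.tregex.
    // t is the parsed Tree and pronouns is a List<Tree>.
    TregexPattern tgrepPattern = TregexPattern.compile("PRP");
    TregexMatcher m = tgrepPattern.matcher(t);
    while (m.find()) {
        // Each successful find() exposes the matched subtree.
        Tree subtree = m.getMatch();
        pronouns.add(subtree);
    }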
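To make the pre-order iteration in the first option concrete, here is a minimal, self-contained sketch. It does not use the Stanford Parser at all: ParseNode and withLabel are hypothetical stand-ins invented for illustration, and the real Tree class may well implement its iterator differently. The sketch only shows why a plain for-each loop can visit every subtree, root first and then children from left to right, without a separate visitor class.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for a parse-tree node; this is NOT Stanford's Tree class.
final class ParseNode implements Iterable<ParseNode> {
    final String label;
    final List<ParseNode> children = new ArrayList<>();

    ParseNode(String label, ParseNode... kids) {
        this.label = label;
        for (ParseNode kid : kids) {
            children.add(kid);
        }
    }

    // Pre-order: yield a node first, then its children from left to right.
    @Override
    public Iterator<ParseNode> iterator() {
        final Deque<ParseNode> stack = new ArrayDeque<>();
        stack.push(this);
        return new Iterator<ParseNode>() {
            @Override
            public boolean hasNext() {
                return !stack.isEmpty();
            }

            @Override
            public ParseNode next() {
                if (stack.isEmpty()) {
                    throw new NoSuchElementException();
                }
                ParseNode node = stack.pop();
                // Push children right to left so the leftmost child is popped next.
                for (int i = node.children.size() - 1; i >= 0; i--) {
                    stack.push(node.children.get(i));
                }
                return node;
            }
        };
    }

    // Collect every subtree whose root carries the given label, e.g. "PRP".
    static List<ParseNode> withLabel(ParseNode root, String label) {
        List<ParseNode> matches = new ArrayList<>();
        for (ParseNode subtree : root) {
            if (subtree.label.equals(label)) {
                matches.add(subtree);
            }
        }
        return matches;
    }
}
```

Calling ParseNode.withLabel(root, "PRP") on a hand-built tree returns the PRP subtrees in sentence order, which is the order Hobbs' algorithm needs when it picks a pronoun to start from. If possessive pronouns matter for your data, note that the Penn Treebank tags them PRP$, so an exact comparison against "PRP" will skip them.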