Is there a C# utility for matching patterns in (syntactic parse) trees?

Question

I'm working on a Natural Language Processing (NLP) project in which I use a syntactic parser to create a syntactic parse tree out of a given sentence.

Example Input: I ran into Joe and Jill and then we went shopping
Example Output: [TOP [S [S [NP [PRP I]] [VP [VBD ran] [PP [IN into] [NP [NNP Joe] [CC and] [NNP Jill]]]]] [CC and] [S [ADVP [RB then]] [NP [PRP we]] [VP [VBD went] [NP [NN shopping]]]]]]

I'm looking for a C# utility that will let me do complex queries like:

  • Get the first VBD relating to "Joe"
  • Get the NP closest to "shopping"

Here's a Java utility that does this; I'm looking for a C# equivalent.
Any help would be much appreciated.

Recommended answer

One option would be to parse the parser output in C# and then encode it as XML: each node emits string.Format("<{0}>", this.Name), then all of its child nodes recursively, then string.Format("</{0}>", this.Name).
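A minimal sketch of that step might look like the following. The Node and BracketParser types are hypothetical names introduced here for illustration, not part of any library. The sketch assumes the bracketed format shown in the question and assumes every label (TOP, NP, VBD, ...) is a valid XML element name; tags such as "PRP$" or "." would need to be mapped to safe names first.

```csharp
using System;
using System.Collections.Generic;
using System.Security;
using System.Text;

// Hypothetical tree node: a label (e.g. "NP") plus either child nodes or a word.
public sealed class Node
{
    public string Name { get; }
    public string Text { get; set; }                         // set on leaves only, e.g. "Joe"
    public List<Node> Children { get; } = new List<Node>();

    public Node(string name) { Name = name; }

    // Emits <Name>, then the word and/or the children recursively, then </Name>.
    public string ToXml()
    {
        var sb = new StringBuilder();
        sb.Append(string.Format("<{0}>", Name));
        if (Text != null) sb.Append(SecurityElement.Escape(Text));
        foreach (var child in Children) sb.Append(child.ToXml());
        sb.Append(string.Format("</{0}>", Name));
        return sb.ToString();
    }
}

// Hypothetical recursive-descent parser for "[LABEL child child ...]".
public static class BracketParser
{
    public static Node Parse(string input)
    {
        int pos = 0;
        return ParseNode(input, ref pos);
    }

    private static Node ParseNode(string s, ref int pos)
    {
        SkipSpaces(s, ref pos);
        if (pos >= s.Length || s[pos] != '[')
            throw new FormatException("Expected '[' at position " + pos);
        pos++;                                               // consume '['
        SkipSpaces(s, ref pos);
        var node = new Node(ReadToken(s, ref pos));          // the label
        while (true)
        {
            SkipSpaces(s, ref pos);
            if (pos >= s.Length) throw new FormatException("Unbalanced brackets");
            if (s[pos] == ']') { pos++; return node; }       // end of this node
            if (s[pos] == '[') node.Children.Add(ParseNode(s, ref pos));
            else node.Text = ReadToken(s, ref pos);          // a leaf word
        }
    }

    private static string ReadToken(string s, ref int pos)
    {
        int start = pos;
        while (pos < s.Length && s[pos] != '[' && s[pos] != ']' && !char.IsWhiteSpace(s[pos]))
            pos++;
        if (pos == start) throw new FormatException("Expected a token at position " + pos);
        return s.Substring(start, pos - start);
    }

    private static void SkipSpaces(string s, ref int pos)
    {
        while (pos < s.Length && char.IsWhiteSpace(s[pos])) pos++;
    }
}
```

With these types, BracketParser.Parse("[NP [NNP Joe] [CC and] [NNP Jill]]").ToXml() produces one element per tree node, which any XML or HTML query tool can then consume.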

After you do this, I would use a tool for querying XML/HTML to query the tree. Thousands of people already use query selectors and jQuery to navigate tree-like structures based on the relations between nodes. I think this is far superior to TRegex or other outdated and unmaintained Java utilities.

For example, this answers your first query:

using System;
using System.Linq;
using CsQuery;

// d is the root node of the parsed tree; ToXml() is the recursive serializer described above
var xml = CQ.Create(d.ToXml());
// This can be simpler with CSS selectors, but I chose LINQ since you'll probably find it easier
// Find Joe, in our case the node that has the text "Joe"
var joe = xml["*"].First(x => x.InnerHTML.Equals("Joe"));
// Find the last (deepest) element that meets the criteria: it contains "Joe" and it contains a VBD
// In our case that is the VP
var closestToVbd = xml["*"].Last(x => x.Cq().Has(joe).Has("VBD").Any());
Console.WriteLine("Closest node to VBD:\n " + closestToVbd.OuterHTML);
// If we want the VBD itself, we can just find the VBD inside that element
Console.WriteLine("\n\n VBD itself is " + closestToVbd.Cq().Find("VBD")[0].OuterHTML);

And this is for your second query:

// Now for the NP closest to "shopping": find the element with the text "shopping", then its closest NP
var closest = xml["*"].First(x => x.InnerHTML.Equals("shopping")).Cq()
                      .Closest("NP")[0].OuterHTML;
Console.WriteLine("\n\n NP closest to shopping is: " + closest);
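If adding an HTML query library is not an option, the same two queries can probably be approximated with LINQ to XML from the .NET base class library. This is only a sketch under assumptions: TreeQueries and its method names are hypothetical, and the input is assumed to be the one-element-per-node XML described in this answer, loaded with XElement.Parse.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helpers; not part of any library.
public static class TreeQueries
{
    // Finds the leaf element whose text equals the given word.
    private static XElement FindLeaf(XElement root, string word)
    {
        return root.DescendantsAndSelf().First(e => !e.HasElements && e.Value == word);
    }

    // Walks up from the word (nearest ancestor first) and returns the first VBD
    // found under the deepest ancestor that contains one, or null if there is none.
    public static XElement FirstVbdRelatedTo(XElement root, string word)
    {
        return FindLeaf(root, word)
            .Ancestors()
            .Select(a => a.Descendants("VBD").FirstOrDefault())
            .FirstOrDefault(v => v != null);
    }

    // Returns the nearest NP that is the word's own element or one of its ancestors,
    // mirroring the behavior of a jQuery-style Closest("NP").
    public static XElement ClosestNp(XElement root, string word)
    {
        return FindLeaf(root, word).AncestorsAndSelf("NP").FirstOrDefault();
    }
}
```

Usage would be along the lines of var root = XElement.Parse(xmlString); followed by TreeQueries.FirstVbdRelatedTo(root, "Joe") and TreeQueries.ClosestNp(root, "shopping"). Note that "relating to" and "closest" are interpreted here purely as tree ancestry, the same interpretation the CsQuery snippets use.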
