Setting feature value to the count of containing annotation in UIMA Ruta

Question

I've got a RUTA script where all the sentences have been annotated with a Sentence annotation and various words and phrases have been annotated with their own specific annotations. That all works as expected.

Each one of those annotations has a feature for the index of the sentence that contains it. So, in a contrived example, given the text

Jack and Jill went up the hill. Jack fell down.

I have a "down" annotation whose sentence index I want to set to 2, indicating that it is in the second sentence. I'm thinking of something like the following, although I know that's not correct.

Sentence{CONTAINS(Down) -> Down.sentence_index = index

where index is the index of the sentence. Is this possible with RUTA? If so, what's the appropriate script? I can do this in a separate analysis engine and have done so in the past, but I'm hoping to replace some of that with Ruta scripts.

Thanks,

Nick

Recommended answer

There are several ways to express this in UIMA Ruta. My first guess would be something like:

// just to have an executable example
DECLARE Sentence;
DECLARE Annotation Down (INT sentence_index);
((# PERIOD){-> Sentence})+;
"down" -> Down;

// the actual rule with a helper variable
INT index;
Sentence{CONTAINS(Down), CURRENTCOUNT(Sentence, index)} -> 
   {Down{-> Down.sentence_index = index};};

The rule matches all sentences that contain a Down annotation. Additionally, CURRENTCOUNT counts the Sentence annotations up to the matched position and stores the value in the variable index. Then an inlined rule (indicated by the first "->") matches all Down annotations within the matched sentence and assigns the value of the variable to the feature of each matched Down annotation. Depending on whether you want the index to start at 0 or 1, you may need to increment the assigned value:

... Down.sentence_index = (index+1)};};
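
To make the intended result concrete, here is a minimal sketch in plain Python (not Ruta, and not part of the original answer) of what the rule is meant to compute: number the sentences in order and give every occurrence of a word the number of the sentence that contains it. The function name sentence_indices and its start parameter are hypothetical, and sentences are naively assumed to end at a period, like the toy (# PERIOD) rule above.

```python
import re

def sentence_indices(text, word, start=1):
    """Return (offset, sentence_index) for every occurrence of `word`.

    Assumption: a sentence is any run of text ending in a period.
    `start` chooses 0-based or 1-based sentence numbering.
    """
    results = []
    # Each regex match spans one sentence, including its closing period.
    for count, sentence in enumerate(re.finditer(r"[^.]*\.", text), start=start):
        # Look for whole-word occurrences inside the current sentence only.
        for hit in re.finditer(rf"\b{re.escape(word)}\b", sentence.group()):
            results.append((sentence.start() + hit.start(), count))
    return results

text = "Jack and Jill went up the hill. Jack fell down."
print(sentence_indices(text, "down"))           # [(42, 2)]
print(sentence_indices(text, "down", start=0))  # [(42, 1)]
```

With start=1 the "down" in the example text gets sentence index 2, which is the value the question asks for; start=0 shows the same occurrence under zero-based numbering.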

The CURRENTCOUNT condition can also accept a min and a max value in order to act like a real condition. It is really old, so I don't know how it scales to large documents.

Here's another example, this time without the CURRENTCOUNT condition, that stores the index in the Sentence annotation:

DECLARE Annotation Sentence (INT index);
DECLARE Annotation Down (INT sentence_index);
INT index;

(# PERIOD){-> Sentence, ASSIGN(index, (index + 1)), Sentence.index = index};
PERIOD (# PERIOD){-> Sentence, ASSIGN(index, (index + 1)), Sentence.index = index};
"down" -> Down;

Sentence{CONTAINS(Down) -> ASSIGN(index, Sentence.index)} 
  ->  {Down{-> Down.sentence_index = index};};

Mind that the rule for creating Sentence annotations in the first example cannot be used here, since it produces only one rule match and its actions are applied to the matched fragments. The rules in the second example result in many rule matches, so the actions are applied before the next rule match is processed. Copying feature values between different matching scopes is not really nice, but that may be improved at some point.

If you already have Sentence annotations, you can assign the index with something like:

Sentence{-> ASSIGN(index, (index + 1)), Sentence.index = index};

Examples have been tested with UIMA Ruta 2.2.1-SNAPSHOT.

(I am a developer of UIMA Ruta)
