Checking if a sentence is grammatically correct using the Stanford Parser

Question

Is there any way to check whether a sentence is grammatically correct using the Stanford Parser? So far I am able to get the parse tree of a sentence with the Stanford Parser, but I am stuck there and don't know how to proceed.

Solution

larsmans is right that those parsers are not designed for that, but here is a hack:

You can try using the parser's "confidence". Every probabilistic parser computes probabilities for the candidate tags and assigns the most probable sequence. I've tried this with a part-of-speech tagger (http://www.ark.cs.cmu.edu/TweetNLP/), where each tag is assigned a confidence (0.93, 0.45, etc.). I calculate the average confidence of all tags in a sentence and compare it to a confidence threshold (derived from other sentences in the corpus).
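The averaging and thresholding described above could be sketched roughly as follows. This is only an illustration: the function names, the sample confidences, and the "corpus mean minus k standard deviations" rule for picking the threshold are assumptions, not something the tagger provides or the answer specifies.

```python
from statistics import mean, stdev

def average_confidence(tag_confidences):
    """Mean per-tag confidence for one sentence."""
    return mean(tag_confidences)

def corpus_threshold(corpus, k=1.0):
    """Hypothetical rule: mean of the sentence averages minus k standard deviations."""
    averages = [average_confidence(sentence) for sentence in corpus]
    return mean(averages) - k * stdev(averages)

# Made-up per-tag confidences for a few reference sentences.
reference_corpus = [
    [0.93, 0.88, 0.97],
    [0.91, 0.95, 0.90, 0.99],
    [0.85, 0.92, 0.94],
]
threshold = corpus_threshold(reference_corpus)

# Made-up confidences for the sentence being checked.
candidate = [0.93, 0.45, 0.52, 0.61]
is_suspicious = average_confidence(candidate) < threshold
```

In practice the confidences would come from the tagger's output, and the threshold rule would need tuning on your own corpus.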

If the average confidence of the tags falls below the threshold, I assume the sentence is grammatically incorrect. After adding a few more heuristics, such as handling punctuation and one-word sentences, it worked for me.
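One possible shape for those heuristics is sketched below. The answer does not say how punctuation or one-word sentences were handled, so the choices here (ignore punctuation tokens, refuse to judge sentences with fewer than two words) and the name `looks_grammatical` are assumptions made for illustration.

```python
import string
from statistics import mean

def looks_grammatical(tagged_tokens, threshold, min_words=2):
    """Return True or False, or None when the sentence is too short to judge.

    tagged_tokens: list of (token, confidence) pairs taken from a tagger's output.
    """
    # Assumption: tokens made up only of punctuation are ignored.
    words = [
        (token, conf)
        for token, conf in tagged_tokens
        if not all(ch in string.punctuation for ch in token)
    ]
    # Assumption: very short sentences are left undecided.
    if len(words) < min_words:
        return None
    return mean(conf for _, conf in words) >= threshold
```

Returning None for very short input keeps "too little evidence" separate from "probably ungrammatical", which makes the threshold easier to tune.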

The Stanford Parser is probabilistic and certainly computes probabilities, but I couldn't get the confidence out of the box. You may have to dig in and see how to expose it.
