Checking English Grammar with NLTK


Question

I'm starting to use the NLTK library, and I want to check whether an English sentence is grammatically correct.

Examples:

"He see Bob" - not correct

"He sees Bob" - correct

I read this, but it's quite hard for me. I need an easier example.

Recommended answer

Grammar checking is an active area of NLP research, so there isn't a 100% answer (maybe not even an 80% answer) at this time. The simplest approach (or at least a reasonable baseline) would be an n-gram language model: normalize the LM probabilities for utterance length and set a heuristic threshold that separates 'grammatical' from 'ungrammatical'.
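To make the baseline concrete, here is a minimal sketch in plain Python: a bigram model with add-one smoothing, a score averaged over the number of bigrams (the length normalization), and a cutoff. The toy corpus, the helper names, and the `THRESHOLD` value are all assumptions made up for illustration; a usable model would need far more training text and a threshold tuned on held-out data.

```python
import math
from collections import Counter

def train_bigram_counts(sentences):
    """Count unigrams and bigrams over tokenized sentences padded with <s> and </s>."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams

def normalized_logprob(tokens, unigrams, bigrams):
    """Average natural-log probability per bigram, using add-one smoothing."""
    vocab_size = len(unigrams) + 1  # one extra slot for unseen words
    padded = ["<s>"] + tokens + ["</s>"]
    pairs = list(zip(padded, padded[1:]))
    total = 0.0
    for prev, word in pairs:
        prob = (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)
        total += math.log(prob)
    return total / len(pairs)  # dividing by length keeps long sentences comparable

# Toy training data (hypothetical); real use needs a large corpus.
corpus = [s.split() for s in [
    "he sees bob", "she sees bob", "he sees alice",
    "they see bob", "we see alice", "she sees him",
]]
unigrams, bigrams = train_bigram_counts(corpus)

THRESHOLD = -1.75  # hypothetical cutoff, picked by hand for this toy corpus

def looks_grammatical(sentence):
    """Return True when the length-normalized score clears the threshold."""
    tokens = sentence.lower().split()
    return normalized_logprob(tokens, unigrams, bigrams) >= THRESHOLD

print(looks_grammatical("He sees Bob"))  # True
print(looks_grammatical("He see Bob"))   # False
```

The model only prefers "he sees" because that bigram occurs in the training text and "he see" does not, which is why this approach depends so heavily on corpus size and domain.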

You could use Google's n-gram corpus, or train your own model on in-domain data. You might be able to do that with NLTK; you definitely could with LingPipe, the SRI Language Modeling Toolkit, or OpenGRM.
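For the NLTK route, recent NLTK releases ship an `nltk.lm` package that can train a small n-gram model. The sketch below assumes such a release is installed; the toy corpus and the `perplexity_of` helper are made up for illustration. It compares sentences by perplexity, which is already normalized for length (lower means the model finds the sentence less surprising).

```python
from nltk.lm import Laplace
from nltk.lm.preprocessing import pad_both_ends, padded_everygram_pipeline
from nltk.util import bigrams

# Toy in-domain training data (hypothetical), already tokenized.
corpus = [s.split() for s in [
    "he sees bob", "she sees bob", "he sees alice",
    "they see bob", "we see alice", "she sees him",
]]

# Build padded unigram/bigram training data plus the vocabulary stream.
train_data, vocab = padded_everygram_pipeline(2, corpus)

lm = Laplace(2)  # bigram model with add-one smoothing
lm.fit(train_data, vocab)

def perplexity_of(sentence):
    """Perplexity of one sentence under the bigram model (lower is better)."""
    tokens = sentence.lower().split()
    padded = list(pad_both_ends(tokens, n=2))
    return lm.perplexity(list(bigrams(padded)))

print(perplexity_of("He sees Bob") < perplexity_of("He see Bob"))
```

A grammaticality decision would still need a threshold on this perplexity, chosen on held-out examples of correct and incorrect sentences.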

That said, an n-gram model won't perform all that well. If it meets your needs, great, but if you want to do better, you'll have to train a machine-learning classifier. A grammaticality classifier would generally use features from syntactic and/or semantic processing (e.g. POS tags, dependency and constituency parses, etc.). You might look at some of the work from Joel Tetreault and the team he worked with at ETS, or Jennifer Foster and her team in Dublin.
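As a rough idea of what POS-based features could look like, the sketch below turns a tagged sentence into a feature dictionary of tag bigrams plus word-to-next-tag pairs. The feature design and the `pos_features` name are assumptions for illustration, not the method used by the researchers mentioned above, and the Penn Treebank tags are written by hand here; in practice they would come from a tagger, which can itself mis-tag ungrammatical input.

```python
from collections import Counter

def pos_features(tagged_tokens):
    """Build sparse features from (word, tag) pairs.

    Two feature families (both hypothetical choices):
      tags:A_B    adjacent POS-tag bigram
      word:w->B   lowercased word followed by the next token's tag
    """
    features = Counter()
    words = [word.lower() for word, _ in tagged_tokens]
    tags = [tag for _, tag in tagged_tokens]
    padded_tags = ["<s>"] + tags + ["</s>"]
    for left, right in zip(padded_tags, padded_tags[1:]):
        features[f"tags:{left}_{right}"] += 1
    for word, next_tag in zip(words, tags[1:]):
        features[f"word:{word}->{next_tag}"] += 1
    return dict(features)

# Hand-tagged examples (assumed tags, Penn Treebank style).
bad = [("He", "PRP"), ("see", "VBP"), ("Bob", "NNP")]
good = [("He", "PRP"), ("sees", "VBZ"), ("Bob", "NNP")]

print(pos_features(bad))
print(pos_features(good))
```

The tag bigram alone cannot flag the error, since "They see Bob" is also PRP followed by VBP; the `word:he->VBP` feature is what separates the two sentences. Feature dictionaries like these, paired with correct/incorrect labels, are the kind of input a classifier such as `nltk.NaiveBayesClassifier` is trained on.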

Sorry, there isn't an easy and straightforward answer...
