Negation handling in R: how can I replace a word following a negation?

Problem description

I'm doing sentiment analysis for financial articles. To enhance the accuracy of my naive Bayes classifier, I'd like to implement negation handling.

Specifically, I want to add the prefix "not_" to the word that follows "not" or "n't".

So if my corpus contains something like this:

 x <- "They didn't sell the company." 

I'd like to get the following:

"they didn't not_sell the company."

(The stopword "didn't" will be removed later.)

The only function I could find is gsub(), but it doesn't seem to work for this task.

Any help would be appreciated. Thank you!

Recommended answer

Specifically, I want to add the prefix "not_" to the word following a "not" or "n't"

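str_negate <- function(x) {
  # Prefix the word after a standalone "not" or a contraction ending in "n't";
  # \\b keeps words such as "knot" from being treated as negations.
  gsub("(\\bnot|n't) ", "\\1 not_", x, perl = TRUE)
}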

Or I suppose you could use strsplit:

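str_negate <- function(x) {
  # Split a single string into words on spaces
  words <- unlist(strsplit(x, split = " ", fixed = TRUE))
  # Flag negators: the word "not", or any word ending in "n't"
  is_negative <- grepl("^not$|n't$", words, ignore.case = TRUE)
  # Shift the flags right by one so they mark the word after each negator
  negate_me <- c(FALSE, is_negative)[seq_along(words)]
  words[negate_me] <- paste0("not_", words[negate_me])
  paste(words, collapse = " ")
}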

Either way gives you:

> str_negate("They didn't sell the company")
[1] "They didn't not_sell the company"
> str_negate("They did not sell the company")
[1] "They did not not_sell the company"
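
Since the question is about a whole corpus, here is a minimal sketch of a variant that takes a character vector of documents and also catches upper-case negators such as "NOT". The name str_negate_all and the sample sentences are my own assumptions for illustration and do not come from the question. It relies only on base R's gsub() with perl = TRUE.

```r
# Hypothetical helper (not part of the original answer): a vectorised,
# case-insensitive variant built on a single base-R gsub() call.
str_negate_all <- function(x) {
  # (\\bnot|n't) captures the negator and (\\s+) the whitespace after it.
  # The lookahead (?=\\w) requires a word character to follow, so the
  # prefix is only inserted directly in front of a word.
  gsub("(\\bnot|n't)(\\s+)(?=\\w)", "\\1\\2not_", x,
       ignore.case = TRUE, perl = TRUE)
}

# Made-up example documents
docs <- c("They didn't sell the company.",
          "Profits did NOT rise, analysts said.",
          "Nothing changed.")
str_negate_all(docs)
```

Because gsub() is vectorised, each element of docs is processed independently. Words that merely begin with "not", such as "Nothing", are left alone because the pattern requires whitespace straight after the negator.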
