Wordnet Lemmatizer for R

Question

I would like to use the WordNet lemmatizer to lemmatize the words in a:

> a<-c("He saw a see-saw on a sea shore", "she is feeling cold")
> a
[1] "He saw a see-saw on a sea shore" "she is feeling cold"  

I convert a into a corpus and run pre-processing steps (such as stopword removal and lemmatization):

> a <- Corpus(VectorSource(a))

I wanted to do the lemmatization in the following way:

> filter <- getTermFilter("ExactMatchFilter", a, TRUE)
> terms <- getIndexTerms("NOUN", 1, filter)
> sapply(terms, getLemma)

But I get this error:

> filter <- getTermFilter("ExactMatchFilter", a, TRUE)
Error in .jnew(paste("com.nexagis.jawbone.filter", type, sep = "."), word,  : 
  java.lang.NoSuchMethodError: <init>

My idea is to lemmatize the whole corpus rather than a single word. How can this be accomplished?

Recommended answer

Put your code in a loop. You can try something like this:

       lapply(a, function(x) {
            # getTermFilter() expects a single word, so x should be one character string
            x.filter <- getTermFilter("ExactMatchFilter", x, TRUE)
            terms <- getIndexTerms("NOUN", 1, x.filter)
            sapply(terms, getLemma)
         })
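
The loop above still needs individual words to pass to getTermFilter(). One way this could look is sketched below, assuming the input is the plain character vector a (before it is wrapped in a Corpus), that the wordnet package is installed, and that its dictionary is already configured (for example via setDict()). The helper names tokenize, lemma_of and lemmatize_docs are hypothetical and not part of any package, and returning the word unchanged when WordNet has no entry is an assumed fallback policy.

```r
# Split one document into lower-case word tokens using base R only.
tokenize <- function(doc) {
  words <- unlist(strsplit(tolower(doc), "[^a-z'-]+"))
  words[nzchar(words)]
}

# Look up a single word with the wordnet package (one word per call).
# Assumed fallback: return the word itself when no index term is found.
lemma_of <- function(word, pos = "NOUN") {
  filter <- wordnet::getTermFilter("ExactMatchFilter", word, TRUE)
  terms <- wordnet::getIndexTerms(pos, 1, filter)
  if (length(terms) == 0) return(word)
  wordnet::getLemma(terms[[1]])
}

# Apply the per-word lookup to every document and glue the words back together.
# `lookup` is a parameter so the pipeline can be tried without WordNet.
lemmatize_docs <- function(docs, lookup = lemma_of) {
  vapply(docs, function(doc) {
    paste(vapply(tokenize(doc), lookup, character(1)), collapse = " ")
  }, character(1), USE.NAMES = FALSE)
}
```

With WordNet set up, lemmatize_docs(a) would run the lookup on every word of every document. Note that ExactMatchFilter, as far as I understand it, only matches words that already appear in WordNet's index, so inflected forms would likely come back unchanged under this sketch.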
