Wordnet Lemmatizer for R
Question
I would like to use the wordnet lemmatizer to lemmatize the words in a:
> a<-c("He saw a see-saw on a sea shore", "she is feeling cold")
> a
[1] "He saw a see-saw on a sea shore" "she is feeling cold"
I convert a into a corpus and do pre-processing steps (such as stopword removal, lemmatization, etc.):
> a <- Corpus(VectorSource(a))
I wanted to do the lemmatization in the following way:
> filter <- getTermFilter("ExactMatchFilter", a, TRUE)
> terms <- getIndexTerms("NOUN", 1, filter)
> sapply(terms, getLemma)
But I get this error:
> filter <- getTermFilter("ExactMatchFilter", a, TRUE)
Error in .jnew(paste("com.nexagis.jawbone.filter", type, sep = "."), word, :
java.lang.NoSuchMethodError: <init>
My idea is to lemmatize the whole corpus rather than a single word. How can this be accomplished?
Recommended answer
Put your code in a loop so that the filter is built for one element at a time. You can try something like this:
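lapply(a, function(x) {
  # Build the filter from the current element only, not from the whole object
  x.filter <- getTermFilter("ExactMatchFilter", x, TRUE)
  # Look up at most one matching noun entry for that element
  terms <- getIndexTerms("NOUN", 1, x.filter)
  # Extract the lemma of each returned index term
  sapply(terms, getLemma)
})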
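The error in the question presumably comes from passing something other than a single character string to getTermFilter, so each sentence most likely has to be split into words before any lookup. Below is a minimal base-R sketch of that per-word loop. Everything in it is an assumption for illustration: lookup_lemma, lemmatize_sentence and the toy_lemmas table are hypothetical names, and the table is a tiny hand-written stand-in, not WordNet data.

```r
# Minimal sketch: lemmatize each sentence one word at a time.
# lookup_lemma, lemmatize_sentence and toy_lemmas are hypothetical names
# introduced for illustration; toy_lemmas is NOT WordNet data.

sentences <- c("He saw a see-saw on a sea shore", "she is feeling cold")

# Hand-written stand-in for a dictionary lookup (assumption).
toy_lemmas <- c(saw = "see", is = "be", feeling = "feel")

# Return the lemma of a single word, or the lower-cased word if unknown.
lookup_lemma <- function(word) {
  key <- tolower(word)
  if (key %in% names(toy_lemmas)) toy_lemmas[[key]] else key
}

# Split one sentence on whitespace, look up each word, and re-join.
lemmatize_sentence <- function(sentence, lookup = lookup_lemma) {
  words <- strsplit(sentence, "[[:space:]]+")[[1]]
  words <- words[nzchar(words)]  # drop empty tokens from leading spaces
  lemmas <- vapply(words, lookup, character(1), USE.NAMES = FALSE)
  paste(lemmas, collapse = " ")
}

# Apply the per-sentence function to every sentence.
lemmatized <- vapply(sentences, lemmatize_sentence, character(1),
                     USE.NAMES = FALSE)
print(lemmatized)
```

To use WordNet instead of the toy table, the lookup argument could be replaced by a function that wraps the getTermFilter, getIndexTerms and getLemma calls shown above for a single word, falling back to the word itself when no index term is returned.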