R tm In mclapply(content(x), FUN, ...) : all scheduled cores encountered errors in user code
Problem description
When I run the following code up to the penultimate line, I get this warning message:
In mclapply(content(x), FUN, ...) : all scheduled cores encountered errors in user code
When I run the final line, I get:
"Error in UseMethod(\"words\") : \n no applicable method for 'words' applied to an object of class \"character\"\n" attr(,"class") "try-error" attr(,"condition")
The link below is a reproducible example that can be copied and pasted into R and run.
https://github.com/weijia2013/mclapply-issue/blob/主/代码
I have just started learning R and would appreciate your help.
library(devtools)
install_github("twitteR", username="geoffjentry")
library(twitteR)
setup_twitter_oauth("API Key", "API Secret")
rdmTweets <- userTimeline('rdatamining', n=200)
(nDocs <- length(rdmTweets))
rdmTweets[11:15]
for (i in 11:15) {cat(paste("[[", i, "]] ", sep="")) + writeLines(strwrap(rdmTweets[[i]]$getText(), width=73))}
df <- do.call("rbind", lapply(rdmTweets, as.data.frame))
dim(df)
library(tm)
library(SnowballC)
library(RWeka)
library(rJava)
library(RWekajars)
myCorpus <- Corpus(VectorSource(df$text))
myCorpus <- tm_map(myCorpus, tolower)
myCorpus <- tm_map(myCorpus, removePunctuation)
myCorpus <- tm_map(myCorpus, removeNumbers)
removeURL <- function(x) gsub("http[[:alnum:]]*", "", x)
myCorpus <- tm_map(myCorpus, removeURL)
myStopwords <- c(stopwords("english"), "available", "via")
myStopwords <- setdiff(myStopwords, c("r", "big"))
myCorpus <- tm_map(myCorpus, removeWords, myStopwords)
myCorpusCopy <- myCorpus
myCorpus <- tm_map(myCorpus, stemDocument)
for (i in 11:15) {cat(paste("[[", i, "]] ", sep="")) + writeLines(strwrap(myCorpus[[i]], width=73))}
myCorpus <- tm_map(myCorpus, stemCompletion, dictionary=myCorpusCopy)
inspect(myCorpus[11:15])
sessionInfo()
R version 3.1.1 (2014-07-10)
Platform: x86_64-apple-darwin13.1.0 (64-bit)
locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods
[7] base
other attached packages:
[1] RWekajars_3.7.11-1 rJava_0.9-6 RWeka_0.4-23
[4] SnowballC_0.5 tm_0.6 NLP_0.1-3
[7] twitteR_1.1.8 devtools_1.5
loaded via a namespace (and not attached):
[1] bit_1.1-12 bit64_0.9-4 digest_0.6.4 evaluate_0.5.5
[5] grid_3.1.1 httr_0.4 memoise_0.2.1 parallel_3.1.1
[9] RCurl_1.95-4.1 rjson_0.2.14 slam_0.1-32 stringr_0.6.2
[13] tools_3.1.1 whisker_0.3-2
Recommended answer
Try using code like the following:
myCorpus <- tm_map(myCorpus, stemDocument, lazy=TRUE)
myCorpus <- tm_map(myCorpus, tolower, lazy=TRUE)
# ...and likewise for the other tm_map() calls
I think the new version of the tm package (0.6 in the sessionInfo() output above) requires this explicitly.
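For context on the two messages quoted in the question: the warning names `mclapply()`, which runs a function in forked worker processes, and the printed object is a "try-error" whose text is an ordinary S3 dispatch failure (no `words` method for class "character"). The sketch below is not tm's code. It uses only base R and the `parallel` package, and `words_demo` and `doc_demo` are made-up names standing in for the real generic and document class, just to show how a missing method inside `mclapply()` produces this pair of symptoms. It assumes a Unix-alike system, since `mclapply()` with more than one core relies on forking.

```r
library(parallel)  # ships with R; mclapply() with mc.cores > 1 needs a Unix-alike

# Hypothetical S3 generic with a method for one (hypothetical) document class only.
words_demo <- function(x) UseMethod("words_demo")
words_demo.doc_demo <- function(x) strsplit(x$content, " ", fixed = TRUE)[[1]]

# Plain character strings: the class reported in the question's error message.
docs <- list("r data mining", "text mining with r")

# Every worker fails, so mclapply() warns instead of stopping;
# the handler prints the warning text and then silences it.
res <- withCallingHandlers(
  mclapply(docs, words_demo, mc.cores = 2),
  warning = function(w) {
    message("warning: ", conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

# Each result is a "try-error" object that carries the dispatch failure message.
print(vapply(res, inherits, logical(1), what = "try-error"))
cat(res[[1]])

# Wrapping each string in the class the generic expects lets dispatch succeed.
wrapped <- lapply(docs, function(d) structure(list(content = d), class = "doc_demo"))
fixed <- mclapply(wrapped, words_demo, mc.cores = 2)
print(fixed[[1]])
```

The point of the sketch is diagnostic: when this warning appears, inspecting one element of the result (as the question's last line does) usually reveals the underlying error from the user-supplied function.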