Error using "TermDocumentMatrix" and "Dist" functions in R
Problem description
I have been trying to replicate the example here, but I have run into some problems along the way.
Everything worked fine until here:
docsTDM <- TermDocumentMatrix(docs8)
Error in UseMethod("meta", x) : no applicable method for 'meta' applied to an object of class "character"
In addition: Warning message:
In mclapply(unname(content(x)), termFreq, control) :
all scheduled cores encountered errors in user code
I was able to fix that error by changing this line in the previous step:
docs8 <- tm_map(docs7, tolower)
To this:
docs8 <- tm_map(docs7, content_transformer(tolower))
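A minimal sketch of why this change helps, assuming the tm package is installed. The toy corpus below is a hypothetical stand-in for docs7, which is not shown in the question. The idea is that content_transformer() wraps tolower so that each corpus element stays a text document instead of being turned into a bare character vector, which is what the 'meta' error complains about.

```r
library(tm)

# Hypothetical toy corpus standing in for docs7 from the question.
docs7 <- VCorpus(VectorSource(c("Hello World", "Goodbye WORLD")))

# content_transformer() makes tolower act on the text inside each document
# and hand back a document, rather than a plain character vector.
docs8 <- tm_map(docs7, content_transformer(tolower))

# Each element is still a text document, so meta() has a method for it.
class(docs8[[1]])

# Building the term-document matrix now succeeds.
docsTDM <- TermDocumentMatrix(docs8)
inspect(docsTDM)
```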
But then I ran into trouble again with:
docsdissim <- dissimilarity(docsTDM, method = "cosine")
Error: could not find function "dissimilarity"
Then I learned that the "dissimilarity" function was replaced by the dist function, so I tried:
docsdissim <- dist(docsTDM, method = "cosine")
Error in crossprod(x, y)/sqrt(crossprod(x) * crossprod(y)) : non-conformable arrays
And that is where I'm stuck.
By the way, my R version is:
R version 3.2.2 (2015-08-14) running on CentOS 7
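Change
docsdissim <- proxy::dist(docsTDM, method = "cosine")
to
docsdissim <- dist(as.matrix(docsTDM), method = "cosine")
dist requires a numeric matrix, a data frame, or a "dist" object as input. A TermDocumentMatrix is stored in a sparse format rather than as an ordinary matrix, so it needs to be converted with as.matrix() first. Note that "cosine" is a method of the dist function from the proxy package, not of base stats::dist, so proxy has to be loaded for this call to work.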
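The conversion can be tried on a small example. This is only a sketch, assuming the tm and proxy packages are installed; the three toy documents and the variable names are made up for illustration. It also transposes the matrix, since a term-document matrix has terms as rows and dist compares rows, so t() is needed if the goal is distances between documents rather than between terms.

```r
library(tm)
library(proxy)

# Hypothetical toy documents: the first two are identical, the third shares no terms.
texts <- c("apple banana cherry", "apple banana cherry", "dog elephant fox")
docs <- VCorpus(VectorSource(texts))
docsTDM <- TermDocumentMatrix(docs)

# Convert the sparse term-document matrix to an ordinary dense matrix.
# Rows are terms, columns are documents.
m <- as.matrix(docsTDM)

# Cosine dissimilarity between terms (rows of m).
term_dist <- proxy::dist(m, method = "cosine")

# Cosine dissimilarity between documents: transpose so documents are rows.
doc_dist <- proxy::dist(t(m), method = "cosine")
as.matrix(doc_dist)
```

For a large corpus the dense matrix can become very big, so it may be worth reducing the number of terms before converting.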