R-Project: no applicable method for 'meta' applied to an object of class "character"


Problem description

I am trying to run this code (Ubuntu 12.04, R 3.1.1):

# Load requisite packages
library(tm)
library(ggplot2)
library(lsa)

# Place Enron email snippets into a single vector.
text <- c(
  "To Mr. Ken Lay, I’m writing to urge you to donate the millions of dollars you made from selling Enron stock before the company declared bankruptcy.",
  "while you netted well over a $100 million, many of Enron's employees were financially devastated when the company declared bankruptcy and their retirement plans were wiped out",
  "you sold $101 million worth of Enron stock while aggressively urging the company’s employees to keep buying it",
  "This is a reminder of Enron’s Email retention policy. The Email retention policy provides as follows . . .",
  "Furthermore, it is against policy to store Email outside of your Outlook Mailbox and/or your Public Folders. Please do not copy Email onto floppy disks, zip disks, CDs or the network.",
  "Based on our receipt of various subpoenas, we will be preserving your past and future email. Please be prudent in the circulation of email relating to your work and activities.",
  "We have recognized over $550 million of fair value gains on stocks via our swaps with Raptor.",
  "The Raptor accounting treatment looks questionable. a. Enron booked a $500 million gain from equity derivatives from a related party.",
  "In the third quarter we have a $250 million problem with Raptor 3 if we don’t \"enhance\" the capital structure of Raptor 3 to commit more ENE shares.")
view <- factor(rep(c("view 1", "view 2", "view 3"), each = 3))
df <- data.frame(text, view, stringsAsFactors = FALSE)

# Prepare mini-Enron corpus
corpus <- Corpus(VectorSource(df$text))
corpus <- tm_map(corpus, tolower)
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, function(x) removeWords(x, stopwords("english")))
corpus <- tm_map(corpus, stemDocument, language = "english")
corpus # check corpus

# Mini-Enron corpus with 9 text documents

# Compute a term-document matrix that contains the occurrence of terms in each email
# Compute the distance between pairs of documents
td.mat <- as.matrix(TermDocumentMatrix(corpus))
dist.mat <- dist(t(as.matrix(td.mat)))
dist.mat  # check distance matrix

# Scale the multidimensional semantic space onto two dimensions with classical MDS
fit <- cmdscale(dist.mat, eig = TRUE, k = 2)
points <- data.frame(x = fit$points[, 1], y = fit$points[, 2])
ggplot(points, aes(x = x, y = y)) +
  geom_point(data = points, aes(x = x, y = y, color = df$view)) +
  geom_text(data = points, aes(x = x, y = y - 0.2, label = row.names(df)))

However, when I run it I get this error (on the line td.mat <- as.matrix(TermDocumentMatrix(corpus))):

Error in UseMethod("meta", x) : 
  no applicable method for 'meta' applied to an object of class "character"
In addition: Warning message:
In mclapply(unname(content(x)), termFreq, control) :
  all scheduled cores encountered errors in user code

I am not sure what to look at; all of the packages loaded.

Recommended answer

The latest version of tm (0.6) changed tm_map so that you can no longer pass it functions that operate on plain character values. So the problem is your tolower step, since tolower is not a "canonical" transformation (see getTransformations()). Just replace it with:

corpus <- tm_map(corpus, content_transformer(tolower))
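
To see why the bare tolower call causes trouble, it can help to look at what it returns for a single document. This is only a rough sketch, assuming a reasonably recent tm install; the sample string and the variable names are made up for illustration.

```r
library(tm)

# Transformations that tm ships ready for tm_map; tolower is not among them.
getTransformations()

# A one-document example (the sample text is made up for illustration).
doc <- PlainTextDocument("To Mr. Ken Lay")

# Plain tolower() hands back a bare character vector, so the document
# class and its metadata are gone.
lowered_raw <- tolower(doc)
class(lowered_raw)

# Wrapped in content_transformer(), only the content is changed and the
# result is still a text document that meta() understands.
lowered_doc <- content_transformer(tolower)(doc)
class(lowered_doc)
content(lowered_doc)
```

Once every element of the corpus is a bare character vector, TermDocumentMatrix has no document object to read metadata from, which matches the 'meta' error in the question.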

The content_transformer wrapper applies the function to the content of each document and returns an object of the correct data type for the corpus. You can use content_transformer with any function that is intended to manipulate character vectors so that it will work in a tm_map pipeline.
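
As an illustration of wrapping your own function, here is a minimal sketch. The helper name to_space and the two sample strings are hypothetical and not part of the original question; VCorpus is used explicitly, on the assumption that you want the same kind of corpus that Corpus() produced in tm 0.6.

```r
library(tm)

# Hypothetical helper: replace every match of `pattern` with a space.
# content_transformer() turns the character-level function into one that
# tm_map can apply to documents.
to_space <- content_transformer(function(x, pattern) gsub(pattern, " ", x))

# Two made-up sample documents.
docs <- c("Email retention-policy",
          "Outlook Mailbox and/or Public Folders")

corpus <- VCorpus(VectorSource(docs))
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, to_space, "[-/]")  # extra arguments are passed on to the helper

# The documents keep their class, so the matrix step runs without the 'meta' error.
tdm <- TermDocumentMatrix(corpus)
content(corpus[[1]])
```

The same pattern should work for other character-level functions such as gsub or trimws; only the function passed to content_transformer changes.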
