R tm package vcorpus: Error in converting corpus to data frame

Views: 47 | Published: 2021/9/8 20:08:05 | Tags: r, tm, corpus


Problem description

I am using the tm package to clean up some data with the following code:

mycorpus <- Corpus(VectorSource(x))
mycorpus <- tm_map(mycorpus, removePunctuation)

I then want to convert the corpus back into a data frame so that I can export a text file containing the data in the original data frame format. I have tried the following:

dataframe <- as.data.frame(mycorpus)

But this returns an error:

Error in as.data.frame.default(mycorpus) : cannot coerce class "c("VCorpus", "Corpus")" to a data.frame

How can I convert a corpus into a data frame?

Recommended answer

Your corpus is really just a character vector with some extra attributes. So it's best to convert it to character first; you can then store the result in a data.frame like so:

library(tm)
x <- c("Hello. Sir!","Tacos? On Tuesday?!?")
mycorpus <- Corpus(VectorSource(x))
mycorpus <- tm_map(mycorpus, removePunctuation)

dataframe <- data.frame(text=unlist(sapply(mycorpus, `[`, "content")), 
    stringsAsFactors=FALSE)

which returns

              text
1        Hello Sir
2 Tacos On Tuesday
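
Since the stated goal in the question is to export a text file, here is a minimal sketch of that last step using base R's write.table. The tab separator and the temporary output path are assumptions made for illustration, not something the original poster specified, and the data frame is rebuilt by hand so the snippet runs without tm.

```r
# Stand-in for the data frame produced from the corpus; no tm needed here.
dataframe <- data.frame(
  text = c("Hello Sir", "Tacos On Tuesday"),
  stringsAsFactors = FALSE
)

# Hypothetical output path: a temporary file keeps the example side-effect free.
out_file <- tempfile(fileext = ".txt")

# Write one document per row, tab-separated, without row names or quoting.
write.table(dataframe, file = out_file, sep = "\t",
            row.names = FALSE, quote = FALSE)

# Read the file back to check that the text survived the round trip.
roundtrip <- read.delim(out_file, stringsAsFactors = FALSE)
```

Replace out_file with a real path to keep the exported file.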

UPDATE: Newer versions of tm seem to have changed the as.list.SimpleCorpus method, which really messes with using sapply and lapply on a corpus. Now I guess you'd have to use

dataframe <- data.frame(text=sapply(mycorpus, identity), 
    stringsAsFactors=FALSE)
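
Because the right extraction call depends on the installed tm version and on the corpus class, one possible way to sidestep the difference is to pull each document out by position and coerce it with as.character. The sketch below assumes tm is installed; corpus_to_df is a hypothetical helper name and is not part of tm, and VCorpus is used here only to make the corpus class explicit.

```r
library(tm)

# Hypothetical helper (not part of tm): one row per document.
corpus_to_df <- function(corpus) {
  texts <- vapply(
    seq_len(length(corpus)),
    # A document may hold several lines, so collapse them into one string.
    function(i) paste(as.character(corpus[[i]]), collapse = "\n"),
    character(1)
  )
  data.frame(text = texts, stringsAsFactors = FALSE)
}

x <- c("Hello. Sir!", "Tacos? On Tuesday?!?")
mycorpus <- tm_map(VCorpus(VectorSource(x)), removePunctuation)
dataframe <- corpus_to_df(mycorpus)
```

Indexing with [[ avoids relying on how as.list behaves for a given corpus class, which is the part the update above reports as having changed.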
