Error when using the tm package's VCorpus in R

Question

I am getting the error below while working with the tm package in R.

library("tm")
Loading required package: NLP
Warning messages:
1: package ‘tm’ was built under R version 3.4.2 
2: package ‘NLP’ was built under R version 3.4.1 

corpus <- VCorpus(DataframeSource(data))

Error: all(!is.na(match(c("doc_id", "text"), names(x)))) is not TRUE

I have tried reinstalling the package and updating to a newer version of R, but the error persists. With the same data file, the same code runs on another system that has the same version of R.

Recommended answer

I ran into the same problem after updating the tm package to version 0.7-2. I looked up the documentation for DataframeSource(), which states:

The first column must be named "doc_id" and contain a unique string identifier for each document. The second column must be named "text".

Details

A data frame source interprets each row of the data frame x as a document. The first column must be named "doc_id" and contain a unique string identifier for each document. The second column must be named "text" and contain a "UTF-8" encoded string representing the document's content. Optional additional columns are used as document level metadata.
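The error message in the question is this requirement failing: it is the quoted expression evaluating to FALSE. A minimal sketch in base R (tm is not needed) that evaluates the same expression against a data frame's column names. The helper name has_required_columns and both sample data frames are made up for illustration.

```r
# Hypothetical helper: evaluates the expression that appears in the error message.
has_required_columns <- function(x) {
  all(!is.na(match(c("doc_id", "text"), names(x))))
}

# A data frame with other column names fails the check ...
bad <- data.frame(id = c("1", "2"),
                  title = c("first document", "second document"),
                  stringsAsFactors = FALSE)

# ... while one that has doc_id and text columns passes.
good <- data.frame(doc_id = c("1", "2"),
                   text = c("first document", "second document"),
                   stringsAsFactors = FALSE)

has_required_columns(bad)   # FALSE
has_required_columns(good)  # TRUE
```

Running the helper on your own data frame before calling DataframeSource() shows whether the column names are the cause.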

I solved it with the following code:

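# Read the CSV, keeping strings as character vectors rather than factors
df_cmp <- read.csv("test_file.csv", stringsAsFactors = FALSE)

# Build a data frame with the doc_id and text columns that DataframeSource() requires
df_title <- data.frame(doc_id = row.names(df_cmp),
                       text = df_cmp$English.title,
                       stringsAsFactors = FALSE)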

You can try renaming your columns to doc_id and text.
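As a sketch of that renaming, assume a data frame whose identifier and text columns are called id and title, with one extra column year. These names and the sample values are hypothetical, so substitute your own. The columns are renamed in place, moved to the front, and the remaining column is left to serve as metadata. The call into tm is guarded so the snippet still runs where tm is not installed.

```r
# Sample data; the column names id, title and year are assumptions for illustration.
data <- data.frame(id = c(1, 2),
                   title = c("first document", "second document"),
                   year = c(2016, 2017),
                   stringsAsFactors = FALSE)

# Rename the identifier and text columns to the names DataframeSource() expects.
names(data)[names(data) == "id"] <- "doc_id"
names(data)[names(data) == "title"] <- "text"

# doc_id should be a unique string identifier and text a character vector.
data$doc_id <- as.character(data$doc_id)
data$text <- as.character(data$text)

# Put doc_id and text first; any other columns follow.
data <- data[, c("doc_id", "text", setdiff(names(data), c("doc_id", "text")))]

# Build the corpus only if tm is available.
if (requireNamespace("tm", quietly = TRUE)) {
  corpus <- tm::VCorpus(tm::DataframeSource(data))
  print(corpus)
}
```

Renaming in place keeps the other columns, whereas building a new two-column data frame, as in the answer's code, drops them.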
