How to convert a data.frame to transactions for arules

Question

I read data from a CSV file. The data has 3 columns: one is the transaction id, and the other two are the product and the product category. I need to convert this into transactions in order to use the apriori function in arules. The conversion fails with an error:

dat <- read.csv("spss.csv", header = TRUE, sep = ",", as.is = TRUE)
dat[, 2] <- factor(dat[, 2])
dat[, 3] <- factor(dat[, 3])
spssdat <- dat[, c(1, 2, 3)]
str(spssdat)

'data.frame':   108919 obs. of  3 variables:
 $ Transaction_id: int  3000312 3000312 3001972 3003361 3003361 3003361 3003361 3003361 3003361 3004637 ...
 $ product_catalog : Factor w/ 9 levels "AIM","BA","IM",..: 1 1 5 7 7 7 7 7 7 1 ...
 $ product      : Factor w/ 332 levels "ACM","ACTG/AIM",..: 7 7 159 61 61 61 61 61 61 7 ...

trans4 <- as(spssdat, "transactions")

Error in as(spssdat, "transactions") : 
  no method or default for coercing "data.frame" to "transactions"

If the data has only two columns, this works:

trans4 <- as(split(spssdat[,2], spssdat[,1]), "transactions")

But I don't know how to do the conversion when I have 3 columns. There are usually additional columns, such as category attributes or customer attributes, so the data usually has more than 2 columns. I need to find rules across multiple columns.

Recommended answer

I found some information on this website that worked for me. Let me copy the relevant paragraph:

The dataframe can be in either a normalized (single) form or a flat file (basket) form.
When the file is in basket form it means that each record represents a transaction where the items in the basket are represented by columns.
When the dataset is in single form it means that each record represents one single item and each item contains a transaction id.
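
To make the two layouts concrete, here is a minimal sketch that writes the same three made-up transactions in basket form and in single form, then reads each back with read.transactions from arules. The file contents, temporary file names, and item names are purely illustrative, and the sketch assumes an arules version whose read.transactions accepts the header argument:

```r
library(arules)

# Hypothetical toy data: three transactions, written out in both layouts
basket_file <- tempfile(fileext = ".csv")
single_file <- tempfile(fileext = ".csv")

# Basket form: one line per transaction, items separated by commas
writeLines(c("bread,milk", "bread,butter,milk", "beer"), basket_file)

# Single form: one line per (transaction id, item) pair, with a header line
writeLines(c("transactionID,productID",
             "1,bread", "1,milk",
             "2,bread", "2,butter", "2,milk",
             "3,beer"), single_file)

trans_basket <- read.transactions(basket_file, format = "basket", sep = ",")

# Columns are given by position here; header = TRUE skips the first line
trans_single <- read.transactions(single_file, format = "single", header = TRUE,
                                  sep = ",", cols = c(1, 2))

# Both objects should describe the same three transactions
inspect(trans_basket)
inspect(trans_single)
```

Either layout ends up as the same kind of transactions object, so the choice of format only depends on how the file happens to be organized.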

To load transactions from a file, use read.transactions. In both your case and mine, the file is in single form.
I used the following code to load a .csv file as transactions:

trans = read.transactions("some_data.csv", format = "single", sep = ",", cols = c("transactionID", "productID"))

To fully understand the command above, see the read.transactions manual page, available by typing ?read.transactions in the R console.
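
For a data frame that is already in memory and has more than one attribute column, as in the question, one possible approach (not part of the original answer) is to stack the attribute columns into a single item column and then reuse the split() idiom from the question. The sketch below uses made-up values with the column names shown in the question's str() output. Prefixing each label with its column name is an assumption, added here only to keep categories and products distinguishable:

```r
library(arules)

# Hypothetical toy data mirroring the three columns in the question
spssdat <- data.frame(
  Transaction_id  = c(1, 1, 2, 3, 3),
  product_catalog = c("AIM", "BA", "AIM", "IM", "IM"),
  product         = c("ACM", "P2", "ACM", "P3", "P4"),
  stringsAsFactors = FALSE
)

# Stack both attribute columns into one item vector, prefixing each label
# with its column name so categories and products stay distinguishable
items <- c(
  paste0("product_catalog=", spssdat$product_catalog),
  paste0("product=", spssdat$product)
)
ids <- rep(spssdat$Transaction_id, times = 2)

# Group items by transaction id; unique() drops items repeated within a transaction
baskets <- lapply(split(items, ids), unique)
trans <- as(baskets, "transactions")

inspect(trans)
```

The resulting object can be passed to apriori like any other transactions object. Further attribute columns could be stacked the same way, at the cost of longer item labels.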
