Association analysis with duplicate transactions using the arules package in R

Question

I want to create a transaction object in basket format that I can call anytime for my analyses. The data contains comma-separated items in 1001 transactions. The first 10 transactions look like this:

hering,corned_b,olives,ham,turkey,bourbon,ice_crea
baguette,soda,hering,cracker,heineken,olives,corned_b
avocado,cracker,artichok,heineken,ham,turkey,sardines
olives,bourbon,coke,turkey,ice_crea,ham,peppers
hering,corned_b,apples,olives,steak,avocado,turkey
sardines,heineken,chicken,coke,ice_crea,peppers,ham
olives,bourbon,coke,turkey,ice_crea,heineken,apples
corned_b,peppers,bourbon,cracker,chicken,ice_crea,baguette
soda,olives,bourbon,cracker,heineken,peppers,baguette
corned_b,peppers,bourbon,cracker,chicken,bordeaux,hering
...

I noticed that there are duplicated transactions in the data and removed them, but each time I try to read the transactions, I get:

Error in asMethod(object) : can not coerce list with transactions with duplicated items

Here is my code:

data <- read.csv("AssociationsItemList.txt",header=F)
data <-  data[!duplicated(data),]
pop <- NULL
for(i in 1:length(data)){
pop <- paste(pop, data[i],sep="\n")
}
write(pop, file = "Trans", sep = ",")
transdata <- read.transactions("Trans", format = "basket", sep=",")

I'm sure there's something little yet important I've missed. Kindly offer your assistance.

Recommended answer

The problem is not with duplicated transactions (the same row appearing twice) but duplicated items (the same item appearing twice, in the same transaction -- e.g., "olives" on line 4).
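
To make the distinction concrete, here is a minimal base R sketch that flags both kinds of duplicates. The `baskets` vector is hypothetical sample data (not the asker's file), and the variable names are only illustrative.

```r
# Hypothetical baskets: row 2 repeats "olives", row 3 repeats row 1.
baskets <- c(
  "hering,corned_b,olives,ham",
  "olives,bourbon,coke,olives",
  "hering,corned_b,olives,ham"
)

# Split each row into its items.
items <- strsplit(baskets, ",", fixed = TRUE)

# Duplicated transactions: whole rows that already appeared earlier.
dup_rows <- duplicated(baskets)

# Duplicated items: the same item more than once within a single row.
dup_items <- vapply(items, function(x) anyDuplicated(x) > 0, logical(1))

which(dup_rows)   # indices of rows that repeat an earlier row
which(dup_items)  # indices of rows that contain a repeated item
```

Running the same check on the real file (for example, on lines read with `readLines`) should show which rows trigger the coercion error.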

read.transactions has an rm.duplicates argument to remove those duplicates.

read.transactions("Trans", format = "basket", sep=",", rm.duplicates=TRUE)
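
As a self-contained sketch of that call, the example below writes a small hypothetical basket file to a temporary path and reads it back. The sample rows and the names `path`, `trans`, and `trans_unique` are assumptions for illustration. Dropping repeated baskets with `duplicated()` afterwards is an optional extra step, not part of the original answer.

```r
library(arules)

# Hypothetical sample data: row 2 lists "olives" twice, row 3 repeats row 1.
rows <- c(
  "hering,corned_b,olives,ham",
  "olives,bourbon,coke,olives",
  "hering,corned_b,olives,ham"
)
path <- tempfile(fileext = ".txt")
writeLines(rows, path)

# rm.duplicates = TRUE drops repeated items within each basket while reading,
# which avoids the "duplicated items" coercion error.
trans <- read.transactions(path, format = "basket", sep = ",",
                           rm.duplicates = TRUE)

# Optional: also drop baskets that are identical to an earlier basket.
trans_unique <- trans[!duplicated(trans)]

inspect(trans_unique)
```

Whether to drop identical baskets depends on the analysis: repeated baskets are legitimate observations and contribute to support counts, so removing them changes the results.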
