Data (csv) into basket for arules, removing duplicates?

Question

I'm a beginner when it comes to R, but I want to learn more. I'm trying to do a market basket analysis.

This is my raw data, and I want to convert it to a transactions (basket) format:

This is what I want to achieve:

I tried:

trans <- as(split(a[,"Game.played"],a[,"sessionid"]),"transactions")

But instead of the name of the game, only the number of the game is displayed. Could anyone tell me why this is happening? Also, I have cross-verified the result against the actual data, and the association of the sessionid with the game is wrong!

I also tried:

q <- read.transactions("a.csv", format = "basket", sep = ",", rm.duplicates = TRUE)

But that did not work either.

Recommended answer

Data into basket for arules, removing duplicates?

Here's an example of how you could remove the duplicates:

set.seed(1)
df <- data.frame(
  cat=rep(LETTERS[1:3], 2:4), 
  val=sample(letters[1:5], 9, replace = TRUE),
  stringsAsFactors = FALSE
)
df
#   cat val
# 1   A   b
# 2   A   b
# 3   B   c
# 4   B   e
# 5   B   b
# 6   C   e
# 7   C   e
# 8   C   d
# 9   C   d
(lst <- lapply(split(df$val, df$cat), unique))
# $A
# [1] "b"
# 
# $B
# [1] "c" "e" "b"
# 
# $C
# [1] "e" "d"
library(arules)
as(lst, "transactions")
# transactions in sparse format with
#  3 transactions (rows) and
#  4 items (columns)
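
The same idea can be applied to the column names from the question. The sketch below is only an illustration: the data frame `a` is a made-up stand-in for the raw data (which the question does not show), and it assumes that `Game.played` was read in as a factor, which is one plausible reason for numbers appearing instead of names but is not confirmed by the question. Converting the column to character before splitting keeps the game names as item labels. As an alternative, `read.transactions()` with `format = "single"` reads a file that has one session/game pair per line, which seems to be the layout described; the column positions and the header row are assumptions as well.

```r
library(arules)

# Hypothetical stand-in for the question's data: one row per (session, game) pair.
a <- data.frame(
  sessionid   = c(1, 1, 1, 2, 2, 3),
  Game.played = factor(c("Chess", "Chess", "Poker", "Poker", "Bingo", "Chess"))
)

# Convert to character so item labels are names rather than factor codes,
# then drop repeated games within each session before coercing.
baskets <- lapply(split(as.character(a$Game.played), a$sessionid), unique)
trans <- as(baskets, "transactions")
inspect(trans)

# Alternative: read the long-format file directly.
# format = "single" expects one (transaction id, item) pair per line;
# cols = c(1, 2) gives the positions of the id and item columns (assumed),
# and skip = 1 skips the header row that write.csv() produces.
csv_path <- tempfile(fileext = ".csv")
write.csv(a, csv_path, row.names = FALSE, quote = FALSE)
trans2 <- read.transactions(csv_path, format = "single", sep = ",",
                            cols = c(1, 2), skip = 1, rm.duplicates = TRUE)
inspect(trans2)
```

With `format = "basket"`, as in the question, each line of the file is treated as one complete transaction, so a file with one session/game pair per line would not be grouped by session.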
