Data (csv) into basket for arules, removing duplicates?
Question
I'm a beginner when it comes to R, but I want to learn more. I'm trying to do a market basket analysis.
This is my raw data, and I want to convert it to a transactions (basket) format:
This is what I want to achieve:
I tried:
trans <- as(split(a[,"Game.played"],a[,"sessionid"]),"transactions")
But only the number of the game is displayed instead of its name. Could anyone tell me why this is happening? Also, I have cross-checked against the actual data, and the association of the sessionid with the game is wrong!
I also tried:
q <- read.transactions("a.csv", format = "basket", sep = ",", rm.duplicates = TRUE)
But this doesn't work either.
Recommended answer
Data into basket for arules, removing duplicates?
Here's an example of how you could remove the duplicates:
set.seed(1)
df <- data.frame(
  cat = rep(LETTERS[1:3], 2:4),
  val = sample(letters[1:5], 9, TRUE),
  stringsAsFactors = FALSE
)
df
#   cat val
# 1   A   b
# 2   A   b
# 3   B   c
# 4   B   e
# 5   B   b
# 6   C   e
# 7   C   e
# 8   C   d
# 9   C   d
(lst <- lapply(split(df$val, df$cat), unique))
# $A
# [1] "b"
#
# $B
# [1] "c" "e" "b"
#
# $C
# [1] "e" "d"
library(arules)
as(lst, "transactions")
# transactions in sparse format with
# 3 transactions (rows) and
# 4 items (columns)
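Applied to the column names from the question, the same idea could look like the sketch below. The data frame here is a made-up stand-in (the session ids and game names are hypothetical, since the original data is not shown). One likely cause of seeing numbers instead of names is that Game.played was read in as a factor, so the sketch converts it to character before splitting; this is an assumption about the asker's data, not something the question confirms. The second part shows read.transactions with format = "single", which is the arules format for files with one (session, item) pair per row; format = "basket" expects one whole transaction per line.

```r
library(arules)

# Hypothetical stand-in for the asker's data frame `a`;
# session ids and game names are invented for illustration.
a <- data.frame(
  sessionid   = c("s1", "s1", "s1", "s2", "s2", "s3"),
  Game.played = factor(c("Chess", "Poker", "Chess", "Poker", "Poker", "Go"))
)

# Convert the factor to character so that the game names are kept,
# then drop games that repeat within the same session.
baskets <- lapply(split(as.character(a$Game.played), a$sessionid), unique)

trans <- as(baskets, "transactions")
inspect(trans)

# Alternative: read a long-format CSV directly.
# A temporary file is used here only to keep the sketch self-contained.
tmp <- tempfile(fileext = ".csv")
write.csv(a, tmp, row.names = FALSE, quote = FALSE)

trans2 <- read.transactions(
  tmp,
  format = "single",                     # one (transaction id, item) pair per row
  header = TRUE,                         # first line holds the column names
  sep = ",",
  cols = c("sessionid", "Game.played"),  # transaction id column, then item column
  rm.duplicates = TRUE
)
inspect(trans2)
```

Both routes should give three transactions here, one per session, with each game listed at most once per session.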