R: convert a transaction-format dataset to basket format for Market Basket Analysis

Problem description

First, I would like to clarify that before posting this question I referred to the following links on this site to find an answer, but I couldn't understand them, maybe because they address different problems or because I am new to R.

R-convert transaction format dataset to basket format for sequence mining

Arules Sequence Mining in R

How to handle "argument 'incomparables != FALSE' is not used (yet)"?

I want to do Market Basket Analysis with my dataset. My dataset is in transaction format (as described below), and I want to convert it to basket format (as described below).

My input file is a CSV file with the dataset in transaction format, as follows:

TransactionID ProductID
A              1
A              2
B              1
C              3
A              4
B              3

I want my output file to be a CSV file in basket format, as follows:

1 2 4
1 3
3

where {1,2,4} are the products bought in transaction A, {1,3} are those bought in B, and so on.

Can you please tell me the R code to do this? I tried the following code, but it is not working. My input file name is "D01_modified1.csv".

library(arulesSequences)
# Read CSV into R
MyData <- read.csv(file="D01_modified1.csv", header=TRUE, sep=",")
s <- unique(MyData,incomparables = FALSE, fromLast = FALSE,paste0("ProductID"))
# Write CSV in R
write.csv(s, file = "MyOutput.csv",row.names=FALSE, na="")

It gives the following error:

Error: argument 'incomparables != FALSE' is not used (yet)

Also, I am not sure whether the following code will give me the desired output.

s <- unique(MyData,incomparables = FALSE, fromLast = FALSE,paste0("ProductID"))

Please guide. Looking forward to your help. Thanks a lot...

Solution

This works for me:

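library(arules)  # provides the 'transactions' class
# Convert every column to a factor, then coerce the data frame to transactions
df_fact <- data.frame(lapply(MyData, as.factor))
df_trans <- as(df_fact, 'transactions')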

Hope it helps.
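
If the goal is literally the basket-format file shown in the question (one line per TransactionID), a plain base R grouping may be enough. The following is only a minimal sketch under a few assumptions: the toy data frame stands in for the result of read.csv(), the output path is a placeholder in the temporary directory, and repeated products within one transaction are assumed to be unwanted. Note that, as far as I understand, coercing a data frame to 'transactions' treats each row as its own transaction, so grouping by TransactionID first may be closer to what was asked.

```r
# Toy data mirroring the table in the question (stand-in for read.csv output)
MyData <- data.frame(
  TransactionID = c("A", "A", "B", "C", "A", "B"),
  ProductID     = c(1, 2, 1, 3, 4, 3)
)

# Group product IDs by transaction; unique() drops repeated products
baskets <- lapply(split(MyData$ProductID, MyData$TransactionID), unique)

# Build one space-separated line per transaction (basket format)
basket_lines <- vapply(baskets, paste, character(1), collapse = " ")

# Write the lines to a file; this path is an assumption for illustration
out_file <- file.path(tempdir(), "MyOutput.csv")
writeLines(basket_lines, out_file)

print(unname(basket_lines))
```

split() orders the groups by the sorted TransactionID values, so the lines come out in the order A, B, C. Each line keeps the products in the order they first appear in the input.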
