Apriori Algorithm: frequent item set generation

Question

I am using the Apriori algorithm to identify a customer's frequent item sets. Based on the identified frequent item sets, I want to suggest items to the customer when they add a new item to their shopping list. The frequent item sets I obtained are as follows:

[1],[3],[2],[5]
[2,3],[3,5],[1,3],[2,5]
[2,3,5]

My question: am I wrong to consider only the set [2,3,5] when making suggestions to the customer? For example, if the customer adds item 3 to the shopping list, I would recommend items 2 and 5. If the customer adds item 1, no suggestion would be made, because I am considering only the set [2,3,5] and item 1 is not in that set. I want to know whether my logic (considering only the set [2,3,5]) is enough to make suggestions for the user.

Recommended answer

No. Deriving recommendation rules requires more effort.

Just because [2,3,5] is frequent does not mean 2 -> 3,5 is a good rule.

Consider the case where 2 is a very popular product, but 3 and 5 are only barely frequent. Take a gas station: [gas, coffee, bagel] is probably a frequent itemset, but relatively few of the customers who buy gas will also buy coffee and a bagel (low confidence).

You do want to consider rules such as 2,3 -> 5, because they may have higher confidence; i.e., if the customer buys gas and coffee, suggest a bagel.
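As a rough illustration of the point about confidence, the sketch below derives rules from every frequent itemset (not just [2,3,5]) and keeps those above a confidence threshold. The transactions are hypothetical, chosen only because they reproduce the itemsets listed in the question at a minimum support count of 2; the 0.6 threshold and the helper names `support`, `rules_from`, and `recommend` are likewise assumptions for illustration.

```python
from itertools import combinations

# Hypothetical transactions (not from the question); with a minimum support
# count of 2 they yield the frequent itemsets listed in the question.
transactions = [{1, 3, 4}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}]


def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    items = set(itemset)
    return sum(items <= t for t in transactions) / len(transactions)


def rules_from(itemset, min_confidence):
    """Yield (antecedent, consequent, confidence) for each split of itemset."""
    items = frozenset(itemset)
    for size in range(1, len(items)):
        for antecedent in combinations(sorted(items), size):
            # confidence(A -> B) = support(A and B) / support(A)
            confidence = support(items) / support(antecedent)
            if confidence >= min_confidence:
                yield set(antecedent), set(items - set(antecedent)), confidence


# Frequent itemsets with at least two items, as listed in the question.
frequent_itemsets = [{2, 3}, {3, 5}, {1, 3}, {2, 5}, {2, 3, 5}]
rules = [r for s in frequent_itemsets for r in rules_from(s, min_confidence=0.6)]


def recommend(basket):
    """Suggest items from rules whose antecedent is already in the basket."""
    basket = set(basket)
    best = {}
    for antecedent, consequent, confidence in rules:
        if antecedent <= basket:
            for item in consequent - basket:
                best[item] = max(confidence, best.get(item, 0.0))
    # Highest-confidence suggestions first.
    return sorted(best, key=best.get, reverse=True)


print(recommend({1}))
print(recommend({2, 3}))
```

With this toy data, a basket containing only item 1 still gets a suggestion, because the rule 1 -> 3 comes from the itemset [1,3]; restricting attention to [2,3,5] would miss it. Confidence alone is still not the whole story, so a real system would likely filter these rules further.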

Frequency is not sufficient for recommendations! Suppose 2 and 3 are bought together in 80% of cases, and 2, 3, and 5 are bought together in 60% of cases. Naively, in 6 out of 8 cases a customer who bought 2 and 3 will also buy 5, so the rule is correct 75% of the time. But this does not mean 5 is a good recommendation! Item 5 could be in 80% of all purchases, in which case a customer who bought 2 and 3 is actually 5 percentage points less likely to buy 5 than a customer in general, and we have a negative correlation here. That's why you need to look at lift, too, or at other measures like it; there are many.
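The arithmetic in this example can be checked with a few lines. This is only a sketch using the figures from the paragraph above; the 80% overall support for item 5 is the answer's "could be" assumption, and the function names are made up for illustration.

```python
def confidence(support_both, support_antecedent):
    """confidence(A -> B) = support(A and B) / support(A)."""
    return support_both / support_antecedent


def lift(support_both, support_antecedent, support_consequent):
    """lift(A -> B) = confidence(A -> B) / support(B).

    Above 1: A makes B more likely. Below 1: A makes B less likely.
    """
    return confidence(support_both, support_antecedent) / support_consequent


support_23 = 0.80   # items 2 and 3 bought together
support_235 = 0.60  # items 2, 3 and 5 bought together
support_5 = 0.80    # item 5 overall (assumed in the example)

conf = confidence(support_235, support_23)
rule_lift = lift(support_235, support_23, support_5)

print(f"confidence(2,3 -> 5) = {conf:.2f}")
print(f"lift(2,3 -> 5)       = {rule_lift:.4f}")
print("negatively correlated" if rule_lift < 1 else "not negatively correlated")
```

Here the confidence is 0.75 but the lift is 0.75 / 0.80 = 0.9375, which is below 1: buying 2 and 3 makes item 5 slightly less likely than it already is for an arbitrary customer, despite the high confidence.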
