Pandas: how to convert a list into a matrix grouped by a column?

Question

I have a pandas dataframe where the first column (Customer) holds the customer's name, and that name is repeated once for every product (Product) the customer has purchased:

Customer  Product  Count
John      A        1
John      B        1
John      C        1
Mary      A        1
Mary      B        1
Charles   A        1

I want to pivot this data into a new dataframe where both the rows and the columns are the product categories (Product) and each value is the number of customers who bought both products, as follows:

Product
       A     B     C
A      0     2     1
B      2     0     1
C      1     1     0

So if John bought A and also bought B, +1 is added to the A:B cell. He also bought A together with C, so the A:C cell gets +1, and so on. Note that Charles does not contribute to this dataframe because he bought only one product.

I tried to use pandas.pivot_table, but this is what I got:

df = pd.pivot_table(df, index=['Product'], columns=['Product'], values=['Customer'])

>> KeyError: 'Level Product not found'

What method and parameters should I use?

Recommended answer

With crosstab:

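# Self-merge on Customer to pair up the products each customer bought, then drop self-pairs
d1 = df.merge(df, on='Customer').query('Product_x != Product_y')
# Count how often each (Product_x, Product_y) pair occurs
pd.crosstab(d1.Product_x, d1.Product_y)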

Product_y  A  B  C
Product_x         
A          0  2  1
B          2  0  1
C          1  1  0
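
For reference, the snippet above can be run end to end as the sketch below. It rebuilds the sample data from the question; the variable names `pairs` and `co_matrix` are illustrative and not part of the original answer. The `_x` and `_y` suffixes are the ones `merge` adds by default to overlapping column names.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Customer": ["John", "John", "John", "Mary", "Mary", "Charles"],
    "Product": ["A", "B", "C", "A", "B", "A"],
    "Count": [1, 1, 1, 1, 1, 1],
})

# Self-merge on Customer: each product a customer bought is paired with
# every product the same customer bought (including itself).
pairs = df.merge(df, on="Customer")

# Drop the self-pairs (A with A, B with B, ...). Customers with a single
# product, such as Charles, have no rows left after this step.
pairs = pairs.query("Product_x != Product_y")

# Count how many times each ordered product pair occurs.
co_matrix = pd.crosstab(pairs["Product_x"], pairs["Product_y"])
print(co_matrix)
```

The result is symmetric because every unordered pair appears twice in the self-merge, once in each order.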

See this answer for a better idea of how to speed the crosstab up. The key insight for this problem is the self-merge.
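
As one possible alternative to the self-merge (a different technique, and not necessarily what the linked answer proposes), the same matrix can be obtained from a matrix product. The sketch below assumes each (Customer, Product) pair appears at most once, as in the sample data; `m`, `counts`, and `co_matrix` are illustrative names. It has not been benchmarked here, so any speed benefit is an assumption to verify on real data.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Customer": ["John", "John", "John", "Mary", "Mary", "Charles"],
    "Product": ["A", "B", "C", "A", "B", "A"],
    "Count": [1, 1, 1, 1, 1, 1],
})

# Customer x Product table: 1 where the customer bought the product, else 0.
m = pd.crosstab(df["Customer"], df["Product"])

# (Product x Customer) . (Customer x Product) gives a Product x Product table
# whose (i, j) cell counts the customers who bought both i and j.
counts = m.T.dot(m).to_numpy().copy()

# The diagonal counts customers per product; zero it to match the desired output.
np.fill_diagonal(counts, 0)

co_matrix = pd.DataFrame(counts, index=m.columns, columns=m.columns)
print(co_matrix)
```

This avoids building the intermediate table of product pairs, at the cost of holding the full Customer x Product table in memory.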
