How to draw a graphical count table in pandas
Question
I have a dataframe df with two columns, customer1 and customer2, which hold string values. I would like to make a square graphical representation of the count for each pair of values from those two columns.
I can do
df[['customer1', 'customer2']].value_counts()
which gives me the counts. But how can I make something that looks a little like the following (image not preserved) from the result?
I can't provide my real dataset, but here is a toy example with three labels, in CSV format:
customer1,customer2
a,b
a,c
a,c
b,a
b,c
b,c
c,c
a,a
b,c
b,c
Recommended answer
Update:
Is it possible to sort the rows/columns so that the rows with the highest counts are at the top? In this case the order would be b, a, c.
IIUC, you can do it this way:
In [80]: x = df.pivot_table(index='customer1',columns='customer2',aggfunc='size',fill_value=0)
In [81]: idx = x.max(axis=1).sort_values(ascending=False).index
In [82]: idx
Out[82]: Index(['b', 'a', 'c'], dtype='object', name='customer1')
In [87]: sns.heatmap(x[idx].reindex(idx), annot=True)
Out[87]: <matplotlib.axes._subplots.AxesSubplot at 0x9ee3f98>
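For readers who want to reproduce the sorted table outside an IPython session, here is a minimal, self-contained sketch of the steps above. The names CSV, counts, order, sorted_counts and plot_counts are illustrative and not part of the original answer; reindex(index=..., columns=...) is used in place of x[idx].reindex(idx) on the assumption that a label might be missing from one of the two columns in real data.

```python
import io

import pandas as pd

# Toy data from the question.
CSV = """customer1,customer2
a,b
a,c
a,c
b,a
b,c
b,c
c,c
a,a
b,c
b,c
"""

df = pd.read_csv(io.StringIO(CSV))

# Square count table: one row per customer1 label, one column per customer2 label.
counts = df.pivot_table(index="customer1", columns="customer2",
                        aggfunc="size", fill_value=0)

# Order the labels by the largest cell in each row, highest first
# (the same criterion as in the answer above).
order = counts.max(axis=1).sort_values(ascending=False).index

# Apply that order to both rows and columns; labels absent from one
# axis are filled with 0 instead of raising a KeyError.
sorted_counts = counts.reindex(index=order, columns=order, fill_value=0)


def plot_counts(table):
    """Draw an annotated heatmap of a count table (hypothetical helper)."""
    import matplotlib.pyplot as plt
    import seaborn as sns

    ax = sns.heatmap(table, annot=True)
    plt.show()
    return ax
```

Calling plot_counts(sorted_counts) should then draw the heatmap with b, a, c from top to bottom and from left to right. If the ordering should follow row totals rather than the largest single cell, counts.sum(axis=1) can be substituted for counts.max(axis=1); on this toy data both give the same order.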
Old answer:
You can use the heatmap() function from the seaborn module:
In [42]: import seaborn as sns
In [43]: df
Out[43]:
  customer1 customer2
0         a         b
1         a         c
2         a         c
3         b         a
4         b         c
5         b         c
6         c         c
7         a         a
8         b         c
9         b         c
In [44]: x = df.pivot_table(index='customer1',columns='customer2',aggfunc='size',fill_value=0)
In [45]: x
Out[45]:
customer2  a  b  c
customer1
a          1  1  2
b          1  0  4
c          0  0  1
In [46]: sns.heatmap(x)
Out[46]: <matplotlib.axes._subplots.AxesSubplot at 0xb150b70>
Or with annotations:
In [48]: sns.heatmap(x, annot=True)
Out[48]: <matplotlib.axes._subplots.AxesSubplot at 0xc596d68>
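The value_counts() result from the question can also be reshaped into the same square table, which may be handy if those counts are already computed. The sketch below shows two alternatives to pivot_table, Series.unstack and pd.crosstab; neither is what the answer itself uses, and the variable names are illustrative.

```python
import pandas as pd

# Toy data from the question, written out as a DataFrame.
df = pd.DataFrame({
    "customer1": list("aaabbbcabb"),
    "customer2": list("bccacccacc"),
})

# Counts per (customer1, customer2) pair, as in the question.
pair_counts = df[["customer1", "customer2"]].value_counts()

# Move customer2 from the row index into the columns; pairs that never
# occur become 0. Sort both axes so the label order is predictable.
table_from_counts = (
    pair_counts.unstack(fill_value=0).sort_index().sort_index(axis=1)
)

# pd.crosstab builds the same table directly from the two columns.
table_from_crosstab = pd.crosstab(df["customer1"], df["customer2"])
```

Either table can be passed to sns.heatmap(..., annot=True) in the same way as x above.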