Python数据框：在同一列上旋转 [英] Python dataframe: pivot on same column

查看：93 发布时间：2020/10/17 1:38:13 python python-3.x pandas dataframe

本文介绍了Python数据框：在同一列上旋转的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我有两列 ID和 division，如下所示。

  df = pd.DataFrame（np.array（[[''111'，'AAA']，['222'， 'AAA']，['333'，'BBB']，['444'，'CCC']，['444'，'AAA']，['222'，'BBB']，['111'， 'BBB']]），columns = ['ID'，'division']）
 
 ID划分
 0111 AAA 
 1222 AAA 
 2333 BBB 
 3444 CCC 
 4444 AAA 
 5222 BBB 
 6111 BBB

期望的输出如下所示，在这里我需要在同一列上进行透视，但是计数取决于除法。

  df = pd.DataFrame（np.array（[['0'，'2 '，'1'，'1']，['2'，'0'，'1'，'1']，['1'，'1'，'0'，'0']，['1 '，'1'，'0'，'0']]））列= ['111'，'222'，'333'，'444']，index = ['111'，'222'，'333 '，'444']）
 
 111222333444 
 111 0 2 1 1 
 222 2 0 1 1 
 333 1 1 0 0 
 444 1 1 0 0

因此，从技术上讲，我在ID与除法之间存在重叠。 / p>

例如：
红色突出显示的框，其中111和222 ID之间的重叠为2（AAA和BBB）。其中111和444之间的重叠为1（AAA在黑框中突出显示）。

我可以分2步在excel中完成此操作。不确定下面是否有帮助。
Step1：= SUM（COUNTIFS（$ B $ 2：$ B $ 8，$ B2，$ A $ 2：$ A $ 8，$ G2），COUNTIFS（$ B $ 2：$ B $ 8， $ B2，$ A $ 2：$ A $ 8，H $ 1））-1
步骤2：= IF（$ G12 = H $ 1,0，SUMIFS（H $ 2：H $ 8，$ G $ 2：$ G $ 8，$ G12））

但是有什么办法可以做到使用数据框的Python。
感谢您的帮助

案例2

 如果df = pd.DataFrame（np.array（[['111'，'AAA'，'4']，['222'，'AAA'，'5']，['333' ，'BBB'，'6']，
 ['444'，'CCC'，'3']，['444'，'AAA'，'2']，['222'，'BBB' ，'2']，
 ['111'，'BBB'，'7']]），columns = ['ID'，'division'，'count']）
 
 ID分区计数
 0111 AAA 4 
 1222 AAA 5 
 2333 BBB 6 
 3444 CCC 3 
 4444 AAA 2 
 5222 BBB 2 
 6111 BBB 7

预期产量为

  df_result = pd.DataFrame（np.array（[['0'，'18'，'13'，'6']，['18'， '0'，'8'，'7']，['13'，'8'，'0'，'0']，['6'，'7'，'0'，'0']]） ，columns = ['111'，'222'，'333'，'444']，index = ['111'，'222'，'333'，'444']）
 
 111 222333444 
 111 0 18 13 6 
 222 18 0 8 7 
 333 13 8 0 0 
 444 6 7 0 0

计算on：在这里，关于AAA和BBB，111和222之间有重叠，因此总和为4 + 5 + 2 + 7 = 18

解决方案

另一种方法是使用带有 merge 和 pd.crosstab 的自联接：

  df_out = df.merge（df，on ='division'）
 
结果= pd .crosstab（df_out.ID_x，df_out.ID_y）
 np.fill_diagonal（results.values，0）

输出：

  ID_y 111222333444 
 ID_x 
 111 0.0 2.0 1.0 1.0 
 222 2.0 0.0 1.0 1.0 
 333 1.0 1.0 0.0 0.0 
 444 1.0 1.0 0.0 0.0

案例2

  df = pd.DataFrame（np.array（[['111'，'AAA' ，'4']，['222'，'AAA'，'5']，['333'，'BBB'，'6']，
 ['444'，'CCC'，'3' ]，['444'，'AAA'，'2']，['222'，'BBB'，'2']，
 ['111'，'BBB'，'7']]））， column = ['ID'，'division'，'count ']）
 
 df ['count'] = df ['count']。astype（int）
 df_out = df.merge（df，on ='division'）
 
 df_out = df_out.assign（count = df_out.count_x + df_out.count_y）
 
结果= pd.crosstab（df_out.ID_x，df_out.ID_y，df_out ['count']， aggfunc ='sum'）。fillna（0）
 np.fill_diagonal（results.values，0）

输出：

  ID_y 111222333444 
 ID_x 
 111 0.0 18.0 13.0 6.0 
 222 18.0 0.0 8.0 7.0 
 333 13.0 8.0 0.0 0.0 
 444 6.0 7.0 0.0 0.0

I have two columns "ID" and "division" as shown below.

df = pd.DataFrame(np.array([['111', 'AAA'],['222','AAA'],['333','BBB'],['444','CCC'],['444','AAA'],['222','BBB'],['111','BBB']]),columns=['ID','division'])

    ID  division
0   111 AAA
1   222 AAA
2   333 BBB
3   444 CCC
4   444 AAA
5   222 BBB
6   111 BBB

The expected output is as shown below where I need to pivot on the same column but the count is dependent on "division". This should be presented in a heatmap.

    df = pd.DataFrame(np.array([['0','2','1','1'],['2','0','1','1'],['1','1','0','0'],['1','1','0','0']]),columns=['111','222','333','444'],index=['111','222','333','444'])

    111 222 333 444
111 0   2   1   1
222 2   0   1   1
333 1   1   0   0
444 1   1   0   0

So, technically I am doing an overlap between ID's with respect to division.

Example: The highlighted box in red where the overlap between 111 and 222 ID's is 2(AAA and BBB). where as the overlap between 111 and 444 is 1 (AAA highlighted in the black box).

I could do this in excel in 2 steps.Not sure if below one helps. Step1:=SUM(COUNTIFS($B$2:$B$8,$B2,$A$2:$A$8,$G2),COUNTIFS($B$2:$B$8,$B2,$A$2:$A$8,H$1))-1 Step2:=IF($G12=H$1,0,SUMIFS(H$2:H$8,$G$2:$G$8,$G12))

But is there any way that we can do it in Python using dataframes. Appreciate your help

Case-2

if df = pd.DataFrame(np.array([['111', 'AAA','4'],['222','AAA','5'],['333','BBB','6'],
                            ['444','CCC','3'],['444','AAA','2'], ['222','BBB','2'],
                            ['111','BBB','7']]),columns=['ID','division','count'])

   ID   division count
0   111  AAA      4
1   222  AAA      5
2   333  BBB      6
3   444  CCC      3
4   444  AAA      2
5   222  BBB      2
6   111  BBB      7

Expected output would be

df_result = pd.DataFrame(np.array([['0','18','13','6'],['18','0','8','7'],['13','8','0','0'],['6','7','0','0']]),columns=['111','222','333','444'],index=['111','222','333','444'])

   111 222  333 444
111 0   18  13  6
222 18  0   8   7
333 13  8   0   0
444 6   7   0   0

Calculation: Here there is an overlap between 111 and 222 with respect to divisions AAA and BBB hence the sum would be 4+5+2+7=18

解决方案

Another way to do this is to use a self join with merge and pd.crosstab:

df_out = df.merge(df, on='division')

results = pd.crosstab(df_out.ID_x, df_out.ID_y) 
np.fill_diagonal(results.values, 0)

Output:

ID_y  111  222  333  444
ID_x                    
111   0.0  2.0  1.0  1.0
222   2.0  0.0  1.0  1.0
333   1.0  1.0  0.0  0.0
444   1.0  1.0  0.0  0.0

Case 2

df = pd.DataFrame(np.array([['111', 'AAA','4'],['222','AAA','5'],['333','BBB','6'],
                            ['444','CCC','3'],['444','AAA','2'], ['222','BBB','2'],
                            ['111','BBB','7']]),columns=['ID','division','count'])

df['count'] = df['count'].astype(int)
df_out = df.merge(df, on='division')

df_out = df_out.assign(count = df_out.count_x + df_out.count_y)

results = pd.crosstab(df_out.ID_x, df_out.ID_y, df_out['count'], aggfunc='sum').fillna(0) 
np.fill_diagonal(results.values, 0)

Output:

ID_y   111   222   333  444
ID_x                       
111    0.0  18.0  13.0  6.0
222   18.0   0.0   8.0  7.0
333   13.0   8.0   0.0  0.0
444    6.0   7.0   0.0  0.0

这篇关于Python数据框：在同一列上旋转的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

Python数据框：在同一列上旋转 [英] Python dataframe: pivot on same column

问题描述

案例2

Case 2

相关文章

Python最新文章

热门教程

热门工具

登录关闭

Python数据框：在同一列上旋转 [英] Python dataframe: pivot on same column

问题描述

案例2

Case 2

相关文章

Python最新文章

热门教程

热门工具

登录 关闭

登录关闭