Grouping by unique values while transposing a column

Question

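The other day I asked a similar question using data from two columns:

Grouping columns by unique values in Python

Now I have three columns. The rows need to be grouped by column A, with the values of column B as the column headers and each value of column C placed under its matching header.

My data frame looks like this: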

    A   B   C
25115  20  45
25115  30  154
25115  40  87
25115  70  21
25115  90  74
26200  10  48
26200  20  414
26200  40  21
26200  50  288
26200  80  174
26200  90  54

But I need to end up with this:

       10   20   30   40   50   70   80   90
25115       45   154  87        21        74
26200  48   414       21   288       174  54

The code below collects the values of column C for each value of A, but it does not use column B as the column headers.

import pandas as pd
df = pd.DataFrame({'A':[25115,25115,25115,25115,25115,26200,26200,26200,26200,26200,26200],'B':[20,30,40,70,90,10,20,40,50,80,90],'C':[45,154,87,21,74,48,414,21,288,174,54]})
# Joins the C values of each A group into one space-separated string; column B is never used
a = df.groupby('A')['C'].apply(lambda x:' '.join(x.astype(str)))

Any ideas would be most appreciated.

Solution

  • Option 1:

Use pivot_table:

df.pivot_table(values='C',index='A',columns='B')

Output:

B        10     20     30    40     50    70     80    90
A                                                        
25115   NaN   45.0  154.0  87.0    NaN  21.0    NaN  74.0
26200  48.0  414.0    NaN  21.0  288.0   NaN  174.0  54.0
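
One thing worth knowing about Option 1: pivot_table aggregates, and its default aggfunc is the mean, so a repeated (A, B) pair would be averaged without any warning. The sketch below rebuilds the df from the question and passes aggfunc and fill_value explicitly. Both choices ('first' and 0) are illustrative assumptions rather than part of the original answer, and 0 may not be a sensible placeholder for real data.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [25115, 25115, 25115, 25115, 25115,
          26200, 26200, 26200, 26200, 26200, 26200],
    'B': [20, 30, 40, 70, 90, 10, 20, 40, 50, 80, 90],
    'C': [45, 154, 87, 21, 74, 48, 414, 21, 288, 174, 54],
})

# aggfunc='first' keeps the first C seen for each (A, B) pair instead of the mean.
# fill_value=0 replaces the NaN cells for (A, B) combinations absent from df.
wide = df.pivot_table(values='C', index='A', columns='B',
                      aggfunc='first', fill_value=0)
print(wide)
```

Since every (A, B) pair in the sample data is unique, the aggregation function does not change the numbers here; it only matters once duplicates appear.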

  • Option 2:

Use set_index / unstack:

df.set_index(['A','B'])['C'].unstack()

Output:

B        10     20     30    40     50    70     80    90
A                                                        
25115   NaN   45.0  154.0  87.0    NaN  21.0    NaN  74.0
26200  48.0  414.0    NaN  21.0  288.0   NaN  174.0  54.0
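
Unlike pivot_table, unstack does not aggregate: if an (A, B) pair occurred more than once it would raise a ValueError instead of combining the values. The values also come back as floats because NaN forces a float dtype. As a minimal sketch (not part of the original answer), the result can be converted to pandas' nullable Int64 dtype to keep whole numbers, and DataFrame.pivot can be used as a one-call equivalent of set_index followed by unstack.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [25115, 25115, 25115, 25115, 25115,
          26200, 26200, 26200, 26200, 26200, 26200],
    'B': [20, 30, 40, 70, 90, 10, 20, 40, 50, 80, 90],
    'C': [45, 154, 87, 21, 74, 48, 414, 21, 288, 174, 54],
})

# Option 2 from the answer: move A and B into the index, then pivot B into columns.
wide = df.set_index(['A', 'B'])['C'].unstack()

# The same reshape written as a single pivot call (no aggregation involved).
via_pivot = df.pivot(index='A', columns='B', values='C')

# Nullable integers: missing cells become <NA> and the rest stay whole numbers.
ints = wide.astype('Int64')
print(ints)
```

If blank cells are wanted purely for display, as in the desired output of the question, the missing entries can be replaced at printing or export time while the numeric frame is kept for further computation.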
