pandas: pivoting on rank


Problem description

Given this data:

import pandas as pd

df = pd.DataFrame({'id': ['aaa', 'aaa', 'abb', 'abb', 'abb', 'acd', 'acd', 'acd'],
                   'loc': ['US', 'UK', 'FR', 'US', 'IN', 'US', 'CN', 'CN']})

    id loc
0  aaa  US
1  aaa  UK
2  abb  FR
3  abb  US
4  abb  IN
5  acd  US
6  acd  CN
7  acd  CN

How do I pivot it to this:

 id   loc1   loc2   loc3
aaa    US     UK     None
abb    FR     US      IN
acd    US     CN      CN

I am looking for the most idiomatic method.

Solution

I think you can create a new column, cols, with groupby and cumcount, convert it to string with astype, and finally use pivot:

df['cols'] = 'loc' + (df.groupby('id')['id'].cumcount() + 1).astype(str)
print(df)
    id loc  cols
0  aaa  US  loc1
1  aaa  UK  loc2
2  abb  FR  loc1
3  abb  US  loc2
4  abb  IN  loc3
5  acd  US  loc1
6  acd  CN  loc2
7  acd  CN  loc3
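
To see what the cumcount step contributes on its own, here is a minimal sketch, assuming Python 3 and a reasonably recent pandas. cumcount numbers the rows inside each id group from 0 in order of appearance, which is why 1 is added before the labels are built. The names position and labels are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'id': ['aaa', 'aaa', 'abb', 'abb', 'abb', 'acd', 'acd', 'acd'],
                   'loc': ['US', 'UK', 'FR', 'US', 'IN', 'US', 'CN', 'CN']})

# Position of each row inside its 'id' group, counted from 0 in row order
position = df.groupby('id').cumcount()

# Shift to 1-based and prepend 'loc' to get the future column labels
labels = 'loc' + (position + 1).astype(str)

print(position.tolist())
print(labels.tolist())
```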

print(df.pivot(index='id', columns='cols', values='loc'))
cols loc1 loc2  loc3
id                  
aaa    US   UK  None
abb    FR   US    IN
acd    US   CN    CN

If you want to remove the index and column names, use rename_axis:

print(df.pivot(index='id', columns='cols', values='loc')
        .rename_axis(None)           # drop the index name ('id')
        .rename_axis(None, axis=1))  # drop the columns name ('cols')
    loc1 loc2  loc3
aaa   US   UK  None
abb   FR   US    IN
acd   US   CN    CN

Putting it all together (thank you, Colin):

print(pd.pivot(df['id'], 'loc' + (df.groupby('id').cumcount() + 1).astype(str), df['loc'])
        .rename_axis(None)
        .rename_axis(None, axis=1))

    loc1 loc2  loc3
aaa   US   UK  None
abb   FR   US    IN
acd   US   CN    CN    
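
The array-style pd.pivot call above was written for an older pandas and may not be accepted by recent releases. As a hedged alternative, the same steps can be chained with DataFrame.assign and DataFrame.pivot. The helper name pivot_on_rank and its key and value parameters are hypothetical, introduced only for this sketch.

```python
import pandas as pd


def pivot_on_rank(df, key='id', value='loc'):
    """Spread `value` into value1, value2, ... columns, one row per `key`.

    Hypothetical helper for illustration; it is not part of pandas.
    """
    # 1-based position of each row within its key group, used as column label
    labels = value + (df.groupby(key).cumcount() + 1).astype(str)
    return (df.assign(cols=labels)
              .pivot(index=key, columns='cols', values=value)
              .rename_axis(None)           # drop the index name
              .rename_axis(None, axis=1))  # drop the columns name


df = pd.DataFrame({'id': ['aaa', 'aaa', 'abb', 'abb', 'abb', 'acd', 'acd', 'acd'],
                   'loc': ['US', 'UK', 'FR', 'US', 'IN', 'US', 'CN', 'CN']})

wide = pivot_on_rank(df)
print(wide)
```

Because the column labels are strings, a group with ten or more rows would sort loc10 before loc2; pivoting on the integer position and renaming afterwards is one way around that.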

I also tried rank, but it raises an error in version 0.18.0:

print(df.groupby('id')['loc'].transform(lambda x: x.rank(method='first')))
#ValueError: first not supported for non-numeric data
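
The error message indicates that rank(method='first') was rejecting the string values in loc. One possible workaround, sketched here under the assumption that only order of appearance matters, is to rank a numeric row position instead. The result matches cumcount() + 1, so cumcount remains the simpler choice. The names position and rank_in_group are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'id': ['aaa', 'aaa', 'abb', 'abb', 'abb', 'acd', 'acd', 'acd'],
                   'loc': ['US', 'UK', 'FR', 'US', 'IN', 'US', 'CN', 'CN']})

# Numeric row position (0, 1, 2, ...) aligned with df's index
position = pd.Series(range(len(df)), index=df.index)

# Rank the positions within each id group; ranks are floats, so cast to int
rank_in_group = position.groupby(df['id']).rank(method='first').astype(int)

# Check that this gives the same numbering as the cumcount-based approach
same = bool((rank_in_group == df.groupby('id').cumcount() + 1).all())
print(rank_in_group.tolist(), same)
```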
