pandas: pivoting on rank
Question
Given this data:
import pandas as pd
df = pd.DataFrame({'id': ['aaa', 'aaa', 'abb', 'abb', 'abb', 'acd', 'acd', 'acd'],
                   'loc': ['US', 'UK', 'FR', 'US', 'IN', 'US', 'CN', 'CN']})
    id loc
0  aaa  US
1  aaa  UK
2  abb  FR
3  abb  US
4  abb  IN
5  acd  US
6  acd  CN
7  acd  CN
How do I pivot it into this:
id   loc1 loc2  loc3
aaa    US   UK  None
abb    FR   US    IN
acd    US   CN    CN
I am looking for the most idiomatic method.
Solution
I think you can create a new column cols with groupby and cumcount, convert it to string with astype, and finally use pivot:
# number the rows of each id 1, 2, 3, ... and build the labels loc1, loc2, loc3
df['cols'] = 'loc' + (df.groupby('id')['id'].cumcount() + 1).astype(str)
print(df)
    id loc  cols
0  aaa  US  loc1
1  aaa  UK  loc2
2  abb  FR  loc1
3  abb  US  loc2
4  abb  IN  loc3
5  acd  US  loc1
6  acd  CN  loc2
7  acd  CN  loc3
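To make the labeling step easier to follow, here is a minimal self-contained sketch of it in Python 3 syntax. The variable name position is only illustrative and does not appear in the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'id': ['aaa', 'aaa', 'abb', 'abb', 'abb', 'acd', 'acd', 'acd'],
    'loc': ['US', 'UK', 'FR', 'US', 'IN', 'US', 'CN', 'CN'],
})

# cumcount() numbers the rows of each group 0, 1, 2, ... in order of appearance
position = df.groupby('id').cumcount()

# Shift to 1-based numbering and prepend 'loc' to get the future column labels
df['cols'] = 'loc' + (position + 1).astype(str)
print(df)
```

Because the numbering restarts for every id, each (id, cols) pair is unique, which is what pivot needs in order to reshape without complaining about duplicate entries.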
print(df.pivot(index='id', columns='cols', values='loc'))
cols loc1 loc2  loc3
id
aaa    US   UK  None
abb    FR   US    IN
acd    US   CN    CN
If you want to remove the index and columns names, use rename_axis:
print(df.pivot(index='id', columns='cols', values='loc')
        .rename_axis(None)
        .rename_axis(None, axis=1))
    loc1 loc2  loc3
aaa   US   UK  None
abb   FR   US    IN
acd   US   CN    CN
All together, thank you Colin:
# pd.pivot(index, columns, values) as called in the original answer (pandas 0.18.0)
print(pd.pivot(df['id'], 'loc' + (df.groupby('id').cumcount() + 1).astype(str), df['loc'])
        .rename_axis(None)
        .rename_axis(None, axis=1))
    loc1 loc2  loc3
aaa   US   UK  None
abb   FR   US    IN
acd   US   CN    CN
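The positional pd.pivot(index, columns, values) call reflects the pandas version used in this answer; as far as I know, later releases changed that signature so that the first argument is the DataFrame itself. A rough equivalent that should run on current pandas, assuming only DataFrame.assign, DataFrame.pivot and rename_axis, could look like this (the name wide is illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    'id': ['aaa', 'aaa', 'abb', 'abb', 'abb', 'acd', 'acd', 'acd'],
    'loc': ['US', 'UK', 'FR', 'US', 'IN', 'US', 'CN', 'CN'],
})

wide = (
    # Add the loc1/loc2/loc3 labels without modifying df itself
    df.assign(cols='loc' + (df.groupby('id').cumcount() + 1).astype(str))
      # One row per id, one column per label
      .pivot(index='id', columns='cols', values='loc')
      # Drop the 'id' index name and the 'cols' columns name
      .rename_axis(None)
      .rename_axis(None, axis=1)
)
print(wide)
```

The missing third location for aaa comes out as a missing value; depending on the pandas version it may print as NaN rather than None.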
I tried rank, but I get an error in version 0.18.0:
print(df.groupby('id')['loc'].transform(lambda x: x.rank(method='first')))
#ValueError: first not supported for non-numeric data
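Note that rank orders by value, whereas cumcount orders by position, so the two would not label the rows the same way even if rank accepted strings here. If numbering by value is what you want, one possible workaround (an assumption on my part, not something from the original answer) is to sort within each id first and then apply cumcount. The names by_position and by_value are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'id': ['aaa', 'aaa', 'abb', 'abb', 'abb', 'acd', 'acd', 'acd'],
    'loc': ['US', 'UK', 'FR', 'US', 'IN', 'US', 'CN', 'CN'],
})

# Order of appearance within each id (what the pivot in this answer uses)
by_position = df.groupby('id').cumcount() + 1

# Alphabetical order of loc within each id: sort first, then number the rows.
# The result keeps the original index, so it aligns back to df on assignment.
by_value = df.sort_values(['id', 'loc']).groupby('id').cumcount() + 1

print(df.assign(by_position=by_position, by_value=by_value))
```

For the expected output in the question, where aaa keeps US before UK, the positional numbering is the one that matches.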