Pivot dataframe with duplicate values
Question
Consider the following pd.DataFrame:
import numpy as np
import pandas as pd
temp = pd.DataFrame({'label_0': [1, 1, 1, 2, 2, 2], 'label_1': ['a', 'b', 'c', np.nan, 'c', 'b'], 'values': [0, 2, 4, np.nan, 8, 5]})
print(temp)
   label_0 label_1  values
0        1       a     0.0
1        1       b     2.0
2        1       c     4.0
3        2     NaN     NaN
4        2       c     8.0
5        2       b     5.0
My desired output is:
  label_1    1    2
0       a  0.0  NaN
1       b  2.0  5.0
2       c  4.0  8.0
3     NaN  NaN  NaN
I have tried pd.pivot and wrangling with pd.groupby, but cannot get the desired output because of duplicate entries. Any help is much appreciated.
Recommended answer
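# Build a nested dict {label_1: {label_0: value}}; a repeated pair keeps the last value seen
d = {}
for _0, _1, v in zip(*map(temp.get, temp)):  # rows as (label_0, label_1, values)
    d.setdefault(_1, {})[_0] = v
# Outer keys become the index, inner keys become the columns
pd.DataFrame.from_dict(d, orient='index')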
       1    2
a    0.0  NaN
b    2.0  5.0
c    4.0  8.0
NaN  NaN  NaN
Or, to get label_1 back as a regular column:
pd.DataFrame.from_dict(d, orient='index').rename_axis('label_1').reset_index()
  label_1    1    2
0       a  0.0  NaN
1       b  2.0  5.0
2       c  4.0  8.0
3     NaN  NaN  NaN
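The sample frame in the question does not actually contain a repeated (label_1, label_0) pair, so the sketch below uses a small hypothetical frame, dup, that does. It is meant only to illustrate how duplicates behave: DataFrame.pivot raises a ValueError on a repeated pair, while the dictionary loop from the answer silently keeps the last value seen. pivot_table with aggfunc='last' is shown as one possible built-in alternative. It was not part of the original answer, and its handling of NaN labels is not covered here.

```python
import pandas as pd

# Hypothetical data: the pair ('a', 1) appears twice, with values 0.0 and 9.0.
dup = pd.DataFrame({
    'label_0': [1, 1, 2, 2],
    'label_1': ['a', 'a', 'a', 'b'],
    'values': [0.0, 9.0, 5.0, 7.0],
})

# DataFrame.pivot refuses to reshape when an index/column pair is repeated.
try:
    dup.pivot(index='label_1', columns='label_0', values='values')
    pivot_raised = False
except ValueError:
    pivot_raised = True

# The loop from the answer: a later row overwrites an earlier one for the same pair.
d = {}
for _0, _1, v in zip(dup['label_0'], dup['label_1'], dup['values']):
    d.setdefault(_1, {})[_0] = v
looped = pd.DataFrame.from_dict(d, orient='index')

# Alternative (an assumption, not from the answer): aggregate duplicates explicitly.
table = dup.pivot_table(index='label_1', columns='label_0',
                        values='values', aggfunc='last')

print(pivot_raised)
print(looped)
print(table)
```

Which duplicate should win (first, last, mean, and so on) depends on the data. The loop hard-codes "last", whereas pivot_table makes that choice visible through aggfunc.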