Flatten multiple columns in a dataframe to a single column
Problem description
I have a dataframe like this:
id other_id_1 other_id_2 other_id_3
1 100 101 102
2 200 201 202
3 300 301 302
I want this:
id other_id
1 100
1 101
1 102
2 200
2 201
2 202
3 300
3 301
3 302
I can get my desired output easily like this:
to_keep = {}
for idx in df.index:
    identifier = df.loc[idx]['id']
    to_keep[identifier] = []
    for col in ['other_id_1', 'other_id_2', 'other_id_3']:
        row_val = df.loc[idx][col]
        to_keep[identifier].append(row_val)
Which gives me this:
{1: [100, 101, 102], 2: [200, 201, 202], 3: [300, 301, 302]}
I can easily write that to a file. However, I am struggling to do this in native pandas. I would expect this apparent transposition to be more straightforward, but I can't work it out...
Recommended answer
Well, if you haven't already, set id as the index:
>>> df
   id  other_id_1  other_id_2  other_id_3
0   1         100         101         102
1   2         200         201         202
2   3         300         301         302
>>> df.set_index('id', inplace=True)
>>> df
    other_id_1  other_id_2  other_id_3
id
1          100         101         102
2          200         201         202
3          300         301         302
Then you can simply use pd.concat, which stacks the columns end to end into a single Series indexed by id:
>>> df = pd.concat([df[col] for col in df])
>>> df
id
1 100
2 200
3 300
1 101
2 201
3 301
1 102
2 202
3 302
dtype: int64
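The result above is a Series, not the two-column frame shown in the question. A minimal sketch of one way to get that layout, assuming the example data from the question; the names `flat` and `result` are only illustrative:

```python
import pandas as pd

# Example data copied from the question, with id set as the index.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "other_id_1": [100, 200, 300],
    "other_id_2": [101, 201, 301],
    "other_id_3": [102, 202, 302],
}).set_index("id")

# Iterating over a DataFrame yields its column labels, so this
# concatenates the three columns one after another into a Series.
flat = pd.concat([df[col] for col in df])

# Name the values, then move the id index back into a regular column.
result = flat.rename("other_id").reset_index()
print(result)
```

At this point the rows are still in column-by-column order (all of other_id_1, then other_id_2, and so on).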
And if you need the values sorted:
>>> df.sort_values()
id
1 100
1 101
1 102
2 200
2 201
2 202
3 300
3 301
3 302
dtype: int64
>>>
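Note that sort_values orders by the values themselves; it groups the rows by id here only because the example values happen to increase with id. As an alternative sketch (not the approach used in the answer above), DataFrame.melt followed by a stable sort on id should give the requested layout without relying on that coincidence. This assumes id is still a regular column, and `long_df` and `result` are illustrative names:

```python
import pandas as pd

# Example data copied from the question; id is a regular column here.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "other_id_1": [100, 200, 300],
    "other_id_2": [101, 201, 301],
    "other_id_3": [102, 202, 302],
})

# melt turns every non-id column into rows; the source column label
# lands in a column called "variable" by default.
long_df = df.melt(id_vars="id", value_name="other_id")

# mergesort is stable, so within each id the original column order
# (other_id_1, other_id_2, other_id_3) is preserved.
result = (
    long_df.drop(columns="variable")
    .sort_values("id", kind="mergesort")
    .reset_index(drop=True)
)
print(result)
```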