Pandas - Interleave / Zip two DataFrames by row
Problem description
Say I have two DataFrames:
>> df1
   0  1  2
0  a  b  c
1  d  e  f
>> df2
   0  1  2
0  A  B  C
1  D  E  F
How can I interleave their rows? For example:
>> interleaved_df
   0  1  2
0  a  b  c
1  A  B  C
2  d  e  f
3  D  E  F
(Note that my real DataFrames have identical columns, but not the same number of rows.)
Inspired by this question (very similar, but it asks about columns), I tried:
import pandas as pd
from itertools import chain, zip_longest
df1 = pd.DataFrame([['a','b','c'], ['d','e','f']])
df2 = pd.DataFrame([['A','B','C'], ['D','E','F']])
concat_df = pd.concat([df1,df2])
new_index = chain.from_iterable(zip_longest(df1.index, df2.index))
# new_index now holds the interleaved row indices
interleaved_df = concat_df.reindex(new_index)
ValueError: cannot reindex from a duplicate axis
The last call fails because df1 and df2 share some index values (which is also the case with my real DataFrames).
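A quick check makes the duplicate labels visible. This is only an illustrative sketch using the same toy frames as above:

```python
import pandas as pd

df1 = pd.DataFrame([['a', 'b', 'c'], ['d', 'e', 'f']])
df2 = pd.DataFrame([['A', 'B', 'C'], ['D', 'E', 'F']])
concat_df = pd.concat([df1, df2])

# Both frames contribute the labels 0 and 1, so each label appears twice
# and reindex cannot tell which of the two rows a label refers to.
print(concat_df.index.tolist())   # [0, 1, 0, 1]
print(concat_df.index.is_unique)  # False
```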
Any ideas?
Recommended answer
You can sort the index after concatenating and then reset it, i.e.:
import pandas as pd
df1 = pd.DataFrame([['a','b','c'], ['d','e','f']])
df2 = pd.DataFrame([['A','B','C'], ['D','E','F']])
concat_df = pd.concat([df1,df2]).sort_index().reset_index(drop=True)
Output:
   0  1  2
0  a  b  c
1  A  B  C
2  d  e  f
3  D  E  F
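The question mentions frames with different row counts, and `sort_index` defaults to `kind='quicksort'`, which is not guaranteed to be a stable sort. The sketch below is one way to cover both points. It assumes a three-row `df1` (made up for illustration) and passes `kind='mergesort'` so that rows sharing a label keep their concatenation order (`df1` first, then `df2`):

```python
import pandas as pd

# Hypothetical frames with unequal row counts, for illustration only.
df1 = pd.DataFrame([['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h', 'i']])
df2 = pd.DataFrame([['A', 'B', 'C'], ['D', 'E', 'F']])

interleaved_df = (
    pd.concat([df1, df2])
    .sort_index(kind='mergesort')  # stable sort: ties stay in concat order
    .reset_index(drop=True)        # replace duplicate labels with 0..n-1
)
print(interleaved_df)
```

Rows left over from the longer frame end up at the bottom, since their labels have no counterpart in the shorter one. Note that this relies on both frames carrying the default 0..n-1 index.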
EDIT (OmerB): In case you want to keep the row order regardless of the index values:
import pandas as pd
df1 = pd.DataFrame([['a','b','c'], ['d','e','f']]).reset_index()
df2 = pd.DataFrame([['A','B','C'], ['D','E','F']]).reset_index()
concat_df = pd.concat([df1,df2]).sort_index().set_index('index')
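To see what the edit does when the indexes are not the default 0..n-1, here is a small sketch. The labels 'x', 'y' and 'z' are invented for the example, and `kind='mergesort'` is an added assumption so that ties stay in concatenation order:

```python
import pandas as pd

# Hypothetical frames with arbitrary, partly overlapping index labels.
df1 = pd.DataFrame([['a', 'b', 'c'], ['d', 'e', 'f']], index=['x', 'y'])
df2 = pd.DataFrame([['A', 'B', 'C'], ['D', 'E', 'F']], index=['y', 'z'])

# reset_index() moves the original labels into an 'index' column and gives
# each frame a positional 0..n-1 index to interleave on.
parts = [df.reset_index() for df in (df1, df2)]

interleaved_df = (
    pd.concat(parts)
    .sort_index(kind='mergesort')  # interleave by row position
    .set_index('index')            # restore the original labels
)
print(interleaved_df)
```

The rows are interleaved by position (first row of each frame, then second row of each, and so on), while the resulting index carries the original labels, duplicates included.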