Pandas - Interleave / Zip two DataFrames by row

Question

Say I have two DataFrames:

>> df1

   0  1  2
0  a  b  c
1  d  e  f

>> df2

   0  1  2
0  A  B  C
1  D  E  F

How can I interleave the rows? For example:

>> interleaved_df

   0  1  2
0  a  b  c
1  A  B  C
2  d  e  f
3  D  E  F

(Note that my real DFs have identical columns, but not the same number of rows.)

Inspired by this question (very similar, but it asks about columns), I tried:

import pandas as pd
from itertools import chain, zip_longest

df1 = pd.DataFrame([['a','b','c'], ['d','e','f']])  
df2 = pd.DataFrame([['A','B','C'], ['D','E','F']])

concat_df = pd.concat([df1,df2])

new_index = chain.from_iterable(zip_longest(df1.index, df2.index))
# new_index now holds the interleaved row indices

interleaved_df = concat_df.reindex(new_index)

ValueError: cannot reindex from a duplicate axis

The last call fails because df1 and df2 share some index values (which is also the case with my real DFs).
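The failure can be reproduced in isolation. The sketch below is only an illustration using the same two frames: it checks whether the concatenated index is unique and records whether `reindex` raises. The names `index_labels`, `has_unique_index`, and `reindex_failed` are made up for this example.

```python
import pandas as pd

df1 = pd.DataFrame([['a', 'b', 'c'], ['d', 'e', 'f']])
df2 = pd.DataFrame([['A', 'B', 'C'], ['D', 'E', 'F']])
concat_df = pd.concat([df1, df2])

# Both frames carry the default labels 0 and 1, so every label appears twice.
index_labels = list(concat_df.index)
has_unique_index = concat_df.index.is_unique

# reindex cannot decide which of the duplicated rows a label refers to.
try:
    concat_df.reindex([0, 0, 1, 1])
    reindex_failed = False
except ValueError:
    reindex_failed = True
```

The exact wording of the error message differs between pandas versions, but the exception type is `ValueError` in both cases.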

Any ideas?

Recommended answer

You can sort the index after concatenating and then reset the index, i.e.:

import pandas as pd

df1 = pd.DataFrame([['a','b','c'], ['d','e','f']])  
df2 = pd.DataFrame([['A','B','C'], ['D','E','F']])

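# A stable sort (mergesort) keeps each df1 row ahead of the df2 row that shares its index value
concat_df = pd.concat([df1,df2]).sort_index(kind='mergesort').reset_index(drop=True)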

Output:


   0  1  2
0  a  b  c
1  A  B  C
2  d  e  f
3  D  E  F
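Since the question mentions frames with different numbers of rows, here is a small sketch of the same idea on unequal-length input. The three-row and one-row frames are invented for illustration, and `kind='mergesort'` is used on the assumption that ties between equal index values should keep their concatenation order.

```python
import pandas as pd

# Hypothetical frames with the same columns but different row counts.
df1 = pd.DataFrame([['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h', 'i']])
df2 = pd.DataFrame([['A', 'B', 'C']])

interleaved_df = (
    pd.concat([df1, df2])
    .sort_index(kind='mergesort')  # stable: the df1 row stays ahead of the df2 row with the same label
    .reset_index(drop=True)        # replace the duplicated labels with 0..n-1
)

rows = interleaved_df.values.tolist()
```

Rows of the longer frame that have no counterpart simply follow at the end, in their original order.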

EDIT (OmerB): To keep the row order regardless of the index values:

import pandas as pd
df1 = pd.DataFrame([['a','b','c'], ['d','e','f']]).reset_index()  
df2 = pd.DataFrame([['A','B','C'], ['D','E','F']]).reset_index()

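# A stable sort on the positional index interleaves the rows; set_index restores the original labels
concat_df = pd.concat([df1,df2]).sort_index(kind='mergesort').set_index('index')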
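To see what this variant does when the index labels are not the default 0, 1, ..., the sketch below uses arbitrary labels (10, 20 and 20, 5), which are assumptions made purely for illustration. Rows are interleaved by position, and the original labels come back as the index at the end.

```python
import pandas as pd

# Hypothetical frames with arbitrary, partly overlapping index labels.
df1 = pd.DataFrame([['a', 'b', 'c'], ['d', 'e', 'f']], index=[10, 20])
df2 = pd.DataFrame([['A', 'B', 'C'], ['D', 'E', 'F']], index=[20, 5])

# reset_index moves the old labels into a column named 'index'
# and gives each frame a fresh positional index 0..n-1.
left = df1.reset_index()
right = df2.reset_index()

interleaved_df = (
    pd.concat([left, right])
    .sort_index(kind='mergesort')  # stable sort on the positional index
    .set_index('index')            # restore the original labels
)

labels = interleaved_df.index.tolist()
first_col = interleaved_df[0].tolist()
```

Note that the resulting index can contain duplicate labels (20 appears twice here), so a later `reindex` on it would fail in the same way as in the question.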
