Concatenating Multiple DataFrames with Non-Standard Columns


Question

Is there a good way to concatenate a list of DataFrames whose columns are not the same from one DataFrame to the next?

The desired outcome is to align every column that has a match while keeping the unmatched columns off to the side. Unmatched columns are worth keeping because a column with no match between the 1st and 2nd DataFrames in the list may still have a match between the 1st and 3rd. Discarding a column at the first lack of a match would therefore not be ideal.

For example:

print(list(datalist[0].columns))
# ['1', '2', '3']

print(list(datalist[1].columns))
# ['1', '2', '4']

print(list(datalist[2].columns))
# ['2', '3', '4']

The output would be a DataFrame like this (shown schematically, with - marking a column that the source DataFrame lacks):

1 2 3 - 
1 2 - 4
- 2 3 4

Recommended answer

import pandas as pd
data = pd.concat(datalist, join='outer', axis=0, ignore_index=True)

This works. I was originally under the impression that concat with the join='outer' argument would just stack the frames top to bottom without regard to column names. In fact, with join='outer' it aligns the columns that match by name and also keeps every unmatched column off to the side of the DataFrame, filling the missing entries with NaN, which is exactly what is desired. Hope this helps someone else.
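As a rough illustration of the behaviour described in the answer, here is a minimal sketch. The three single-row frames and their values are made up for this example; only the column names come from the question. It also shows join='inner' for contrast, which keeps only the columns shared by every frame.

```python
import pandas as pd

# Hypothetical stand-ins for the three frames in the question;
# the cell values are invented, only the column names are from the post.
datalist = [
    pd.DataFrame({'1': [10], '2': [20], '3': [30]}),
    pd.DataFrame({'1': [11], '2': [21], '4': [41]}),
    pd.DataFrame({'2': [22], '3': [32], '4': [42]}),
]

# Outer join: columns are aligned by name, and a column that a
# given frame lacks is filled with NaN for that frame's rows.
data = pd.concat(datalist, join='outer', axis=0, ignore_index=True)

# Inner join, for contrast: only columns present in every frame survive.
inner = pd.concat(datalist, join='inner', axis=0, ignore_index=True)

print(data)
print(list(inner.columns))
```

With these inputs the outer result has one row per input frame and the union of all four columns, while the inner result keeps only column '2', the one column common to all three frames.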
