Merge multiple data frames with different dimensions using Pandas
Question
I have the following data frames (in reality there are more than 3).
import pandas as pd
df1 = pd.DataFrame({'head1': ['foo', 'bix', 'bar'],'val': [11, 22, 32]})
df2 = pd.DataFrame({'head2': ['foo', 'xoo', 'bar','qux'],'val': [1, 2, 3,10]})
df3 = pd.DataFrame({'head3': ['xoo', 'bar',],'val': [20, 100]})
# Note that the value in column 'head' is always unique
What I want to do is merge them based on the head column. Whenever a head value does not exist in one data frame, it should be assigned NA.
In the end it should look like this:
      head1  head2  head3
-------------------------------
foo   11     1      NA
bix   22     NA     NA
bar   32     3      100
xoo   NA     2      20
qux   NA     10     NA
How can I achieve that using Pandas?
Recommended answer
You can use pandas.concat with axis=1 to concatenate your multiple DataFrames.
Note, however, that I first set the index of df1, df2, and df3 to the labels (foo, bar, etc.) rather than the default integers, so that concat aligns the rows on those labels.
import pandas as pd
df1 = pd.DataFrame({'head1': ['foo', 'bix', 'bar'],'val': [11, 22, 32]})
df2 = pd.DataFrame({'head2': ['foo', 'xoo', 'bar','qux'],'val': [1, 2, 3,10]})
df3 = pd.DataFrame({'head3': ['xoo', 'bar',],'val': [20, 100]})
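# Use the label column as the index so rows align on foo, bar, etc.
df1 = df1.set_index('head1')
df2 = df2.set_index('head2')
df3 = df3.set_index('head3')
# Concatenate side by side; labels missing from a frame become NaN
df = pd.concat([df1, df2, df3], axis=1)
# Every column is still named 'val' at this point, so rename them
columns = ['head1', 'head2', 'head3']
df.columns = columns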
print(df)
     head1  head2  head3
bar     32      3    100
bix     22    NaN    NaN
foo     11      1    NaN
qux    NaN     10    NaN
xoo    NaN      2     20
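Since the question mentions more than three data frames, the same idea can be written as a loop. This is only a sketch under an assumption that is not stated in the original: every frame keeps its labels in its first column and its values in a column named 'val'. The names frames, value_series, and merged are made up for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'head1': ['foo', 'bix', 'bar'], 'val': [11, 22, 32]})
df2 = pd.DataFrame({'head2': ['foo', 'xoo', 'bar', 'qux'], 'val': [1, 2, 3, 10]})
df3 = pd.DataFrame({'head3': ['xoo', 'bar'], 'val': [20, 100]})

# Hypothetical list holding however many frames there are
frames = [df1, df2, df3]

# Assumption: the first column of each frame holds the labels,
# and the values live in a column called 'val'.
names = [frame.columns[0] for frame in frames]
value_series = [frame.set_index(frame.columns[0])['val'] for frame in frames]

# keys= names the resulting columns, so no separate rename step is needed.
# sort_index() makes the row order explicit instead of relying on concat's default.
merged = pd.concat(value_series, axis=1, keys=names).sort_index()

print(merged)
```

Labels that are missing from a frame still come out as NaN, exactly as in the three-frame version above. Passing keys just replaces the manual df.columns assignment.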