pandas: cross join of two DataFrames
Problem description
I can't find anything about cross joins in the merge/join documentation or anywhere else. I need to process two DataFrames with my own function, myfunc. The equivalent of:
    t = []
    for _, row_a in df1.iterrows():
        for _, row_b in df2.iterrows():
            # call myfunc on every pair of values from the two "A" columns
            t.append(myfunc(row_a["A"], row_b["A"]))
or, in SQL terms:
    SELECT myfunc(df1.A, df2.A), df1.A, df2.A FROM df1, df2;
But I need a more efficient solution. If apply is the way to go, how would I implement it? Thanks.
Recommended answer
For the cross product, see this question.
Essentially, you do a normal merge but give every row the same key to join on, so that every row of one frame is joined to every row of the other.
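As a minimal sketch of the constant-key idea, assuming two small made-up frames that each have a column A (the data and the helper column name key are illustrative only, not from the question):

```python
import pandas as pd

# Hypothetical toy frames; only the column name "A" comes from the question.
df1 = pd.DataFrame({"A": [1, 2]})
df2 = pd.DataFrame({"A": [10, 20, 30]})

# Add the same constant key to both frames, merge on it, then drop it again.
crossed = pd.merge(df1.assign(key=1), df2.assign(key=1), on="key").drop(columns="key")

print(crossed.shape)          # 2 rows x 3 rows = 6 pairs, two columns
print(list(crossed.columns))  # the shared column "A" is suffixed by the merge
```

Using assign keeps the helper column out of the original frames, so df1 and df2 are left unchanged.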
You can then add a column to the new frame by applying your function:
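import pandas as pd
# 'key' is a constant helper column added to both frames so that every row matches every row
new_df = pd.merge(df1.assign(key=1), df2.assign(key=1), on='key').drop(columns='key')
new_df['new_col'] = new_df.apply(lambda row: myfunc(row['A_x'], row['A_y']), axis=1)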
axis=1 makes .apply call the function once per row. 'A_x' and 'A_y' are the default column names in the resulting frame when the merged frames share a column name, as in your example.
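On the efficiency point raised in the question: row-wise apply still calls myfunc once per pair in Python. If myfunc can be written with column-wise operations, it can be called on whole columns instead. The sketch below assumes pandas 1.2 or later, where merge accepts how="cross", and uses a made-up myfunc (a simple sum) and made-up data purely for illustration:

```python
import pandas as pd

def myfunc(a, b):
    # Hypothetical stand-in for the asker's function; works on scalars and on whole Series.
    return a + b

# Hypothetical toy frames; only the column name "A" comes from the question.
df1 = pd.DataFrame({"A": [1, 2]})
df2 = pd.DataFrame({"A": [10, 20, 30]})

# how="cross" (pandas >= 1.2) builds the Cartesian product without a helper key.
new_df = df1.merge(df2, how="cross")

# Vectorized: one call on whole columns instead of one call per row.
new_df["new_col"] = myfunc(new_df["A_x"], new_df["A_y"])

# Row-wise apply yields the same values; it is shown here only as a cross-check.
row_wise = new_df.apply(lambda row: myfunc(row["A_x"], row["A_y"]), axis=1)
assert new_df["new_col"].tolist() == row_wise.tolist()
```

Whether the vectorized form is possible depends on what myfunc actually does; if it cannot operate on whole columns, the apply version remains the fallback.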