pandas: cross join of two DataFrames

Question

I can't find anything about cross joins in the merge/join documentation or elsewhere. I need to process two DataFrames with my own function, myfunc. The equivalent of:

for _, row_a in df1.iterrows():
    for _, row_b in df2.iterrows():
        t["A"] = myfunc(row_a["A"], row_b["A"])

which is equivalent to the SQL:

select myfunc(df1.A, df2.A), df1.A, df2.A from df1, df2;

But I need a more efficient solution. If apply can be used for this, how would I implement it? Thanks.

Recommended answer

For the cross product, see this question.

Essentially, you do a normal merge but give every row in both frames the same key to join on, so that each row of one frame is paired with every row of the other.
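The constant-key merge described above can be sketched as follows. This is a minimal illustration, not the answerer's exact code: the sample data and the helper column name "_key" are assumptions made for the example.

```python
import pandas as pd

# Sample frames (assumed data, for illustration only).
df1 = pd.DataFrame({"A": [1, 2]})
df2 = pd.DataFrame({"A": [10, 20, 30]})

# Give every row the same key so each row of df1 matches every row of df2.
# "_key" is a hypothetical helper column; it is dropped after the merge.
crossed = pd.merge(
    df1.assign(_key=1),
    df2.assign(_key=1),
    on="_key",
).drop(columns="_key")

print(crossed.shape)          # 2 rows x 3 rows = 6 pairs, 2 columns
print(list(crossed.columns))  # the shared column "A" becomes A_x and A_y
```

On pandas 1.2 or later, pd.merge(df1, df2, how="cross") should give the same pairs without a helper column.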

You can then add a column to the new frame by applying your function:

# key names a column that holds the same constant value in both frames
new_df = pd.merge(df1, df2, on=key)
new_df['new_col'] = new_df.apply(lambda row: myfunc(row['A_x'], row['A_y']), axis=1)

axis=1 makes .apply call the function once per row. 'A_x' and 'A_y' are the default column names in the resulting frame when the merged frames share a column name, as they do in your example.
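Putting the two steps together, a self-contained sketch might look like this. The body of myfunc, the sample data, and the column name "key" are all assumptions, since the question does not show them. The last line shows an alternative that only works if myfunc happens to accept whole columns at once; that is an assumption about the function, not something the question states.

```python
import pandas as pd

def myfunc(a, b):
    # Hypothetical stand-in for the asker's function.
    return a * b

# Sample frames (assumed data, for illustration only).
df1 = pd.DataFrame({"A": [1, 2]})
df2 = pd.DataFrame({"A": [10, 20]})

# Cross join through a constant helper column named "key" (assumed name).
new_df = pd.merge(df1.assign(key=1), df2.assign(key=1), on="key").drop(columns="key")

# Row-wise: myfunc is called once for every pair of rows.
new_df["new_col"] = new_df.apply(lambda row: myfunc(row["A_x"], row["A_y"]), axis=1)

# Column-wise alternative, valid only if myfunc works element-wise on Series.
new_df["new_col_vec"] = myfunc(new_df["A_x"], new_df["A_y"])

print(new_df)
```

The row-wise version still calls Python code once per pair, so it mainly saves the nested loops; the column-wise version is usually the faster of the two when the function allows it.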
