Comparing values from different DataFrames line by line in Python

Question

I have two DataFrames with different numbers of rows.

X and Y are the coordinates of a 2D position.

DF1:

X,Y,C
1,1,12
2,2,22
3,3,33
4,4,45
5,5,43
6,6,56

DF2 (X,Y is the start corner of a square; the next two columns, X1,Y1, are the end corner of the square):

X,Y,X1,Y1
1,1,3,3
2,2,4,4

Part of my code:

A = (abs(DF1['X']).values > abs(DF2['X']).values)
B = (abs(DF1['Y']).values > abs(DF2['Y']).values)
C = (abs(DF1['X']).values < abs(DF2['X1']).values)
D = (abs(DF1['Y']).values < abs(DF2['Y1']).values)
RESULT = A & B & C & D
result=DF1[RESULT]

Also: I could use only two columns from DF2, in which case RESULT would use only A & B; this is just an example. Right now the two X,Y pairs give me the range of values.

When DF2 has only one row, this works. With more than one row I get:
ValueError: operands could not be broadcast together with shapes
I know I need a rule so that all rows are compared, but I don't know how to write it. I tried diff, but without good results.

Output: I need to get rid of this error and process DF2 line by line. For each row in DF2 I need a separate result. For row 1:

X,Y,C
2,2,22

For row 2:

X,Y,C
3,3,33

After checking each row, I need to save the resulting DataFrames to one file. So in this example the file will contain:

2,2,22
3,3,33

Thanks for any suggestions.

Edit (for Tbaki):

def isInSquare(row, df2):
    df2=result_from_other_def1.df1
    df1=result_from_other_def2.df2

    if (row.X > df2.iloc[0].X) and (row.Y > df2.iloc[1].Y):
        if (row.X < df2.iloc[0].X1) and (row.Y < df2.iloc[1].Y2):
            if (row.X < df2.iloc[1].X) and (row.Y < df2.iloc[1].Y):
                if (row.X > df2.iloc[0].X) and (row.Y > df2.iloc[1].Y2):
                    return True
    return False

DF1.apply(lambda x: isInSquare(x, DF2), axis=1)  # if I leave this line here, tkinter runs it automatically, so in my opinion it should go inside a function definition

Also, I don't know how many rows DF1 and DF2 will have. Thanks.

Recommended answer

Check this code, which checks for a 5x5 square (corners at (1,1) and (5,5)).

import pandas as pd

DF1 = pd.DataFrame({"X":[1,2,3,4,5,6],"Y":[1,2,3,4,5,6],"C":[12,22,33,45,13,56]})
DF2 = pd.DataFrame({"X":[1,5],"Y":[1,1],"X1":[5,1],"Y1":[5,5]})

def isInSquare(row, df2):
    # Row 0 of df2 holds the lower-left (X, Y) and upper-right (X1, Y1) corners.
    c1 = (row.X > df2.iloc[0].X) and (row.Y > df2.iloc[0].Y)
    c1 = c1 and (row.X < df2.iloc[0].X1) and (row.Y < df2.iloc[0].Y1)
    # Row 1 of df2 holds the lower-right (X, Y) and upper-left (X1, Y1) corners.
    c1 = c1 and (row.X < df2.iloc[1].X) and (row.Y > df2.iloc[1].Y)
    c1 = c1 and (row.X > df2.iloc[1].X1) and (row.Y < df2.iloc[1].Y1)
    return c1

# Keep only the rows of DF1 that lie strictly inside the square.
DF_NEW = DF1[DF1.apply(lambda x: isInSquare(x, DF2), axis=1)]

Output

    C   X   Y
1   22  2   2
2   33  3   3
3   45  4   4
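
The row-wise apply above avoids the question's ValueError, which arises because a length-6 array (DF1) is compared elementwise with a length-2 array (DF2). As a rough sketch of another way around it, using the question's data, adding an axis to the DF1 values lets NumPy compare every DF1 row against every DF2 row at once. The variable names x, y and inside are illustrative and not from the original post.

```python
import pandas as pd

DF1 = pd.DataFrame({"X": [1, 2, 3, 4, 5, 6],
                    "Y": [1, 2, 3, 4, 5, 6],
                    "C": [12, 22, 33, 45, 43, 56]})
DF2 = pd.DataFrame({"X": [1, 2], "Y": [1, 2], "X1": [3, 4], "Y1": [3, 4]})

# Shape (6, 1): one row per DF1 point, so it broadcasts against shape (2,).
x = DF1["X"].abs().to_numpy()[:, None]
y = DF1["Y"].abs().to_numpy()[:, None]

# inside[i, j] is True when DF1 row i lies strictly inside DF2 square j.
inside = ((x > DF2["X"].abs().to_numpy()) & (y > DF2["Y"].abs().to_numpy()) &
          (x < DF2["X1"].abs().to_numpy()) & (y < DF2["Y1"].abs().to_numpy()))

print(inside.shape)        # one column of booleans per DF2 row
print(DF1[inside[:, 0]])   # points inside the first square
print(DF1[inside[:, 1]])   # points inside the second square
```

Each column of inside can then be used as a boolean mask on DF1, which gives the separate per-row results the question asks for.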

If you want to keep the maximum value of C:

DF_NEW = DF_NEW.groupby(["X","Y"]).max().reset_index()
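
The question also asks for a separate result for each row of DF2, with all results saved to one file. A minimal sketch of that, assuming each DF2 row describes one square with corners (X, Y) and (X1, Y1) and any number of rows in either frame; the helper name points_in_square and the file name result.csv mentioned in the comment are hypothetical.

```python
import pandas as pd

DF1 = pd.DataFrame({"X": [1, 2, 3, 4, 5, 6],
                    "Y": [1, 2, 3, 4, 5, 6],
                    "C": [12, 22, 33, 45, 43, 56]})
DF2 = pd.DataFrame({"X": [1, 2], "Y": [1, 2], "X1": [3, 4], "Y1": [3, 4]})

def points_in_square(df1, sq):
    """Return the rows of df1 strictly inside the square described by sq."""
    mask = ((df1["X"].abs() > abs(sq.X)) & (df1["Y"].abs() > abs(sq.Y)) &
            (df1["X"].abs() < abs(sq.X1)) & (df1["Y"].abs() < abs(sq.Y1)))
    return df1[mask]

# One filtered DataFrame per DF2 row, however many rows DF2 has.
results = [points_in_square(DF1, sq) for sq in DF2.itertuples(index=False)]

# Stack the per-row results and serialise them as a single CSV.
combined = pd.concat(results, ignore_index=True)
csv_text = combined.to_csv(index=False)  # pass a path such as "result.csv" to write a file instead
print(csv_text)
```

Because each DF2 row is compared against the whole of DF1 on its own, the shapes always match and the broadcasting error does not occur.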
