Check for a one-on-one relationship between two columns

Question

I have two columns, A and B, in a pandas dataframe, where values are repeated multiple times. For each unique value in A, B is expected to have a corresponding unique value as well (see the example below, given as two lists). Since each value in each column is repeated multiple times, I would like to check whether a one-to-one relationship exists between the two columns. Is there a built-in function in pandas to check that? If not, is there an efficient way to do it?

Example:

A = [1, 3, 3, 2, 1, 2, 1, 1]
B = [5, 12, 12, 10, 5, 10, 5, 5]

Here, for each 1 in A, the corresponding value in B is always 5 and nothing else. Similarly, 2 --> 10 and 3 --> 12. Hence, each number in A has exactly one corresponding number in B. I call this a one-on-one relationship. Now I want to check whether such a relationship exists between two columns in a pandas dataframe.

Example where the relationship is not satisfied:

A = [1, 3, 3, 2, 1, 2, 1, 1]
B = [5, 12, 12, 10, 5, 10, 7, 5]

Here, 1 in A doesn't have a unique corresponding value in B. It has two corresponding values, 5 and 7. Hence, the relationship is not satisfied.

Recommended answer

Suppose you have a dataframe:

 import pandas as pd

 d = pd.DataFrame({'A': [1, 3, 1, 2, 1, 3, 2], 'B': [4, 6, 4, 5, 4, 6, 5]})

d has a groupby method, which returns a GroupBy object. This is the interface for grouping rows, for example by equal values in a column.

 gb = d.groupby('A')
 grouped_b_column = gb['B']
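To see what the grouping actually contains, here is a small sketch that collects the B values of each group into a list. The frame is rebuilt so the snippet runs on its own, and the name b_values_per_a is only illustrative.

```python
import pandas as pd

d = pd.DataFrame({'A': [1, 3, 1, 2, 1, 3, 2], 'B': [4, 6, 4, 5, 4, 6, 5]})

gb = d.groupby('A')
grouped_b_column = gb['B']

# Collect the B values that fall into each A group (illustrative name).
b_values_per_a = grouped_b_column.apply(list)
print(b_values_per_a)
```

Each A value is paired with every B value that appears next to it, so a one-to-one mapping shows up as lists whose entries are all identical.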

On the grouped rows you can perform an aggregation. Let's find the min and max value in every group.

# Named aggregation: the keyword names become the output column names.
res = grouped_b_column.agg(amin='min', amax='max')

>>> print(res)
   amin  amax
A            
1     4     4
2     5     5
3     6     6

Now we just need to check that amin and amax are equal in every group, which means every group consists of equal B values:

res['amin'].equals(res['amax'])
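A closely related check, shown here as an alternative rather than as the method above, counts the distinct B values per group with nunique. It does not require B to be orderable, which min and max do. The variable names are illustrative, and the frame is rebuilt so the sketch is self-contained.

```python
import pandas as pd

d = pd.DataFrame({'A': [1, 3, 1, 2, 1, 3, 2], 'B': [4, 6, 4, 5, 4, 6, 5]})

# Number of distinct B values seen for each value of A.
distinct_b_per_a = d.groupby('A')['B'].nunique()

# True when every A value is paired with exactly one distinct B value.
a_determines_b = bool((distinct_b_per_a == 1).all())
print(a_determines_b)
```

Note that nunique ignores missing values by default, so a group whose B values are all NaN counts as zero distinct values and fails this test.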

If this check passes, then every A has a unique B. You should then check the same criterion with the A and B columns swapped.
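Putting both directions together, a minimal sketch might look like the following. The helper names maps_to_single_value and is_one_to_one are hypothetical (they are not pandas functions), and the two test frames are the examples from the question.

```python
import pandas as pd

def maps_to_single_value(frame, key, value):
    """Hypothetical helper: True if each `key` value has one `value` value."""
    # Same min/max comparison as above, applied per group of `key`.
    res = frame.groupby(key)[value].agg(amin='min', amax='max')
    return bool(res['amin'].equals(res['amax']))

def is_one_to_one(frame, left, right):
    """Hypothetical helper: run the check in both directions."""
    return (maps_to_single_value(frame, left, right)
            and maps_to_single_value(frame, right, left))

# The two examples from the question.
good = pd.DataFrame({'A': [1, 3, 3, 2, 1, 2, 1, 1],
                     'B': [5, 12, 12, 10, 5, 10, 5, 5]})
bad = pd.DataFrame({'A': [1, 3, 3, 2, 1, 2, 1, 1],
                    'B': [5, 12, 12, 10, 5, 10, 7, 5]})

print(is_one_to_one(good, 'A', 'B'))
print(is_one_to_one(bad, 'A', 'B'))
```

The second direction matters: with A = [1, 2] and B = [5, 5], every A has a single B, but 5 maps back to two different A values, so the relationship is not one-to-one.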
