Pandas: Efficient way to check if a value in column A is in a list of values in column B

Published: 2020/5/24 4:14:48 | Tags: python, list, pandas, contains


Problem description

My initial dataframe looks like this:

 A   | B
-----------------
 'a' | ['1', 'a', 'b']        
 '1' | ['2', '5', '6']   
 'd' | ['a', 'b', 'd']        
 'y' | ['x', '1', 'y']

For each row, I want to check whether the value in A is in the corresponding list in B; for the first row, whether 'a' is in ['1', 'a', 'b'].

I can do this with apply:

df.apply(lambda row: row['A'] in row['B'], axis=1)

This gives me the expected result:

[True, False, True, True]

But on my real data (millions of rows) this is very heavy and takes ages. Is there a more efficient way to do the same thing, for example using NumPy element-wise operations or anything else?

Recommended answer

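If you convert each column to sets (the dataframe is named d here), you can use <= to compare pairwise subsets. The strict form < tests for a proper subset, so it returns False for a row whose list contains only the value itself.

a = d.A.apply(lambda x: {x})
b = d.B.apply(set)

a <= b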

0     True
1    False
2     True
3     True
dtype: bool
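
The difference between the subset operator `<=` and the proper-subset operator `<` only shows up when a list holds nothing but the value itself. The following is a minimal sketch of the set-based approach, assuming the question's data plus one hypothetical extra row (`'z'` with the list `['z']`) added purely to expose that edge case:

```python
import pandas as pd

# Sample frame mirroring the question, plus one hypothetical extra row
# whose list contains only the value itself.
d = pd.DataFrame({
    "A": ["a", "1", "d", "y", "z"],
    "B": [["1", "a", "b"], ["2", "5", "6"], ["a", "b", "d"],
          ["x", "1", "y"], ["z"]],
})

a = d["A"].apply(lambda x: {x})  # one-element set per row
b = d["B"].apply(set)            # list -> set per row

# Element-wise set comparison on the two object Series.
subset = (a <= b).tolist()  # True whenever the value is in the list
proper = (a < b).tolist()   # additionally requires the list to hold something else

print(subset)  # [True, False, True, True, True]
print(proper)  # [True, False, True, True, False]
```

For a plain membership test, `<=` matches what `a in b` would return; `<` disagrees only on the last row.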

Otherwise, you can use a list comprehension with zip:

[a in b for a, b in zip(d.A.values.tolist(), d.B.values.tolist())]

[True, False, True, True]
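
As a self-contained sketch of the comprehension, the example below rebuilds the question's data and stores the result back on the frame. The column name `A_in_B` and the variable `matches` are illustrative assumptions, not part of the original answer:

```python
import pandas as pd

d = pd.DataFrame({
    "A": ["a", "1", "d", "y"],
    "B": [["1", "a", "b"], ["2", "5", "6"], ["a", "b", "d"], ["x", "1", "y"]],
})

# Plain Python membership test per row; zip walks both columns in lockstep.
mask = [a in b for a, b in zip(d["A"].tolist(), d["B"].tolist())]

# Store the result as a boolean column (the name "A_in_B" is illustrative).
d["A_in_B"] = mask

# The same column can be used to keep only the rows where A appears in B.
matches = d[d["A_in_B"]]

print(mask)                   # [True, False, True, True]
print(matches["A"].tolist())  # ['a', 'd', 'y']
```

Converting the columns to Python lists first avoids per-element pandas overhead inside the loop, which is likely where most of the gain over a row-wise `apply` comes from.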

Timing on small data (benchmark figure not preserved)

Timing on large data (benchmark figure not preserved)
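
To run a comparison like this on your own machine, a rough sketch with `timeit` follows. The helper `make_frame`, the three wrapper functions, the synthetic letter data, and the row counts are all assumptions for illustration. No timings are quoted because they depend on hardware and pandas version.

```python
import timeit

import numpy as np
import pandas as pd


def make_frame(n_rows, list_len=3, seed=0):
    """Build a synthetic frame: A is a single letter, B a list of letters."""
    rng = np.random.default_rng(seed)
    letters = np.array(list("abcdefghij"))
    a = rng.choice(letters, size=n_rows)
    b = rng.choice(letters, size=(n_rows, list_len))
    return pd.DataFrame({"A": a.tolist(), "B": b.tolist()})


def with_apply(d):
    # Row-wise apply, as in the question.
    return d.apply(lambda row: row["A"] in row["B"], axis=1).tolist()


def with_sets(d):
    # Pairwise subset comparison on Series of sets.
    return (d["A"].apply(lambda x: {x}) <= d["B"].apply(set)).tolist()


def with_zip(d):
    # List comprehension over plain Python lists.
    return [a in b for a, b in zip(d["A"].tolist(), d["B"].tolist())]


results = {}
for n_rows in (100, 10_000):  # raise the second value to stress-test
    d = make_frame(n_rows)
    for fn in (with_apply, with_sets, with_zip):
        # Best of three single runs, in seconds.
        seconds = min(timeit.repeat(lambda: fn(d), number=1, repeat=3))
        results[(n_rows, fn.__name__)] = seconds
        print(f"{n_rows:>7} rows  {fn.__name__:<10}  {seconds:.4f} s")
```

Taking the minimum of several repeats reduces noise from other processes. All three functions should return the same list for a given frame, which is worth asserting before trusting any timing.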
