How to filter a pandas DataFrame when the column data type is a list
Question
I am having some trouble filtering a pandas dataframe on a column (let's call it column_1) whose data type is a list. Specifically, I want to return only the rows where the intersection of column_1 and another predetermined list is not empty. However, when I try to put the logic inside the arguments of the .where function, I always get errors. Below are my attempts, with the errors returned.
Attempting to test whether a single element is inside the list:
table[element in table['column_1']]
returns the error... KeyError: False
Trying to compare a list to all of the lists in the rows of the dataframe:
table[[349569] == table.column_1]
returns the error: Arrays were different lengths: 23041 vs 1
I'm trying to get these two intermediate steps working before I test the intersection of the two lists.
Thanks for taking the time to read over my problem!
Recommended answer
Consider the pd.Series s
import pandas as pd
s = pd.Series([[1, 2, 3], list('abcd'), [9, 8, 3], ['a', 4]])
print(s)
0       [1, 2, 3]
1    [a, b, c, d]
2       [9, 8, 3]
3          [a, 4]
dtype: object
and the test list test
test = ['b', 3, 4]
Apply a lambda function that converts each element of s to a set and takes its intersection with test
print(s.apply(lambda x: list(set(x).intersection(test))))
0    [3]
1    [b]
2    [3]
3    [4]
dtype: object
To use it as a mask, use bool instead of list
s.apply(lambda x: bool(set(x).intersection(test)))
0    True
1    True
2    True
3    True
dtype: bool
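The answer stops at building the mask on a Series, so here is a minimal sketch of how it might be applied to a whole DataFrame, which is what the question asks for. The DataFrame table, its id column, and the sample values are hypothetical and made up for illustration; only column_1 and test come from the question and answer. The last part shows one possible way to do the single-element membership test from the question, using the same apply pattern.

```python
import pandas as pd

# Hypothetical stand-in for the asker's `table`; the values are invented.
table = pd.DataFrame({
    "id": [10, 11, 12, 13],
    "column_1": [[1, 2, 3], list("abcd"), [9, 8, 7], ["a", 4]],
})
test = ["b", 3, 4]

# True where the row's list shares at least one element with `test`.
mask = table["column_1"].apply(lambda x: bool(set(x).intersection(test)))

# Indexing with the boolean mask keeps only the matching rows.
filtered = table[mask]
print(filtered)

# Single-element membership: evaluate `in` per row rather than on the column.
has_4 = table["column_1"].apply(lambda x: 4 in x)
print(table[has_4])
```

Writing `element in table['column_1']` directly evaluates to a single Python bool rather than a per-row result, which is consistent with the KeyError: False in the question; going through apply gives one boolean per row, which is the shape boolean indexing expects.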