How to filter a pandas DataFrame when the column data type is a list

Problem description

I am having some trouble filtering a pandas dataframe on a column (let's call it column_1) whose data type is a list. Specifically, I want to return only the rows for which the intersection of column_1 and another predetermined list is not empty. However, when I try to put the logic inside the arguments of the .where function, I always get errors. Below are my attempts, with the errors returned.

  • Attempting to test whether a single element is inside the list:

table[element in table['column_1']] returns the error KeyError: False

  • Trying to compare a list to all of the lists in the rows of the dataframe:

table[[349569] == table.column_1] returns the error Arrays were different lengths: 23041 vs 1

I'm trying to get these two intermediate steps working before I test the intersection of the two lists.

Thanks for taking the time to read over my problem!

Recommended answer

Consider the pd.Series s

import pandas as pd

s = pd.Series([[1, 2, 3], list('abcd'), [9, 8, 3], ['a', 4]])
print(s)

0       [1, 2, 3]
1    [a, b, c, d]
2       [9, 8, 3]
3          [a, 4]
dtype: object

and the test list test

test = ['b', 3, 4]

Apply a lambda function that converts each element of s to a set and takes its intersection with test

print(s.apply(lambda x: list(set(x).intersection(test))))

0    [3]
1    [b]
2    [3]
3    [4]
dtype: object

To use it as a mask, use bool instead of list

s.apply(lambda x: bool(set(x).intersection(test)))

0    True
1    True
2    True
3    True
dtype: bool
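
The same idea can be carried over to the dataframe in the question. The following is only a minimal sketch: the table contents and the id column are hypothetical stand-ins (only the name column_1 comes from the question), and set.isdisjoint is swapped in as an alternative to building the intersection and calling bool on it.

```python
import pandas as pd

# Hypothetical data standing in for the asker's table;
# only the column name column_1 comes from the question.
table = pd.DataFrame({
    "id": [0, 1, 2, 3],
    "column_1": [[1, 2, 3], list("abcd"), [9, 8, 3], ["x", 7]],
})
test = ["b", 3, 4]

# Step 1: rows whose list contains one given element.
# A plain `3 in table["column_1"]` checks the index labels, not the lists,
# so the membership test has to run once per row.
has_three = table["column_1"].apply(lambda x: 3 in x)

# Step 2: rows whose list shares at least one element with test.
# isdisjoint returns True when there is no overlap, hence the `not`.
test_set = set(test)
overlaps = table["column_1"].apply(lambda x: not test_set.isdisjoint(x))

# Boolean Series can be used directly as row masks.
print(table[has_three])
print(table[overlaps])
```

Both masks are ordinary boolean Series aligned with the table index, so they can be combined with & and | like any other pandas mask. The list elements must be hashable for the set-based check to work.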
