Finding substring in pandas frozenset


Problem description

I'm trying to find a substring in a frozenset, but I'm running out of options.

My data structure is a pandas DataFrame (the output of association_rules from the mlxtend package, if you are familiar with it), and I want to print all the rows where the antecedents column, which holds frozensets, includes a specific string.

Here is what I tried:

    print(rules[rules["antecedents"].str.contains('line', regex=False)])

However, whenever I run it, I get an empty DataFrame.

When I run only the inner expression on the rules["antecedents"] Series, I get False for every entry. Why is that?

Recommended answer

Because the Series.str.* methods are meant for string data only. Your column holds frozensets, not strings, so the match is never run against the text you see when a value is printed. Non-string values typically come back as NaN (or, as in your case, False), regardless of their string representation. To demonstrate:

>>> import numpy as np
>>> import pandas as pd
>>> x = pd.DataFrame(np.random.randn(2, 5)).astype("object")
>>> x
         0         1         2          3          4
0 -1.17191  -1.92926 -0.831576 -0.0814279   0.099612
1 -1.55183 -0.494855   1.14398   -1.72675 -0.0390948
>>> x[0].str.contains("-1")
0   NaN
1   NaN
Name: 0, dtype: float64

What you can do:

Use apply, converting each value to a string first:

>>> x[0].apply(lambda v: "-1" in str(v))
0    True
1    True
Name: 0, dtype: bool

So your code should be written as:

print(rules[rules["antecedents"].apply(lambda x: 'line' in str(x))])

If you want an exact match against an element of the set, use 'line' in x instead of 'line' in str(x).
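To make the difference between the two checks concrete, here is a minimal sketch. The rules DataFrame below is a hypothetical stand-in for the mlxtend output, with made-up items and support values. It also shows a per-item substring test as an alternative to str(x): because str(x) includes the text "frozenset({...})", a search for something like 'set' would match every row, whereas testing each item separately avoids that.

```python
import pandas as pd

# Hypothetical stand-in for the association_rules output (values are invented).
rules = pd.DataFrame({
    "antecedents": [
        frozenset({"line", "bread"}),
        frozenset({"online", "milk"}),
        frozenset({"butter"}),
    ],
    "support": [0.5, 0.3, 0.2],
})

# Substring match: True if 'line' occurs inside any single item of the set.
substring_mask = rules["antecedents"].apply(
    lambda items: any("line" in item for item in items)
)

# Exact match: True only if 'line' itself is a member of the set.
exact_mask = rules["antecedents"].apply(lambda items: "line" in items)

print(rules[substring_mask])  # rows 0 and 1 ('line' and 'online')
print(rules[exact_mask])      # row 0 only
```

Which mask fits depends on whether item names such as 'online' should count as a hit for 'line'.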
