Finding a substring in a pandas frozenset
Question
I'm trying to find a substring in a frozenset, but I'm running out of options.
My data structure is a pandas DataFrame (it comes from association_rules in the mlxtend package, if you are familiar with that one), and I want to print all the rows where the antecedents column (whose values are frozensets) includes a specific string.
Here is what I tried:
print(rules[rules["antecedents"].str.contains('line', regex=False)])
However, whenever I run it, I get an empty DataFrame.
When I run only the inner expression on the rules["antecedents"] Series, I get only False values for all entries. Why is that?
Recommended answer
Because the Series.str.* functions are for string data only. Since your values are not strings, the match is never made against their string representation, so you will not get a hit no matter what that representation looks like (for non-string values the result is typically NaN). To prove it:
>>> import numpy as np
>>> import pandas as pd
>>> x = pd.DataFrame(np.random.randn(2, 5)).astype("object")
>>> x
         0         1         2          3          4
0 -1.17191  -1.92926 -0.831576 -0.0814279   0.099612
1 -1.55183 -0.494855   1.14398   -1.72675 -0.0390948
>>> x[0].str.contains("-1")
0   NaN
1   NaN
Name: 0, dtype: float64
What you can do is use apply:
>>> x[0].apply(lambda x: "-1" in str(x))
0 True
1 True
Name: 0, dtype: bool
So your code should be written as:
print(rules[rules["antecedents"].apply(lambda x: 'line' in str(x))])
If you mean an exact match on an element, you might want to use 'line' in x instead.
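To make the difference between the two checks concrete, here is a minimal sketch. The rules DataFrame below is a hypothetical stand-in with made-up items, not real association_rules output, and the variable names substring_mask and exact_mask are only illustrative. Instead of searching str(x), the substring check here looks inside each item of the frozenset, which is one possible variation on the approach above.

```python
import pandas as pd

# Hypothetical stand-in for the association_rules output; the items are made up.
rules = pd.DataFrame({
    "antecedents": [
        frozenset({"online", "bread"}),
        frozenset({"line"}),
        frozenset({"milk"}),
    ],
    "consequents": [
        frozenset({"milk"}),
        frozenset({"bread"}),
        frozenset({"line"}),
    ],
})

# Substring match: True if any item in the frozenset contains "line".
substring_mask = rules["antecedents"].apply(
    lambda items: any("line" in item for item in items)
)

# Exact match: True only if "line" itself is a member of the frozenset.
exact_mask = rules["antecedents"].apply(lambda items: "line" in items)

print(rules[substring_mask])  # rows 0 and 1
print(rules[exact_mask])      # row 1 only
```

One thing to keep in mind with the str(x) version: the string representation of a frozenset also contains the word "frozenset" and the surrounding punctuation, so a search term such as 'set' would match every row. Checking the items themselves avoids that.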