Count appearances of a string throughout columns in pandas


Question

Consider the following DataFrame:

import pandas as pd
df = pd.DataFrame(["What is the answer", 
                   "the answer isn't here, but the answer is 42" , 
                   "dogs are nice", 
                   "How are you"], columns=['words'])
df
                                         words
0                           What is the answer
1  the answer isn't here, but the answer is 42
2                                dogs are nice
3                                  How are you

I want to count the number of times a certain string appears. The string may occur several times within a single row.

For example, I want to count the number of times "the answer" appears. I tried:

df.words.str.contains(r'the answer').count()

I hoped this would work, but the output is 4, and I don't understand why. "the answer" appears 3 times:

What is **the answer**
**the answer** isn't here, but **the answer** is 42

Note: the search string may appear more than once in a row.

Recommended answer

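You need str.count, which counts the matches within each row. str.contains only reports whether each row matches, and calling .count() on its result counts the non-null rows, which is why the attempt returned 4.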

In [5285]: df.words.str.count("the answer").sum()
Out[5285]: 3

In [5286]: df.words.str.count("the answer")
Out[5286]:
0    1
1    2
2    0
3    0
Name: words, dtype: int64
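
The difference between the two approaches can be checked side by side. The following is a minimal sketch rather than part of the original answer. The variable names (mask, rows_total, rows_matching, per_row, total) are made up for illustration, and the use of re.escape is an added assumption: str.count interprets its pattern as a regular expression, so escaping matters when the search text contains characters such as "." or "?".

```python
import re
import pandas as pd

df = pd.DataFrame(
    ["What is the answer",
     "the answer isn't here, but the answer is 42",
     "dogs are nice",
     "How are you"],
    columns=["words"],
)

# str.contains yields one boolean per row (True if the row matches at all).
mask = df["words"].str.contains("the answer")
rows_total = mask.count()     # counts non-null rows, not matches
rows_matching = mask.sum()    # rows that contain the phrase at least once

# str.count yields the number of matches in each row.
# The pattern is treated as a regex, so literal text is escaped here
# (an illustrative precaution; "the answer" has no special characters).
per_row = df["words"].str.count(re.escape("the answer"))
total = per_row.sum()         # total occurrences across the column

print(rows_total, rows_matching, total)
```

Summing the boolean mask gives the number of rows that contain the phrase, while summing the str.count result gives the total number of occurrences, which is what the question asks for.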
