What is the most efficient way of counting occurrences in pandas?
Question description
I have a large (about 12M rows) dataframe df with say:
df.columns = ['word','documents','frequency']
So the following ran in a timely fashion:
word_grouping = df[['word','frequency']].groupby('word')
MaxFrequency_perWord = word_grouping[['frequency']].max().reset_index()
MaxFrequency_perWord.columns = ['word','MaxFrequency']
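For reference, the setup above can be reproduced end to end on a tiny sample; the five-row frame below is invented purely for illustration, standing in for the real 12M-row data:

```python
import pandas as pd

# Hypothetical sample data standing in for the real 12M-row frame
df = pd.DataFrame({
    'word': ['cat', 'dog', 'cat', 'cat', 'dog'],
    'documents': [1, 2, 3, 4, 5],
    'frequency': [10, 3, 7, 2, 5],
})

# Same steps as in the question: max frequency per word
word_grouping = df[['word', 'frequency']].groupby('word')
MaxFrequency_perWord = word_grouping[['frequency']].max().reset_index()
MaxFrequency_perWord.columns = ['word', 'MaxFrequency']
```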
However, this is taking an unexpectedly long time to run:
Occurrences_of_Words = word_grouping[['word']].count().reset_index()
What am I doing wrong here? Is there a better way to count occurrences in a large dataframe?
df.word.describe()
ran pretty well, so I really did not expect this Occurrences_of_Words dataframe to take very long to build.
ps: If the answer is obvious and you feel the need to penalize me for asking this question, please include the answer as well. Thank you.
Recommended answer
I think df['word'].value_counts() should serve. By skipping the groupby machinery, you'll save some time. I'm not sure why count should be much slower than max. Both take some time to avoid missing values. (Compare with size.)
In any case, value_counts has been specifically optimized to handle object dtype, like your words, so I doubt you'll do much better than that.
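A quick sketch of the alternatives discussed above, on invented sample data. It shows value_counts as the direct replacement, and the count-versus-size distinction: count skips missing values in the counted column, while size counts all rows per group:

```python
import pandas as pd

# Hypothetical sample data; the NaN in 'documents' illustrates count vs size
df = pd.DataFrame({
    'word': ['cat', 'dog', 'cat', 'cat', 'dog'],
    'documents': [1, 2, None, 4, 5],
})

# Direct replacement for the slow groupby-count: no groupby machinery
counts = df['word'].value_counts()

g = df.groupby('word')
# count() excludes missing values in the counted column
by_count = g['documents'].count()
# size() counts every row in each group, NaN or not
by_size = g.size()
```

On this sample, value_counts gives cat → 3 and dog → 2, while by_count reports only 2 for cat (the NaN document is dropped) and by_size reports 3.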