What is the most efficient way of counting occurrences in pandas?
Question
I have a large (about 12M rows) dataframe df with, say:
df.columns = ['word','documents','frequency']
So the following ran in a timely fashion:
word_grouping = df[['word','frequency']].groupby('word')
MaxFrequency_perWord = word_grouping[['frequency']].max().reset_index()
MaxFrequency_perWord.columns = ['word','MaxFrequency']
However, this is taking an unexpectedly long time to run:
Occurrences_of_Words = word_grouping[['word']].count().reset_index()
What am I doing wrong here? Is there a better way to count occurrences in a large dataframe?
df.word.describe()
ran pretty well, so I really did not expect this Occurrences_of_Words dataframe to take very long to build.
P.S.: If the answer is obvious and you feel the need to penalize me for asking this question, please include the answer as well. Thank you.
Answer
I think df['word'].value_counts() should serve. By skipping the groupby machinery, you'll save some time. I'm not sure why count should be much slower than max. Both take some time to avoid missing values. (Compare with size.)
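As a quick illustration of that count-versus-size difference, here is a minimal sketch on a toy frame I made up (not the poster's data):

import pandas as pd
import numpy as np

# Toy frame with one missing frequency value to show the difference
toy = pd.DataFrame({'word': ['a', 'a', 'b', 'b'],
                    'frequency': [1.0, np.nan, 3.0, 4.0]})
g = toy.groupby('word')
print(g['frequency'].count())  # non-NaN values per group: a -> 1, b -> 2
print(g.size())                # rows per group, NaN included: a -> 2, b -> 2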
In any case, value_counts has been specifically optimized to handle object dtype, like your words, so I doubt you'll do much better than that.
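For what it's worth, a minimal sketch of the value_counts route on a frame shaped like the one in the question (the toy data and the 'Occurrences' column name are my own choices, not from the original post):

import pandas as pd

# Small stand-in for the ~12M-row df described in the question
df = pd.DataFrame({'word':      ['the', 'cat', 'the', 'dog', 'the', 'cat'],
                   'documents': [1, 1, 2, 2, 3, 3],
                   'frequency': [5, 2, 3, 1, 4, 2]})

# One pass over the column, no groupby machinery
Occurrences_of_Words = df['word'].value_counts().reset_index()
Occurrences_of_Words.columns = ['word', 'Occurrences']
print(Occurrences_of_Words)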