Count the 100 most frequent words from sentences in a Pandas DataFrame
Question
I have text reviews in one column of a Pandas DataFrame, and I want to find the N most frequent words along with their frequency counts (across the whole column, NOT within a single cell). One approach is to count the words with a counter by iterating through each row. Is there a better alternative?
Representative data:
0 a heartening tale of small victories and endu
1 no sophomore slump for director sam mendes w
2 if you are an actor who can relate to the sea
3 it's this memory-as-identity obviation that g
4 boyd's screenplay ( co-written with guardian
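For reference, the row-by-row approach mentioned in the question might look like the sketch below. The column name "text" and the two sample strings are assumptions made for illustration; they are not the asker's actual data.

```python
from collections import Counter

import pandas as pd

# Hypothetical stand-in for the reviews column; "text" is an assumed name.
df = pd.DataFrame({"text": ["a heartening tale of small victories",
                            "a tale of two cities"]})

# Update one shared Counter with the whitespace-separated words of each row.
counts = Counter()
for review in df["text"]:
    counts.update(review.split())

# The N most frequent words with their counts (N = 3 here).
top_n = counts.most_common(3)
```

This works, but it loops over the rows in Python, which is what the question hopes to avoid.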
Recommended answer
from collections import Counter
Counter(" ".join(df["text"]).split()).most_common(100)
I'm pretty sure this would give you what you want. It joins every row of the column into one string, splits it on whitespace, and counts the resulting tokens (you might have to remove some non-words from the counter result before calling most_common).
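One way to handle the non-words is to filter tokens with a regular expression before counting. The sketch below is only an illustration: the sample rows, the lowercasing step, and the token pattern (letters with optional internal apostrophes or hyphens) are assumptions, and what counts as a "word" depends on your data. A pandas-only variant using `str.findall`, `explode`, and `value_counts` is included for comparison.

```python
import re
from collections import Counter

import pandas as pd

# Hypothetical sample rows; the column name "text" follows the answer above.
df = pd.DataFrame({"text": ["it's this memory-as-identity obviation",
                            "boyd's screenplay ( co-written with guardian",
                            "no sophomore slump , it's fine"]})

# Assumed definition of a word: runs of letters, optionally joined by
# internal apostrophes or hyphens. Stray punctuation such as "(" is dropped.
token_pattern = re.compile(r"[a-z]+(?:['-][a-z]+)*")

# Same idea as the answer: join the whole column, then tokenize and count.
words = token_pattern.findall(" ".join(df["text"]).lower())
top_words = Counter(words).most_common(100)

# Pandas-only variant: one list of tokens per row, flattened with explode().
top_series = (
    df["text"]
    .str.lower()
    .str.findall(token_pattern.pattern)
    .explode()
    .value_counts()
    .head(100)
)
```

Both variants produce the same counts here; `top_words` is a list of (word, count) tuples, while `top_series` is a Series indexed by word.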