How to get all the unique words in the data frame?

Views: 47 | Posted: 2020/10/10 20:14:01 | Tags: python, pandas, dataframe, count


Problem description

I have a dataframe with a list of products and their respective reviews:

+-----------+------------------------------------------+
| product   | review                                   |
+-----------+------------------------------------------+
| product_a | The casual lunch was good                |
+-----------+------------------------------------------+
| product_b | Avery is one of the best-known baristas  |
+-----------+------------------------------------------+
| product_c | The secrets the tour guide told us       |
+-----------+------------------------------------------+

How can I get all the unique words in the data frame?

I wrote a function:

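from collections import Counter

def count_words(text):
    """Count the lowercase, whitespace-separated words in one review."""
    try:
        words = text.lower().split()
        counts = Counter(words)
    except AttributeError:
        # Non-string values such as NaN have no .lower(); use a placeholder count.
        counts = {'': 0}
    return counts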

I applied the function to the DataFrame, but that only gives me the word counts for each row:

reviews['words_count'] = reviews['review'].apply(count_words)

Recommended answer

dfx
               review
0      United Kingdom
1  The United Kingdom
2     Dublin, Ireland
3    Mardan, Pakistan
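To reproduce the steps that follow, a frame like `dfx` can be built by hand. This is only a sketch: the answer shows the printed frame but not how it was created, so the constructor call below is an assumption.

```python
import pandas as pd

# Sample data matching the frame printed above (construction is assumed).
dfx = pd.DataFrame({
    "review": [
        "United Kingdom",
        "The United Kingdom",
        "Dublin, Ireland",
        "Mardan, Pakistan",
    ]
})
print(dfx)
```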

To get all words in the "review" column:

 list(dfx['review'].str.split(' ', expand=True).stack().unique())

   ['United', 'Kingdom', 'The', 'Dublin,', 'Ireland', 'Mardan,', 'Pakistan']
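Note that splitting on spaces keeps punctuation and case, so 'Dublin,' and 'The' are treated as distinct tokens. If that is not wanted, one possible variant (not part of the original answer) lowercases the text and extracts runs of word characters with a regular expression. The pattern `\w+` is an assumption about what counts as a word here and may need adjusting for hyphens or apostrophes.

```python
import pandas as pd

dfx = pd.DataFrame({
    "review": [
        "United Kingdom",
        "The United Kingdom",
        "Dublin, Ireland",
        "Mardan, Pakistan",
    ]
})

# Lowercase each review, then collect runs of word characters per row,
# so "Dublin," yields the token "dublin".
tokens = dfx["review"].str.lower().str.findall(r"\w+").explode()

# dropna() discards rows that were missing or produced no tokens.
unique_words = tokens.dropna().unique().tolist()
print(unique_words)
```

`Series.explode` turns each per-row list into one row per token, which plays the same role as `split(..., expand=True).stack()` without building a wide intermediate frame.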

To get the word counts for the "review" column:

dfx['review'].str.split(' ', expand=True).stack().value_counts()


United      2
Kingdom     2
Mardan,     1
The         1
Ireland     1
Dublin,     1
Pakistan    1
dtype: int64
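If the per-row `Counter` approach from the question is kept, the row counters can also be folded into a single total. The following is a minimal sketch under stated assumptions: the `reviews` data is made up for illustration, and `count_words` is a simplified variant of the asker's function that returns an empty `Counter` for non-string values.

```python
from collections import Counter

import pandas as pd

# Hypothetical sample data; the None row stands in for a missing review.
reviews = pd.DataFrame({
    "product": ["product_a", "product_b", "product_c"],
    "review": ["The lunch was good", "Good coffee and good service", None],
})

def count_words(text):
    """Return a Counter of lowercase words; non-string values count as empty."""
    if not isinstance(text, str):
        return Counter()
    return Counter(text.lower().split())

# Per-row counts, as in the question.
reviews["words_count"] = reviews["review"].apply(count_words)

# Counter.update adds counts together, so folding over the rows
# gives totals for the whole column.
total = Counter()
for row_counts in reviews["words_count"]:
    total.update(row_counts)

unique_words = sorted(total)
print(unique_words)
print(total.most_common(3))
```

The keys of `total` are the unique words, and `most_common` gives the same kind of ranking as `value_counts()` above.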
