Splitting a column of strings and counting the number of words with pandas

Views: 58 | Posted: 2020/10/17 1:03:07 | Tags: python, string, pandas, dataframe

Question

id   string   
0    31672;0           
1    31965;0
2    0;78464
3      51462
4    31931;0

Hi, I have the table above. I would like to split the string column on ';' and store the number of resulting pieces in a new column. The final table should look like this:

 id   string   word_count
0    31672;0    2       
1    31965;0    2
2    0;78464    2
3      51462    1
4    31931;0    2

It would be nice if someone knows how to do this with Python.

Recommended answer

Option 1

Use str.split + str.len:

df['word_count'] = df['string'].str.split(';').str.len()
df

     string  word_count
id                     
0   31672;0           2
1   31965;0           2
2   0;78464           2
3     51462           1
4   31931;0           2
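For readers who want to run option 1 end to end, here is a minimal self-contained sketch. The sample data is copied from the question; building the frame with an index named "id" and the intermediate name `parts` are assumptions made for illustration.

```python
import pandas as pd

# Sample data from the question; the index name "id" is an assumption
df = pd.DataFrame(
    {"string": ["31672;0", "31965;0", "0;78464", "51462", "31931;0"]},
    index=pd.Index(range(5), name="id"),
)

# Step 1: split each string on ';' to get a Series of lists
parts = df["string"].str.split(";")

# Step 2: .str.len() returns the length of each list, i.e. the word count
df["word_count"] = parts.str.len()

print(df)
```

Splitting first and measuring second makes the intermediate lists available if they are needed later, at the cost of materializing them in memory.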

Option 2

The clever (efficient, less space-consuming) solution with str.count:

df['word_count'] = df['string'].str.count(';') + 1
df

     string  word_count
id                     
0   31672;0           2
1   31965;0           2
2   0;78464           2
3     51462           1
4   31931;0           2

Caveat: this assigns a word count of 1 even to an empty string. With a ';' separator, option 1 behaves the same way, because splitting '' on ';' yields a one-element list, so empty strings need explicit handling if they should count as 0.
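The empty-string case can be checked directly. The sketch below uses a small made-up Series (not data from the question) and compares both options. With an explicit ';' separator, both appear to return 1 for an empty string, so one possible fix is to zero out those rows afterwards. It also illustrates that `str.count` treats its pattern as a regular expression, so a separator such as '|' would need escaping. The names `s`, `t`, `by_split`, `by_count`, `word_count` and `pipe_count` are hypothetical.

```python
import re
import pandas as pd

# Made-up data with an empty string in the last row
s = pd.Series(["31672;0", "51462", ""])

by_split = s.str.split(";").str.len()   # option 1
by_count = s.str.count(";") + 1         # option 2

# One way to count empty strings as 0 words: overwrite those rows
word_count = by_split.where(s != "", 0)

# str.count takes a regex, so escape separators that are regex metacharacters
t = pd.Series(["a|b|c", "d"])
pipe_count = t.str.count(re.escape("|")) + 1

print(by_split.tolist(), by_count.tolist(), word_count.tolist(), pipe_count.tolist())
```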

If you want each word to occupy its own column, a quick and simple way is to use tolist to load the splits into a new DataFrame, then join that DataFrame to the original with concat:

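# Expand the split pieces into their own columns, reusing df's index so rows align
v = pd.DataFrame(df['string'].str.split(';').tolist(), index=df.index)\
        .rename(columns=lambda x: x + 1)\
        .add_prefix('string_')

# axis must be passed by keyword in recent pandas versions
pd.concat([df, v], axis=1)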

     string  word_count string_1 string_2
id                                       
0   31672;0           2    31672        0
1   31965;0           2    31965        0
2   0;78464           2        0    78464
3     51462           1    51462     None
4   31931;0           2    31931        0
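As an alternative to the tolist approach, `str.split` accepts `expand=True`, which returns the pieces as a DataFrame directly and keeps the original index. This is a different technique from the one in the answer, shown here only as a sketch. The sample frame is rebuilt from the question's data, and the names `v` and `result` are illustrative.

```python
import pandas as pd

# Sample data from the question; the index name "id" is an assumption
df = pd.DataFrame(
    {"string": ["31672;0", "31965;0", "0;78464", "51462", "31931;0"]},
    index=pd.Index(range(5), name="id"),
)

# expand=True yields one column per piece, labelled 0, 1, ...
v = (
    df["string"].str.split(";", expand=True)
    .rename(columns=lambda x: x + 1)  # shift labels to 1-based
    .add_prefix("string_")            # string_1, string_2, ...
)

# join aligns on the shared index
result = df.join(v)
print(result)
```

Rows with fewer pieces than the widest row are padded with missing values, as with the tolist version.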
