Adding a DataFrame column with len() of another column's values

Question

I'm trying to add a column holding the character count of the string values in another column, and I haven't figured out how to do it efficiently. This is what I have so far:

# Slow: writes one cell at a time (assumes the default integer index)
for index in range(len(df)):
    df.loc[index, 'char_length'] = len(df.loc[index, 'string'])

This apparently involves first creating a column of nulls and then overwriting it, and it takes a really long time on my data set. So what's the most efficient way of getting something like this:

'string'     'char_length'
abcd          4
abcde         5

I've looked around quite a bit, but I haven't been able to figure it out.

Recommended answer

Pandas has a vectorised string method for this: str.len(). To create the new column you can write:

df['char_length'] = df['string'].str.len()

For example:

>>> df
  string
0   abcd
1  abcde

>>> df['char_length'] = df['string'].str.len()
>>> df
  string  char_length
0   abcd            4
1  abcde            5

This should be considerably faster than looping over the DataFrame with a Python for loop.
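If you want to convince yourself that the two approaches give the same result, here is a minimal sketch. The sample data is made up for illustration; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data; the column names follow the question.
df = pd.DataFrame({"string": ["abcd", "abcde", "xyz"]})

# Vectorised version from the answer.
df["char_length"] = df["string"].str.len()

# Plain Python equivalent, built in one pass rather than by row-by-row assignment.
looped_lengths = [len(value) for value in df["string"]]

# Both approaches should agree on every row.
assert df["char_length"].tolist() == looped_lengths
print(df)
```

On a real data set you could wrap each version in timeit to measure the difference yourself.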

Many other familiar Python string methods are available in Pandas through the same .str accessor. For example, lower converts to lowercase, count counts occurrences of a particular substring, and replace swaps one substring for another.
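As a rough illustration of those three methods, here is a short sketch on a made-up Series (the values and variable names are only for demonstration). Note that str.count treats its argument as a regular expression, and regex=False is passed to str.replace so the pattern is matched literally.

```python
import pandas as pd

# Hypothetical sample values, chosen to show case sensitivity.
s = pd.Series(["Abcd", "abcDE"])

# Convert every value to lowercase.
lowered = s.str.lower()

# Count occurrences of "b" in each value (the pattern is a regex).
b_counts = s.str.count("b")

# Replace the literal substring "cd"; "abcDE" is unchanged because the match is case-sensitive.
replaced = s.str.replace("cd", "X", regex=False)

print(lowered.tolist(), b_counts.tolist(), replaced.tolist())
```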
