How to remove numbers from string terms in a pandas dataframe
Question
I have a data frame similar to the one below:
Name Volume Value
May21 23 21321
James 12 12311
Adi22 11 4435
Hello 34 32454
Girl90 56 654654
I want the output to be in this format:
Name Volume Value
May 23 21321
James 12 12311
Adi 11 4435
Hello 34 32454
Girl 56 654654
I want to remove all the numbers from the Name column.
The closest I have come is doing it at the cell level with the following code:
result = ''.join([i for i in df['Name'][1] if not i.isdigit()])
Any idea how to do this in a better way at the Series/DataFrame level?
Recommended answer
You can apply str.replace to the Name column in combination with regular expressions:
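import pandas as pd

# Example DataFrame
df = pd.DataFrame.from_dict({'Name': ['May21', 'James', 'Adi22', 'Hello', 'Girl90'],
                             'Volume': [23, 12, 11, 34, 56],
                             'Value': [21321, 12311, 4435, 32454, 654654]})

# Strip every run of digits from the Name column.
# regex=True is passed explicitly because pandas 2.0 and later
# treat the pattern as a literal string by default.
df['Name'] = df['Name'].str.replace(r'\d+', '', regex=True)
print(df)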
Output:
Name Volume Value
0 May 23 21321
1 James 12 12311
2 Adi 11 4435
3 Hello 34 32454
4 Girl 56 654654
In the regular expression, \d stands for "any digit" and + stands for "one or more".
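The difference between \d and \d+ can be seen with Python's built-in re module. This is only a small illustration; the sample string "a12b3" is made up for the purpose and does not come from the question.

```python
import re

sample = "a12b3"  # hypothetical input, chosen to contain two runs of digits

# \d matches one digit at a time, so every digit is a separate match.
single_digits = re.findall(r"\d", sample)

# \d+ matches one or more consecutive digits, so each run is a single match.
digit_runs = re.findall(r"\d+", sample)

# Replacing every match of \d+ with "" strips all digits from the string.
cleaned = re.sub(r"\d+", "", "Girl90")

print(single_digits)  # ['1', '2', '3']
print(digit_runs)     # ['12', '3']
print(cleaned)        # Girl
```

For removing digits, either pattern gives the same final string; \d+ simply handles each run of digits in one substitution.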
Thus, str.replace(r'\d+', '', regex=True) means: "Replace every run of digits in the strings with nothing".
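Whether the pattern is read as a regular expression depends on the regex argument of str.replace. The sketch below, which uses the names from the question as a standalone Series, compares both settings. It also shows a regex-free alternative that applies the question's cell-level comprehension to the whole column through Series.map. The variable names are illustrative only.

```python
import pandas as pd

names = pd.Series(['May21', 'James', 'Adi22', 'Hello', 'Girl90'])

# regex=True: the pattern is interpreted as a regular expression,
# so every run of digits is removed.
as_regex = names.str.replace(r'\d+', '', regex=True)

# regex=False: the pattern is the literal text "\d+", which does not
# occur in any name, so the values come back unchanged.
as_literal = names.str.replace(r'\d+', '', regex=False)

# Regex-free alternative: run the question's character filter on each element.
no_regex = names.map(lambda s: ''.join(ch for ch in s if not ch.isdigit()))

print(as_regex.tolist())
print(as_literal.tolist())
print(no_regex.tolist())
```

The str.replace version is usually the more concise choice; the map version may be easier to follow for readers who are not familiar with regular expressions.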