python/pandas: using regular expressions to remove anything in square brackets in a string

Problem description

I'm working with a pandas DataFrame and trying to sanitize a column, turning values like $12,342 into 12342 so that the column can be converted to int or float. One row, however, contains 736[4], so I need to remove everything within the square brackets, brackets included.

Code so far

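# Strip literal "$", "," and " " characters; regex=False keeps "$" from being read as a regex anchor
df2['Average Monthly Wage $'] = df2['Average Monthly Wage $'].str.replace('$', '', regex=False)
df2['Average Monthly Wage $'] = df2['Average Monthly Wage $'].str.replace(',', '', regex=False)
df2['Average Monthly Wage $'] = df2['Average Monthly Wage $'].str.replace(' ', '', regex=False)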

The line below is supposed to remove the square brackets together with their contents.

df2['Average Monthly Wage $'] = df2['Average Monthly Wage $'].str.replace(r'[[^]]*\)','')

To some devs this is trivial, but I haven't used regular expressions often enough to know how to do it. I looked around and formulated the line above from one such Stack Overflow example.

Recommended answer

I think you need:

import pandas as pd

df2 = pd.DataFrame({'Average Monthly Wage $': ['736[4]','7336[445]', '[4]345[5]']})
print(df2)
  Average Monthly Wage $
0                 736[4]
1              7336[445]
2              [4]345[5]

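# regex=True is required on pandas 2.0+, where str.replace treats the pattern as a literal by default
df2['Average Monthly Wage $'] = df2['Average Monthly Wage $'].str.replace(r'\[.*?\]', '', regex=True)
print(df2)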
  Average Monthly Wage $
0                    736
1                   7336
2                    345
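
The `?` in `\[.*?\]` makes the match lazy, which matters for rows with more than one bracketed group, such as `[4]345[5]`. The following is a minimal sketch using only the standard `re` module (the names `samples`, `lazy`, and `greedy` are illustrative, not from the original answer) that compares the lazy pattern with its greedy counterpart:

```python
import re

# Sample values taken from the answer's DataFrame.
samples = ["736[4]", "7336[445]", "[4]345[5]"]

lazy = re.compile(r"\[.*?\]")    # stops at the first closing bracket
greedy = re.compile(r"\[.*\]")   # runs on to the last closing bracket

lazy_result = [lazy.sub("", s) for s in samples]
greedy_result = [greedy.sub("", s) for s in samples]

print(lazy_result)    # ['736', '7336', '345']
print(greedy_result)  # ['736', '7336', '']
```

With the greedy pattern, the third value loses its digits because the match spans from the first `[` to the last `]`.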

The pattern can be tested on regex101.
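
To reach the int column the question asks for, the bracket removal can be combined with the earlier cleanup steps. The sketch below is one possible way to do that in plain Python; `clean_wage` is a hypothetical helper, not something from the question or the answer, and it assumes every value is a string that contains only digits once the noise is removed.

```python
import re

# Hypothetical helper for illustration; not part of the original answer.
_BRACKETED = re.compile(r"\[.*?\]")   # footnote markers such as [4]
_NOISE = re.compile(r"[$,\s]")        # dollar signs, commas, whitespace


def clean_wage(text: str) -> int:
    """Turn a string like '$12,342' or '736[4]' into an int."""
    without_notes = _BRACKETED.sub("", text)   # drop brackets and their contents first
    digits = _NOISE.sub("", without_notes)     # then drop currency formatting
    return int(digits)                         # raises ValueError if nothing valid remains


raw = ["$12,342", "736[4]", " $1,050 [12]"]
cleaned = [clean_wage(value) for value in raw]
print(cleaned)  # [12342, 736, 1050]
```

In pandas, such a helper could be applied element by element with `df2['Average Monthly Wage $'].map(clean_wage)`, assuming the column has no missing values. Inside the character class `[$,\s]` the `$` is a literal dollar sign rather than an end-of-string anchor.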
