How to remove special characters from a column of a dataframe using the re module?
Question
Hey, I have seen that link, but nowhere there have they used the re module, which is why I have posted here. I hope you understand and remove the duplicate.
Here is the link. I want to use the re module.
Table:
A B C D
1 Q! W@ 2
2 1$ E% 3
3 S2# D! 4
Here I want to remove the special characters from columns B and C. I have used .transform(), but I want to do it using re if possible, and I am getting errors.
Output:
A B C D E F
1 Q! W@ 2 Q W
2 1$ E% 3 1 E
3 S2# D! 4 S2 D
My code:
df['E'] = df['B'].str.translate(None, ",!.; -@!%^&*)(")
It works only if I know in advance what the special characters are.
But I want to use re, which would be the best way.
import re
#re.sub(r'\W+', '', your_string)
df['E'] = re.sub(r'\W+', '', df['B'].str)
I get an error here:
TypeError: expected string or buffer
So how should I pass the values to get the correct output?
Recommended answer
As this answer shows, you can use map() with a lambda function that will assemble and return any expression you like:
df['E'] = df['B'].map(lambda x: re.sub(r'\W+', '', x))
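For context, the TypeError in the question comes from passing df['B'].str, which is a pandas string accessor object, to re.sub, which expects a single string. The following is a minimal sketch of the map() approach applied to both columns, assuming the data from the question's table; the names non_word and alt are illustrative and not from the original post. It also shows Series.str.replace with regex=True as one possible vectorized alternative.

```python
import re

import pandas as pd

# Sample data copied from the question's table.
df = pd.DataFrame({
    "A": [1, 2, 3],
    "B": ["Q!", "1$", "S2#"],
    "C": ["W@", "E%", "D!"],
    "D": [2, 3, 4],
})

# Compile the pattern once; \W+ matches runs of non-word characters.
non_word = re.compile(r"\W+")

# re.sub works on one string at a time, so apply it element-wise with map().
df["E"] = df["B"].map(lambda x: non_word.sub("", x))
df["F"] = df["C"].map(lambda x: non_word.sub("", x))

# Alternative: pass the pattern straight to the pandas string method.
alt = df["B"].str.replace(r"\W+", "", regex=True)

print(df)
```

Note that \W treats the underscore as a word character, so underscores are kept, while whitespace is removed. If a column may contain missing or non-string values, the lambda would need to handle them before calling re.sub.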
lambda simply defines anonymous functions. You can leave them anonymous, or assign them to a reference like any other object. my_function = lambda x: x.my_method(3) is equivalent to def my_function(x): return x.my_method(3).
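To make the equivalence concrete for this question, here is a small sketch using only the standard library. The names strip_lambda, strip_def, and values are hypothetical and chosen for illustration; both callables wrap the same re.sub call used in the answer.

```python
import re

# Anonymous function bound to a name, as it would be passed to map().
strip_lambda = lambda x: re.sub(r"\W+", "", x)


# Equivalent named function written with def.
def strip_def(x):
    return re.sub(r"\W+", "", x)


# Values taken from column B of the question's table.
values = ["Q!", "1$", "S2#"]

print([strip_lambda(v) for v in values])
print([strip_def(v) for v in values])
```

Either form can be passed to map(), for example df['B'].map(strip_def); the named version can be easier to reuse across several columns.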