Python regex escape characters
Question
We have:
>>> str
'exit\r\ndrwxr-xr-x 2 root root 0 Jan 1 2000
\x1b[1;34mbin\x1b[0m\r\ndrwxr-xr-x 3 root root
0 Jan 1 2000 \x1b[1;34mlib\x1b[0m\r\ndrwxr-xr-x 10 root
root 0 Jan 1 1970 \x1b[1;34mlocal\x1b[0m\r\ndrwxr-xr-x
2 root root 0 Jan 1 2000 \x1b[1;34msbin\x1b[0m\r\ndrwxr-xr-x
5 root root 0 Jan 1 2000 \x1b[1;34mshare\x1b[0m\r\n# exit\r\n'
>>> print str
exit
drwxr-xr-x 2 root root 0 Jan 1 2000 bin
drwxr-xr-x 3 root root 0 Jan 1 2000 lib
drwxr-xr-x 10 root root 0 Jan 1 1970 local
drwxr-xr-x 2 root root 0 Jan 1 2000 sbin
drwxr-xr-x 5 root root 0 Jan 1 2000 share
# exit
I want to get rid of all the '\xblah[0m' nonsense using a regexp. I've tried:
re.sub(str, r'(\x.*m)', '')
But that hasn't done the trick. Any ideas?
Recommended answer
You have a few problems:
You're passing the arguments to re.sub in the wrong order. It should be:
re.sub(regexp_pattern, replacement, source_string)
The string doesn't contain a literal "\x". The "\x1b" shown in the repr is the escape character (ESC, code 27), and it's a single character.
As interjay pointed out, you want ".*?" rather than ".*", because otherwise it will match everything from the first escape through the last "m".
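The last two points can be checked with a small sketch. It assumes Python 3 (so print is a function, unlike the Python 2 session in the question), and the `sample` string is a made-up one-word excerpt shaped like the entries in the question's output:

```python
import re

# One ANSI-colored word, shaped like the entries in the question's string
# (hypothetical excerpt, not the full original).
sample = '\x1b[1;34mbin\x1b[0m'

# '\x1b' is a single character (ESC, code point 27), not the four
# characters backslash, x, 1, b.
esc_length = len('\x1b')

# Greedy: '.*' runs from the first ESC through the last 'm', swallowing 'bin'.
greedy = re.sub('\x1b.*m', '', sample)

# Non-greedy: '.*?' stops at the first 'm' after each ESC.
lazy = re.sub('\x1b.*?m', '', sample)

print(esc_length)    # 1
print(repr(greedy))  # ''
print(repr(lazy))    # 'bin'
```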
The correct call to re.sub (with the string held in s) is:
print re.sub('\x1b.*?m', '', s)
Alternatively, you could use:
print re.sub('\x1b[^m]*m', '', s)
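For reference, here is a self-contained sketch of both calls, assuming Python 3. The helper name `strip_escapes` and the shortened `raw` string are illustrative and not part of the original answer:

```python
import re

# Shortened version of the string from the question (trimmed for brevity;
# an assumption, not the full original).
raw = ('exit\r\n'
       'drwxr-xr-x 2 root root 0 Jan 1 2000 \x1b[1;34mbin\x1b[0m\r\n'
       'drwxr-xr-x 3 root root 0 Jan 1 2000 \x1b[1;34mlib\x1b[0m\r\n'
       '# exit\r\n')


def strip_escapes(text, pattern='\x1b[^m]*m'):
    """Remove every run that starts at ESC and ends at the next 'm'.

    Hypothetical helper: note the argument order of re.sub is
    (pattern, replacement, source_string).
    """
    return re.sub(pattern, '', text)


cleaned = strip_escapes(raw)

# The non-greedy variant from the answer gives the same result on this input.
same_result = cleaned == strip_escapes(raw, '\x1b.*?m')

print(cleaned)
```

Storing the text in a name other than `str` also avoids shadowing Python's built-in `str` type, which the session in the question does.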