Python regex escape characters


Problem description

We have:

>>> str
'exit\r\ndrwxr-xr-x    2 root     root            0 Jan  1  2000 
\x1b[1;34mbin\x1b[0m\r\ndrwxr-xr-x    3 root     root           
0 Jan  1  2000 \x1b[1;34mlib\x1b[0m\r\ndrwxr-xr-x   10 root     
root            0 Jan  1  1970 \x1b[1;34mlocal\x1b[0m\r\ndrwxr-xr-x    
2 root     root            0 Jan  1  2000 \x1b[1;34msbin\x1b[0m\r\ndrwxr-xr-x    
5 root     root            0 Jan  1  2000 \x1b[1;34mshare\x1b[0m\r\n# exit\r\n'

>>> print str
exit
drwxr-xr-x    2 root     root            0 Jan  1  2000 bin
drwxr-xr-x    3 root     root            0 Jan  1  2000 lib
drwxr-xr-x   10 root     root            0 Jan  1  1970 local
drwxr-xr-x    2 root     root            0 Jan  1  2000 sbin
drwxr-xr-x    5 root     root            0 Jan  1  2000 share
# exit

I want to get rid of all the '\xblah[0m' nonsense using regexp. I've tried

re.sub(str, r'(\x.*m)', '')

But that hasn't done the trick. Any ideas?

Recommended answer

You have a few problems:


  • You're passing arguments to re.sub in the wrong order. It should be:

re.sub(regexp_pattern, replacement, source_string)

  • The string doesn't contain "\x". The "\x1b" shown in its repr is the escape character (ESC, ASCII 27), and it's a single character.

  • As interjay pointed out, you want ".*?" rather than ".*", because otherwise it will match everything from the first escape through the last "m" in the string.
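The last two points can be checked directly. The snippet below is a minimal sketch, not part of the original answer: the names ESC and sample are made up for illustration, and print() is written with parentheses so it also runs on Python 3 (the question itself uses the Python 2 print statement).

```python
import re

# '\x1b' in a string literal is one character (ESC, code point 27),
# not a backslash followed by "x1b".
ESC = '\x1b'
print(len(ESC))
print(ord(ESC))

# Illustrative input: one colored word surrounded by plain text.
sample = 'a \x1b[1;34mbin\x1b[0m b'

# Greedy: runs from the first ESC to the LAST "m", so "bin" is lost too.
greedy = re.sub('\x1b.*m', '', sample)

# Non-greedy: each match stops at the first "m" after an ESC.
lazy = re.sub('\x1b.*?m', '', sample)

print(repr(greedy))
print(repr(lazy))
```

With this input the greedy version leaves only 'a  b', while the non-greedy version leaves 'a bin b'.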

The correct call to re.sub is:

print re.sub('\x1b.*?m', '', s)

Alternatively, you could use:

print re.sub('\x1b[^m]*m', '', s)
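For reference, here is a rough end-to-end sketch that applies the second pattern to a shortened copy of the listing from the question. The helper name strip_color_codes and the variable names are hypothetical. Like both patterns above, it assumes every escape sequence in the text ends in "m", which holds for the color codes shown here but not for terminal escape sequences in general.

```python
import re

# Second pattern from the answer: ESC, then anything that is not "m", then "m".
COLOR_CODE = re.compile('\x1b[^m]*m')


def strip_color_codes(text):
    """Remove ESC...m sequences from text (hypothetical helper)."""
    # Argument order for sub: replacement first, then the source string.
    return COLOR_CODE.sub('', text)


# Shortened version of the string in the question.
listing = (
    'exit\r\n'
    'drwxr-xr-x    2 root     root            0 Jan  1  2000 \x1b[1;34mbin\x1b[0m\r\n'
    'drwxr-xr-x    3 root     root            0 Jan  1  2000 \x1b[1;34mlib\x1b[0m\r\n'
    '# exit\r\n'
)

cleaned = strip_color_codes(listing)
print(cleaned)
```

Compiling the pattern once is only a convenience when the helper is called repeatedly; calling re.sub directly, as in the answer, gives the same result.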
