Unexpected end of regular expression


Problem description

I want to get only the file name with extension from the path:

C:\\Users\\anandada\\workspace\\MyTestProject\\src\\OpenTest.c

The statement below,

fileName = re.match("[^\\]*.c$", fileName)

gives the error:

unexpected end of regular expression

I am using Python 3.3.2.

Solution

You need to double the doubled escapes again, or use a raw string instead:

fileName = re.match("[^\\\\]*.c$",fileName)

or

fileName = re.match(r"[^\\]*.c$",fileName)

Otherwise the backslashes are interpreted twice, first by Python and then by the regular expression compiler. The pattern that reaches the compiler then contains \] (an escaped ]), so the character class is never closed:

>>> print("[^\\]*.c$")
[^\]*.c$

Also see the Backslash Plague section of the Python Regex HOWTO.
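To make the two layers of escaping concrete, here is a minimal sketch that checks which spellings of the pattern compile. The helper name compiles is made up for this illustration, and the exact error message depends on the Python version (3.3 reports "unexpected end of regular expression"; newer releases word it differently).

```python
import re

# After Python parses this non-raw literal, only ONE backslash is left.
pattern_text = "[^\\]*.c$"
print(pattern_text)       # the text the regex compiler actually receives
print(len(pattern_text))  # 8 characters: [ ^ \ ] * . c $


def compiles(text):
    """Hypothetical helper: return True if `text` is a valid regular expression."""
    try:
        re.compile(text)
        return True
    except re.error:
        return False


broken = compiles("[^\\]*.c$")     # regex sees \] so the class never closes
doubled = compiles("[^\\\\]*.c$")  # regex sees \\ which is a literal backslash
raw = compiles(r"[^\\]*.c$")       # raw string: same text as the doubled form

print(broken, doubled, raw)
```

The doubled form and the raw-string form are the same string once Python has parsed them, so they behave identically in the re module.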

Next, you need to be aware that re.match anchors to the start of the string. You'll probably want to use re.search() instead here. See the match() vs. search() section:

The match() function only checks if the RE matches at the beginning of the string while search() will scan forward through the string for a match. It’s important to keep this distinction in mind.
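A short sketch of that difference, using the path from the question and the raw-string pattern as it stands at this point in the answer. The variable names anchored and scanned are only for illustration.

```python
import re

path = 'C:\\Users\\anandada\\workspace\\MyTestProject\\src\\OpenTest.c'
pattern = r"[^\\]*.c$"

# match() only tries position 0, where the text starts with "C:\Users...".
anchored = re.match(pattern, path)

# search() tries every starting position until one succeeds.
scanned = re.search(pattern, path)

print(anchored)          # no match object, because the match cannot start at index 0
print(scanned.group())   # the part after the last backslash
```

With match(), the pattern would need a leading .* (or similar) to consume the directories before the file name; search() avoids that.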

You may also want to escape the . in the .c part; . matches any character, so foobaric would also match; the i would satisfy the . pattern.
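The effect of the unescaped dot can be checked directly. This sketch compares the two patterns on the name from the question and on foobaric; the names loose and strict are chosen here for illustration.

```python
import re

loose = re.compile(r"[^\\]*.c$")    # "." matches any character except a newline
strict = re.compile(r"[^\\]*\.c$")  # "\." matches only a literal dot

for name in ("OpenTest.c", "foobaric"):
    # Show whether each pattern accepts the name.
    print(name, bool(loose.search(name)), bool(strict.search(name)))
```

Only the strict pattern rejects foobaric, because it requires a real ".c" at the end of the string.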

The re.match() and re.search() functions return a match object, not the matched part of the string. You'll have to extract that part explicitly:

fileName = re.search(r'[^\\]*\.c$', fileName).group()

Demo:

>>> import re
>>> fileName = 'C:\\Users\\anandada\\workspace\\MyTestProject\\src\\OpenTest.c'
>>> re.search(r'[^\\]*\.c$', fileName).group()
'OpenTest.c'
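One caveat with calling .group() directly: re.search() returns None when nothing matches, and None has no .group() method. A minimal sketch of a guarded version follows; the function name c_file_name and the second path are made up for this example.

```python
import re


def c_file_name(path):
    """Return the trailing '<name>.c' part of a backslash-separated path, or None."""
    found = re.search(r"[^\\]*\.c$", path)
    # Only call .group() when a match object was actually returned.
    return found.group() if found else None


print(c_file_name('C:\\Users\\anandada\\workspace\\MyTestProject\\src\\OpenTest.c'))
print(c_file_name('C:\\Users\\anandada\\notes.txt'))  # hypothetical non-.c path
```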
