Python URL matching (Regex)


Question

I've been trying to match the URLs below for a couple of hours and can't figure it out, though I'm quite sure it's not that difficult:

The URL can be:

/course/lesson-one/

or it can be:

/course/lesson-one/chapter-one/

What I have is the following, which matches the second URL:

/course/([a-zA-Z]+[-a-zA-Z]*)/([a-zA-Z]+[-a-zA-Z]*)/

What I want is for the second part to be optional, but I can't figure it out. The closest I got was the following:

/course/([a-zA-Z]+[-a-zA-Z]*)/*([a-zA-Z]+[-a-zA-Z]*)/

But for some reason the above leaves out the last letter of the word. For example, if the URL is

/course/computers/

I end up with the string 'computer'.

Recommended answer

Use ? when you need optional parts.

/course/([a-zA-Z][-a-zA-Z]*)/([a-zA-Z][-a-zA-Z]*/)?
#                                                 ^

(Note that [a-zA-Z]+[-a-zA-Z]* is equivalent to [a-zA-Z][-a-zA-Z]*.)
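The equivalence can be spot-checked in Python. This is only a sketch over a handful of hand-picked sample strings (the names with_plus, without_plus and samples are made up for illustration); it is a sanity check, not a proof.

```python
import re

# The two forms the note claims are equivalent.
with_plus = re.compile(r"[a-zA-Z]+[-a-zA-Z]*")
without_plus = re.compile(r"[a-zA-Z][-a-zA-Z]*")

# Hand-picked samples (an assumption of this sketch, not from the answer).
samples = ["lesson-one", "a", "abc", "a-b-c", "abc-", "-abc", "", "9abc"]

# For each sample, record whether each pattern accepts the whole string.
results = [
    (s, bool(with_plus.fullmatch(s)), bool(without_plus.fullmatch(s)))
    for s in samples
]

for s, a, b in results:
    print(repr(s), a, b)
```

Both patterns require one leading letter followed by any mix of letters and hyphens, so the two columns should agree on every row.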

Use an additional non-capturing group (?:...) to exclude the / from the captured text, while still allowing several elements (here the segment and its trailing /) to be optional at once:

/course/([a-zA-Z][-a-zA-Z]*)/(?:([a-zA-Z][-a-zA-Z]*)/)?
#                            ~~~                     ~^
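As a rough illustration of how the two patterns above behave in Python, here is a minimal sketch. The helper name parse_course_url and the constant names are hypothetical, and using re.fullmatch (rather than whatever matching a URL router performs) is an assumption of this example.

```python
import re

# First variant from the answer: the trailing slash is inside the capture group.
WITH_SLASH = re.compile(r"/course/([a-zA-Z][-a-zA-Z]*)/([a-zA-Z][-a-zA-Z]*/)?")

# Second variant: a non-capturing group makes "segment + slash" optional,
# while the inner group captures only the segment.
COURSE_URL = re.compile(r"/course/([a-zA-Z][-a-zA-Z]*)/(?:([a-zA-Z][-a-zA-Z]*)/)?")


def parse_course_url(path):
    """Hypothetical helper: return (lesson, chapter) or None if path does not match."""
    m = COURSE_URL.fullmatch(path)
    if m is None:
        return None
    # group(2) is None when the optional chapter segment is absent.
    return m.group(1), m.group(2)


print(parse_course_url("/course/lesson-one/"))
print(parse_course_url("/course/lesson-one/chapter-one/"))
print(parse_course_url("/course/computers/"))

# With the first variant, the captured chapter keeps its trailing slash.
print(WITH_SLASH.fullmatch("/course/lesson-one/chapter-one/").group(2))
```

When the chapter segment is missing, the second group is None instead of an empty string, so callers should check for None.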


Your second regex swallows the last character because:

  /course/([a-zA-Z]+[-a-zA-Z]*)/*([a-zA-Z]+[-a-zA-Z]*)/
          ^^^^^^^^^^^^^^^^^^^^^  ~~~~~~~~~~~~~~~~~~~~~
        this matches 'computer'  and this matches the 's'.

Because of the +, the second group in this regex must match at least one letter. When the URL has only one segment, the engine backtracks and the first group gives up its last letter so the second group has something to match, which is why the 's' ends up there.
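The behaviour can be reproduced with a short Python sketch. The variable names are made up for illustration, and re.match is used here as an assumption about how the pattern is applied.

```python
import re

# The asker's second attempt: "/*" allows zero or more slashes,
# but the second group still requires at least one letter.
broken = re.compile(r"/course/([a-zA-Z]+[-a-zA-Z]*)/*([a-zA-Z]+[-a-zA-Z]*)/")

one_segment = broken.match("/course/computers/")
two_segments = broken.match("/course/lesson-one/chapter-one/")

# Group 1 backtracks by one letter so that group 2 can match 's'.
print(one_segment.groups())   # ('computer', 's')

# With two segments there is nothing to steal, so both groups look right.
print(two_segments.groups())  # ('lesson-one', 'chapter-one')
```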
