Can I use a python regex for letters, dashes and underscores?


Question


I want to handle geographic names, e.g. /new_york or /new-york. Since new-york is what Django's slugify produces for "New York", maybe I should use the slugified names even if names with underscores look better, because I may want to automate URL creation with an algorithm such as Django's slugify. My guess is that ([A-Za-z]+) or simply ([\w-]+) can work, but to be safe I am asking which regex is the best choice in this case. I already have a regex that maps numeric IDs to a handler class:

('/([0-9]*)', ById)  # fetches and displays an entity by id

Now I want another regex to match names, e.g. new_york, so that a request for /new_york gets handled by the appropriate handler. Basically it would be the negation of the regex above, or any combination of letters and underscores, and maybe a dash, since the names are geographical. It seems I could use the following regex, but I believe it works only because of precedence, in that it simply takes everything:

('/(.*)', ByName)  # handles e.g. /new_york or /sao_paulo entities through a custom mapping for my relevant places

Since I have other request handlers and I don't want conflicting regexes, could you recommend how to formulate the regex?

How does it work when a path matches two regexes? Which one has higher precedence? Can you also tell me more about how I should learn to write regexes, and about possible implementations for the geographical datastore (as entities or as instance variables)? There are also special problems, such as geographic locations that have different names in different languages: Germany is called Deutschland in German, so I also want to apply translations, which I can do with gettext / django.po files.

Solution

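The first match wins: the patterns are tried in the order they are listed, and the first one that matches the path handles the request.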

Usually your URLs will differ in other parts of the path. For example, you might have

/cities/(?P<city>[^/]+)
/users/(?P<user>[^/]+)

In many cases [^/]+ is a good regex because it matches anything except /, which you would normally want to exclude because it separates path elements.
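As a rough illustration of how [^/]+ behaves, here is a minimal sketch using Python's re module directly. The names CITY_ROUTE and match_city are made up for this example, and the use of fullmatch is an assumption standing in for however your framework anchors its patterns.

```python
import re

# Hypothetical pattern mirroring the /cities/ example above.
CITY_ROUTE = re.compile(r"/cities/(?P<city>[^/]+)")


def match_city(path):
    """Return the captured city segment, or None if the whole path does not match."""
    m = CITY_ROUTE.fullmatch(path)
    return m.group("city") if m else None


print(match_city("/cities/new-york"))      # new-york
print(match_city("/cities/sao_paulo"))     # sao_paulo
print(match_city("/cities/new-york/map"))  # None: [^/]+ cannot cross the extra "/"
print(match_city("/users/alice"))          # None: different prefix
```

Because the prefix (/cities/ versus /users/) already tells the handlers apart, the captured segment can stay permissive.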

I don't think it's a good idea to separate URLs based solely on the characters they contain (in your case, letters versus digits), but if you want to do that, use [-A-Za-z_]+ (note that the "-" goes at the start (or end) of the [], or it needs a backslash).
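To check what such a character class accepts, a small sketch like the following can help. The helper is_name and the sample strings are only illustrative, and fullmatch is used here as an assumption so that the whole segment has to consist of allowed characters.

```python
import re

NAME = re.compile(r"[-A-Za-z_]+")            # dash first, so it is a literal "-"
NAME_ESCAPED = re.compile(r"[A-Za-z\-_]+")   # dash escaped, same set of characters


def is_name(segment):
    """True if the segment consists only of ASCII letters, dashes and underscores."""
    return NAME.fullmatch(segment) is not None


for s in ["new_york", "new-york", "sao_paulo", "12345", "route66", ""]:
    # Both spellings of the class should agree on every sample.
    print(repr(s), is_name(s), NAME_ESCAPED.fullmatch(s) is not None)
```

The first three samples are accepted; "12345", "route66" and the empty string are rejected, since digits are not in the class and + requires at least one character.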

Avoid \w because it also matches digits. The exception is if you want to go really crazy and send digit-only paths to one handler and paths that mix letters and digits elsewhere, in which case use:

/(?P<id>\d+)
/(?P<city>[-\w]+)

In that order.
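To see why the order matters, here is a minimal first-match-wins dispatcher. It is only a sketch of the idea, not the routing code of any particular framework: the ROUTES table, the dispatch function and the handler names given as strings are assumptions for illustration (the strings stand in for the ById and ByName classes from the question).

```python
import re

# Patterns are tried top to bottom; handler names are placeholders.
ROUTES = [
    (re.compile(r"/(?P<id>\d+)"), "ById"),
    (re.compile(r"/(?P<city>[-\w]+)"), "ByName"),
]


def dispatch(path, routes=ROUTES):
    """Return (handler_name, captured_groups) for the first route matching the whole path."""
    for pattern, handler in routes:
        m = pattern.fullmatch(path)
        if m:
            return handler, m.groupdict()
    return None


print(dispatch("/42"))        # ('ById', {'id': '42'})
print(dispatch("/new_york"))  # ('ByName', {'city': 'new_york'})
print(dispatch("/route66"))   # ('ByName', {'city': 'route66'})
print(dispatch("/new/york"))  # None

# With the order reversed, [-\w]+ swallows the digit-only path as well.
print(dispatch("/42", list(reversed(ROUTES))))  # ('ByName', {'city': '42'})
```

The last call shows the conflict the question worries about: both patterns match /42, so whichever is listed first decides the handler.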
