PHP regex for URL validation, filter_var is too permissive

Question

First, let's define a "URL" according to my requirements.

The only protocols allowed, optionally, are http:// and https://

then a mandatory domain name, like stackoverflow.com

then, optionally, the rest of the URL components (path, query, hash, ...)

For reference, here is a list of valid and invalid URLs according to my requirements.

Valid:

  • stackoverflow.com
  • stackoverflow.com/questions/ask
  • https://stackoverflow.com/questions/ask
  • http://www.amazon.com/Computers-Internet-Books/b/ref=bhp_bb0309A_comint2?ie=UTF8&node=5&pf_rd_m=ATVPDKIKX0DER&pf_rd_s=browse&pf_rd_r=0AH7GM29WF81Q72VPFDH&pf_rd_t=101&pf_rd_p=1273387142&pf_rd_i=283155
  • amazon.com/Computers-Internet-Books/b/ref=bhp_bb0309A_comint2?ie=UTF8&node=5&pf_rd_m=ATVPDKIKX0DER&pf_rd_s=browse&pf_rd_r=0AH7GM29WF81Q72VPFDH&pf_rd_t=101&pf_rd_p=1273387142&pf_rd_i=283155

  • http://test-site.com (filter_var rejects this! I have domain names with dashes)

Invalid:

  • http://www (PHP's filter_var allows this; yes, I know it is technically a valid URL)
  • google
  • http://www..des (PHP's filter_var allows this)
  • Any URL with disallowed characters in the domain name

For completeness, here is my PHP version: 5.3.2-1ubuntu4.2

Recommended answer

As a starting point you can use this one. It's written for JS, but it's easy to convert it to work with PHP's preg_match.

/^(https?:\/\/)?(www\.)?([a-z0-9]([a-z0-9]|(\-[a-z0-9]))*\.)+[a-z]+$/i

For PHP, this one should work:

$reg = '@^(https?\://)?(www\.)?([a-z0-9]([a-z0-9]|(\-[a-z0-9]))*\.)+[a-z]+$@i';
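As a quick sanity check, the pattern can be run against some of the domain-only examples from the question. This is only a sketch; the `$samples` array is made up for illustration and is not part of the original answer.

```php
<?php
$reg = '@^(https?\://)?(www\.)?([a-z0-9]([a-z0-9]|(\-[a-z0-9]))*\.)+[a-z]+$@i';

// Sample inputs taken from the question (no path or query part).
$samples = array(
    'stackoverflow.com',
    'http://test-site.com',
    'http://www',
    'google',
    'http://www..des',
);

foreach ($samples as $url) {
    // preg_match returns 1 on a match and 0 when the pattern does not match.
    $ok = preg_match($reg, $url) === 1;
    echo $url, ' => ', ($ok ? 'valid' : 'invalid'), "\n";
}
```

The first two samples should be reported as valid and the last three as invalid, which matches the requirements in the question.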

This regexp validates only the domain part anyway, but you can build on it, or split the URL at the first slash '/' (after "://") and validate the domain part and the rest separately.
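The split-at-the-first-slash idea could look roughly like the sketch below. The function name `is_valid_url_host` is hypothetical, and the sketch checks only the domain part; whatever follows the first slash is deliberately left unvalidated.

```php
<?php
// Hypothetical helper, not from the original answer: validates only the
// domain part of a URL and ignores the path, query and hash.
function is_valid_url_host($url)
{
    // Same domain pattern as above, without the protocol group.
    $hostPattern = '@^(www\.)?([a-z0-9]([a-z0-9]|(\-[a-z0-9]))*\.)+[a-z]+$@i';

    // Drop the optional http:// or https:// prefix.
    $rest = preg_replace('@^https?\://@i', '', $url);

    // Split at the first '/': element 0 is the domain, element 1 (if any) the rest.
    $parts = explode('/', $rest, 2);

    return preg_match($hostPattern, $parts[0]) === 1;
}

var_dump(is_valid_url_host('https://stackoverflow.com/questions/ask'));
var_dump(is_valid_url_host('http://www..des'));
```

Any other protocol, such as ftp://, fails here because the prefix is not stripped and "ftp:" is not a valid domain. A URL with a query string directly after the domain and no slash (for example example.com?x=1) would also be rejected; splitting on '?' and '#' as well would be one way to handle that.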

BTW: it would also validate "http://www.domain.com.com", but this is not an error, because a URL with a subdomain could look like "http://www.subdomain.domain.com", and that is valid! And there is almost no way (or at least no easy way in practice) to check for a proper TLD with a regex, because you would have to write all possible TLDs into your regex ONE BY ONE, like this:

/^(https?:\/\/)?(www\.)?([a-z0-9]([a-z0-9]|(\-[a-z0-9]))*\.)+(com|it|net|uk|de)$/i

(this last one, for instance, would validate only domains ending in .com, .it, .net, .de or .uk, which includes .co.uk). New TLDs come out all the time, so you would have to adjust your regex every time a new one appears, which is a pain in the neck!
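If a TLD allow-list is needed anyway, one way to make it slightly less painful is to keep the TLDs in an array and build the alternation from it. This is a sketch under that assumption; the `$tlds` list is illustrative only and would still have to be maintained by hand.

```php
<?php
// Illustrative allow-list (an assumption); it is nowhere near complete.
$tlds = array('com', 'it', 'net', 'uk', 'de');

// preg_quote escapes regex metacharacters and the '@' delimiter in each entry.
$alternation = implode('|', array_map(function ($tld) {
    return preg_quote($tld, '@');
}, $tlds));

$reg = '@^(https?\://)?(www\.)?([a-z0-9]([a-z0-9]|(\-[a-z0-9]))*\.)+('
     . $alternation . ')$@i';

var_dump(preg_match($reg, 'http://www.domain.com') === 1);
var_dump(preg_match($reg, 'http://www.domain.org') === 1);
```

With this list, the .com address matches and the .org address does not; adding a new TLD then means editing the array rather than the pattern itself.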
