PHP URL验证 [英] PHP URL validation

查看:83
本文介绍了PHP URL验证的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我知道有无数个线程在问这个问题,但是我找不到能够帮助我解决这个问题的线程.

I know there are an infinite number of threads asking this question, but I have not been able to find one that can help me with this.

我基本上是在尝试解析大约10,000,000个URL的列表,确保它们按照以下条件有效,然后获取根域URL.此列表几乎包含了您可以想象的所有内容,包括诸如(以及预期的格式化URL)之类的内容:

I am basically trying to parse a list of around 10,000,000 URLs, make sure they are valid per the following criteria and then get the root domain URL. This list contains just about everything you can imagine, including stuff like (and the expected formatted url):

biy.ly/test [VALID] [return - bit.ly]
example.com/apples?test=1&id=4 [VALID] [return - example.com]
host101.wow404.apples.test.com/cert/blah [VALID] [return - test.com]
101.121.44.xxx [**inVALID**] [return false]
localhost/noway [**inVALID**] [return false]
www.awesome.com [VALID] [return - awesome.com]
i am so awesome [**inVALID**] [return false]
http://404.mynewsite.com/visits/page/view/1/ [VALID] [return - mynewsite.com]
www1.151.com/searchresults [VALID] [return - 151.com]

有人对此有任何建议吗?

Does any one have any suggestions for this?

推荐答案

^(?:https?://)?(?:[a-z0-9-]+\.)*((?:[a-z0-9-]+\.)[a-z]+)

说明

^                # start-of-line
(?:              # begin non-capturing group
  https?         #   "http" or "https"
  ://            #   "://"
)?               # end non-capturing group, make optional
(?:              # start non-capturing group
  [a-z0-9-]+\.   #   a name part (numbers, ASCII letters, dashes) & a dot
)*               # end non-capturing group, match as often as possible
(                # begin group 1 (this will be the domain name)
  (?:            #   start non-capturing group
    [a-z0-9-]+\. #     a name part, same as above
  )              #   end non-capturing group
  [a-z]+         #   the TLD
)                # end group 1 

http://rubular.com/r/g6s9bQpNnC

这篇关于PHP URL验证的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆