Get the subdomain from a URL


Question

Getting the subdomain from a URL sounds easy at first.

http://www.domain.example

Scan for the first period, then return whatever came after the "http://" ...

Then you remember

http://super.duper.domain.example

Oh. So then you think, okay, find the last period, go back a word and get everything before!

Then you remember

http://super.duper.domain.co.uk

And you're back to square one. Anyone have any great ideas besides storing a list of all TLDs?
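To make the two failed ideas concrete, here is a minimal Python sketch of both. The function names `first_label` and `all_but_last_two` are made up for illustration; only `urllib.parse.urlsplit` comes from the standard library.

```python
from urllib.parse import urlsplit


def first_label(url):
    """Naive idea 1: take everything between the scheme and the first period."""
    host = urlsplit(url).hostname or ""
    return host.split(".")[0]


def all_but_last_two(url):
    """Naive idea 2: assume the last two labels are 'name.tld' and keep the rest."""
    host = urlsplit(url).hostname or ""
    return ".".join(host.split(".")[:-2])


for url in (
    "http://www.domain.example",
    "http://super.duper.domain.example",
    "http://super.duper.domain.co.uk",
):
    print(url, "|", first_label(url), "|", all_but_last_two(url))
```

The first idea loses "duper" on the second URL, and the second idea wrongly keeps "domain" as part of the subdomain on the third, because it has no way of knowing that "co.uk" acts as a single suffix.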

Recommended answer

Anyone have any great ideas besides storing a list of all TLDs?

No, because each TLD differs on what counts as a subdomain, second level domain, etc.

Keep in mind that there are top level domains, second level domains, and subdomains. Technically speaking, everything except the TLD is a subdomain.

In the domain.com.uk example, "domain" is a subdomain, "com" is a second level domain, and "uk" is the TLD.
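As a small illustration of that reading, the labels can be listed from the TLD leftwards. This is only a sketch; `labels_by_level` is a hypothetical helper name, not part of any library.

```python
def labels_by_level(hostname):
    """Return the labels of a hostname ordered from the TLD leftwards."""
    # A trailing dot (the DNS root) is dropped before splitting.
    return list(reversed(hostname.lower().rstrip(".").split(".")))


levels = labels_by_level("domain.com.uk")
# levels[0] is the TLD, levels[1] the second level domain,
# and anything from levels[2] onwards sits below that.
print(levels)
```

Splitting into levels is the easy part. It says nothing about which level is the one people actually register names under, and that is the part that varies per TLD.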

So the question remains more complex than at first blush, and it depends on how each TLD is managed. You'll need a database of all the TLDs that include their particular partitioning, and what counts as a second level domain and a subdomain. There aren't too many TLDs, though, so the list is reasonably manageable, but collecting all that information isn't trivial. There may already be such a list available.

Looks like http://publicsuffix.org/ is one such list: all the common suffixes (.com, .co.uk, etc.) in a list suitable for searching. It still won't be easy to parse it, but at least you don't have to maintain the list.

A "public suffix" is one under which Internet users can directly register names. Some examples of public suffixes are ".com", ".co.uk" and "pvt.k12.wy.us". The Public Suffix List is a list of all known public suffixes.

The Public Suffix List is an initiative of the Mozilla Foundation. It is available for use in any software, but was originally created to meet the needs of browser manufacturers. It allows browsers to, for example:

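  • Avoid privacy-damaging "supercookies" being set for high-level domain name suffixes
  • Highlight the most important part of a domain name in the user interface
  • Accurately sort history entries by site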
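On the parsing point, a rough sketch of reading the list's text format might look like the following. It assumes the format as I understand it from the list's own documentation: one rule per line, `//` starting a comment, a leading `*.` for wildcard rules and a leading `!` for exceptions. `SAMPLE` is a tiny hand-written excerpt in that style and `parse_rules` is a made-up name; real code would read the downloaded file instead.

```python
# Hand-written excerpt for illustration only, not the real list.
SAMPLE = """\
// Comment lines start with two slashes; blank lines are ignored.
com

uk
co.uk
*.ck
!www.ck
"""


def parse_rules(text):
    """Split list text into (normal, wildcard, exception) rule sets."""
    normal, wildcard, exception = set(), set(), set()
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines and comments; a rule ends at the first whitespace.
        if not parts or parts[0].startswith("//"):
            continue
        rule = parts[0].lower()
        if rule.startswith("!"):
            exception.add(rule[1:])
        elif rule.startswith("*."):
            wildcard.add(rule[2:])
        else:
            normal.add(rule)
    return normal, wildcard, exception


normal, wildcard, exception = parse_rules(SAMPLE)
print(sorted(normal), sorted(wildcard), sorted(exception))
```

This only loads the rules. Matching a hostname against them, with wildcards and exceptions taken into account, is a separate step.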

Looking through the list, you can see it's not a trivial problem. I think a list is the only correct way to accomplish this...
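For what it's worth, a list-driven lookup can be sketched in a few lines of Python. This is a simplification under stated assumptions: `SUFFIXES` is a tiny hand-picked set standing in for the full list, `split_host` is a hypothetical name, and wildcard and exception rules are ignored entirely.

```python
from urllib.parse import urlsplit

# Tiny hand-picked subset for illustration; real code would load the full list.
SUFFIXES = {"com", "example", "uk", "co.uk"}


def split_host(url, suffixes=SUFFIXES):
    """Return (subdomain, registered_domain, public_suffix) for a URL."""
    labels = (urlsplit(url).hostname or "").split(".")
    # Try the longest candidate first so "co.uk" wins over "uk".
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in suffixes:
            break
    else:
        # Unknown suffix: fall back to treating the last label as the suffix.
        i = len(labels) - 1
        candidate = labels[-1]
    if i == 0:
        # The host is itself a suffix, so there is nothing registered under it.
        return "", "", candidate
    return ".".join(labels[: i - 1]), ".".join(labels[i - 1:]), candidate


for url in (
    "http://www.domain.example",
    "http://super.duper.domain.example",
    "http://super.duper.domain.co.uk",
):
    print(url, split_host(url))
```

With the suffix known, the label just before it is the registered name and everything to the left of that is the subdomain, which is what the three URLs from the question need.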
