Extract domain name from a host name
Is there a programmatic way to find the domain name from a given hostname?
given -> www.yahoo.co.jp
return -> yahoo.co.jp
The approach that works, but is very slow, is:
1. Split the hostname on "." and remove one group from the left.
2. Join the remaining groups and query for an SOA record using dnspython.
3. When a valid SOA record is returned, consider that name the domain.
Is there a cleaner/faster way to do this without using regexps?
There's no trivial definition of which "domain name" is the parent of any particular "host name".
Your current method of traversing up the tree until you see an SOA record is actually the most correct.
Technically, what you're doing there is finding a "zone cut", and in the vast majority of cases that will correspond to the point at which the domain was delegated from its TLD.
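For illustration, the walk up the tree might be sketched as below. This is only a minimal sketch, not the asker's actual code: the names `candidate_names`, `find_zone`, and `dns_has_soa` are hypothetical, and the SOA check is passed in as a function so the walk itself can be exercised without network access. The `dns_has_soa` helper assumes dnspython 2.x is installed (older releases spell the call `dns.resolver.query`), and it only handles the two most common "no record" errors.

```python
def candidate_names(hostname):
    """Return the hostname and each of its parents, longest name first."""
    labels = hostname.rstrip(".").split(".")
    return [".".join(labels[i:]) for i in range(len(labels))]


def find_zone(hostname, has_soa):
    """Walk up the tree and return the first name for which has_soa() is true.

    has_soa is any callable taking a name and returning a bool, so a fake
    can be substituted for the real DNS lookup when testing.
    """
    for name in candidate_names(hostname):
        if has_soa(name):
            return name
    return None


def dns_has_soa(name):
    """SOA check via dnspython (assumption: dnspython 2.x is installed)."""
    import dns.resolver  # imported lazily so the rest runs without dnspython

    try:
        dns.resolver.resolve(name, "SOA")
        return True
    except (dns.resolver.NoAnswer, dns.resolver.NXDOMAIN):
        # No SOA at this name, or the name does not exist: keep walking up.
        return False
```

With network access, `find_zone("www.yahoo.co.jp", dns_has_soa)` performs one query per candidate name until an SOA record is found. dnspython also provides `dns.resolver.zone_for_name()`, which performs a similar search for the enclosing zone and may be worth trying before hand-rolling the loop.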
Any method that relies on mere text parsing of the host name without reference to the DNS is doomed to failure.
Alternatively, make use of the centrally maintained lists of delegation-centric domains from http://publicsuffix.org/, but beware that these lists can be incomplete and/or out of date.
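As a rough sketch of how the list-based alternative works, the code below applies the Public Suffix List matching rules (plain rules, `*.` wildcard rules, and `!` exception rules) to a tiny rule set. `SAMPLE_RULES` is a hand-picked sample for illustration only, and the function names are hypothetical; a real implementation would load the full, current list from publicsuffix.org, and its answers are only as complete as that list.

```python
# A tiny hand-picked sample; the real list has thousands of entries.
SAMPLE_RULES = {"com", "jp", "co.jp", "uk", "co.uk", "*.ck", "!www.ck"}


def public_suffix_length(labels, rules):
    """Return how many trailing labels form the public suffix."""
    best = 1  # default rule "*": an unlisted TLD is itself a public suffix
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        n = len(labels) - i
        if "!" + candidate in rules:
            # Exception rules win outright; the suffix is the rule
            # with its leftmost label removed.
            return n - 1
        if candidate in rules:
            best = max(best, n)
        if i + 1 < len(labels):
            # Wildcard rule: "*" stands in for the leftmost label.
            if "*." + ".".join(labels[i + 1:]) in rules:
                best = max(best, n)
    return best


def registrable_domain(hostname, rules=SAMPLE_RULES):
    """Return the public suffix plus one label, or None if there is none."""
    labels = hostname.lower().rstrip(".").split(".")
    k = public_suffix_length(labels, rules)
    if len(labels) <= k:
        return None  # the hostname is itself a public suffix
    return ".".join(labels[-(k + 1):])


print(registrable_domain("www.yahoo.co.jp"))  # yahoo.co.jp
```

This needs no DNS queries, which makes it fast, but it answers a slightly different question: it finds the registrable domain according to the list rather than the actual zone cut in the DNS.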
See also this question where all of this has been gone over before...