What is the fastest way to get the domain/host name from a URL?
Question
I need to go through a large list of URL strings and extract the domain name from each of them.
For example:
http://www.stackoverflow.com/questions would yield www.stackoverflow.com
I was originally using new URL(theUrlString).getHost(), but initializing the URL object adds a lot of time to the process and seems unnecessary.
Is there a faster method to extract the host name that would be just as reliable?
Thanks
Edit: My mistake, yes, the www. would be included in the domain name in the example above. Also, these URLs may be http or https.
Recommended answer
If you want to handle https etc., I suggest you do something like this:
int slashslash = url.indexOf("//") + 2;
// Assumes a '/' follows the host; if not, indexOf returns -1 and substring throws.
String domain = url.substring(slashslash, url.indexOf('/', slashslash));
Note that this includes the www part (just as URL.getHost() would), which is actually part of the domain name.
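As a quick sanity check of that claim, the sketch below compares the substring approach with a standard-library parser on two sample URLs. It uses java.net.URI rather than java.net.URL as the reference, which is a substitution made for this sketch; the class name SubstringVsUri and both helper names are hypothetical and not part of the original answer.

```java
import java.net.URI;

final class SubstringVsUri {
    private SubstringVsUri() {}

    // The two-line substring approach from the answer, wrapped in a method.
    // Like the original, it assumes the URL contains "//" and a '/' after the host.
    static String hostBySubstring(String url) {
        int slashslash = url.indexOf("//") + 2;
        return url.substring(slashslash, url.indexOf('/', slashslash));
    }

    // Reference result from the standard library (URI instead of URL: an assumption of this sketch).
    static String hostByUri(String url) {
        return URI.create(url).getHost();
    }

    public static void main(String[] args) {
        String[] samples = {
            "http://www.stackoverflow.com/questions",
            "https://www.stackoverflow.com/questions/1"
        };
        for (String sample : samples) {
            // Both columns show www.stackoverflow.com for these inputs.
            System.out.println(hostBySubstring(sample) + " | " + hostByUri(sample));
        }
    }
}
```

For these simple inputs both methods agree, including the www prefix. The comparison says nothing about speed; that would need to be measured on your own data.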
Edit, as requested in the comments
Here are two methods that might be helpful:
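/**
 * Will take a url such as http://www.stackoverflow.com and return www.stackoverflow.com
 *
 * @param url the URL string, with or without a scheme
 * @return the host without any port, or an empty string if url is null or empty
 */
public static String getHost(String url) {
    if (url == null || url.length() == 0)
        return "";

    // Start right after "//" if a scheme separator is present, otherwise at the beginning.
    int doubleslash = url.indexOf("//");
    if (doubleslash == -1)
        doubleslash = 0;
    else
        doubleslash += 2;

    // The host ends at the first '/' after the start, or at the end of the string.
    int end = url.indexOf('/', doubleslash);
    end = end >= 0 ? end : url.length();

    // Cut off a ":port" suffix if one appears before the end of the host.
    int port = url.indexOf(':', doubleslash);
    end = (port > 0 && port < end) ? port : end;

    return url.substring(doubleslash, end);
}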
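getHost above stops only at '/' and ':', so inputs such as http://example.com?x=1 (a query with no path) or URLs containing user info would return more than the host. The following is a minimal sketch of a stricter variant, assuming the inputs may contain those forms. The names HostExtractor and getHostStrict are hypothetical, and the extra scanning costs some of the speed the original aims for.

```java
final class HostExtractor {
    private HostExtractor() {}

    /** Hypothetical stricter variant of getHost; not part of the original answer. */
    static String getHostStrict(String url) {
        if (url == null || url.isEmpty()) {
            return "";
        }
        int schemeSep = url.indexOf("//");
        int start = (schemeSep == -1) ? 0 : schemeSep + 2;

        // The authority part ends at the first '/', '?' or '#'.
        int end = url.length();
        for (int i = start; i < end; i++) {
            char c = url.charAt(i);
            if (c == '/' || c == '?' || c == '#') {
                end = i;
                break;
            }
        }

        // Skip "user:password@" if present inside the authority.
        int at = url.lastIndexOf('@', end - 1);
        if (at >= start) {
            start = at + 1;
        }

        // Bracketed IPv6 literal: keep everything up to and including ']'.
        if (start < end && url.charAt(start) == '[') {
            int close = url.indexOf(']', start);
            if (close != -1 && close < end) {
                return url.substring(start, close + 1);
            }
        }

        // Drop ":port" if present.
        int colon = url.indexOf(':', start);
        if (colon != -1 && colon < end) {
            end = colon;
        }
        return url.substring(start, end);
    }

    public static void main(String[] args) {
        System.out.println(getHostStrict("http://www.stackoverflow.com/questions")); // www.stackoverflow.com
        System.out.println(getHostStrict("http://example.com?x=1"));                 // example.com
        System.out.println(getHostStrict("https://user:pw@example.com:8443/a"));     // example.com
        System.out.println(getHostStrict("http://[::1]:8080/"));                     // [::1]
    }
}
```

If the input list is known to contain only plain http/https URLs with a path, the simpler getHost is likely sufficient.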
/**
 * Based on: http://grepcode.com/file/repository.grepcode.com/java/ext/com.google.android/android/2.3.3_r1/android/webkit/CookieManager.java#CookieManager.getBaseDomain%28java.lang.String%29
 * Gets the base domain for a given host or URL, e.g. mail.google.com will return google.com.
 * Only the last two labels are kept, so www.bbc.co.uk yields co.uk.
 *
 * @param url the URL or host name to reduce
 * @return the last two dot-separated labels of the host, or the host itself if it has fewer
 */
public static String getBaseDomain(String url) {
    String host = getHost(url);

    int startIndex = 0;
    int nextIndex = host.indexOf('.');
    int lastIndex = host.lastIndexOf('.');

    // Advance past each leading label until only the last dot remains ahead.
    while (nextIndex < lastIndex) {
        startIndex = nextIndex + 1;
        nextIndex = host.indexOf('.', startIndex);
    }

    if (startIndex > 0) {
        return host.substring(startIndex);
    } else {
        return host;
    }
}
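To see how the two methods behave together, here is a small self-contained sketch that copies them into a hypothetical class named BaseDomainDemo and runs them on a few sample inputs. The sample URLs are made up for illustration. The co.uk case shows a limit of the last-two-labels rule; handling such suffixes correctly would need something like a public suffix list, which is outside what the answer covers.

```java
final class BaseDomainDemo {
    private BaseDomainDemo() {}

    // Same logic as getHost in the answer.
    static String getHost(String url) {
        if (url == null || url.length() == 0) {
            return "";
        }
        int doubleslash = url.indexOf("//");
        doubleslash = (doubleslash == -1) ? 0 : doubleslash + 2;
        int end = url.indexOf('/', doubleslash);
        end = end >= 0 ? end : url.length();
        int port = url.indexOf(':', doubleslash);
        end = (port > 0 && port < end) ? port : end;
        return url.substring(doubleslash, end);
    }

    // Same logic as getBaseDomain in the answer: keep the last two labels.
    static String getBaseDomain(String url) {
        String host = getHost(url);
        int startIndex = 0;
        int nextIndex = host.indexOf('.');
        int lastIndex = host.lastIndexOf('.');
        while (nextIndex < lastIndex) {
            startIndex = nextIndex + 1;
            nextIndex = host.indexOf('.', startIndex);
        }
        return startIndex > 0 ? host.substring(startIndex) : host;
    }

    public static void main(String[] args) {
        System.out.println(getBaseDomain("http://mail.google.com/mail")); // google.com
        System.out.println(getBaseDomain("google.com"));                  // google.com
        System.out.println(getBaseDomain("http://www.bbc.co.uk/news"));   // co.uk (limitation)
    }
}
```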