What is the fastest way to get the domain/host name from a URL?

Question

I need to go through a large list of URL strings and extract the domain name from each.

For example:

http://www.stackoverflow.com/questions should yield www.stackoverflow.com

I was originally using new URL(theUrlString).getHost(), but initializing the URL object adds a lot of time to the process and seems unnecessary.

Is there a faster way to extract the host name that is just as reliable?

Thanks

Edit: My mistake. Yes, the www. would be included in the domain name in the example above. Also, these URLs may be http or https.

Recommended answer

If you want to handle https etc., I suggest you do something like this:

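// Assumes url contains "//" (e.g. http:// or https://)
int slashslash = url.indexOf("//") + 2;
int end = url.indexOf('/', slashslash);
String domain = url.substring(slashslash, end >= 0 ? end : url.length()); // no path: take the rest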

Note that this includes the www part (just as URL.getHost() would), which is actually part of the domain name.
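Since the question asks for something as reliable as URL.getHost(), one way to gain confidence is to compare the two on sample input. The following is a minimal sketch, not part of the original answer: the class name HostCheck, the method names fastHost and urlHost, and the sample URLs are made up for illustration, and fastHost adds a guard for URLs with no path after the host.

```java
import java.net.MalformedURLException;
import java.net.URL;

final class HostCheck {
    private HostCheck() {}

    // Substring approach: everything between "//" and the next '/'.
    // Assumes the URL contains "//"; a URL without a path runs to the end of the string.
    static String fastHost(String url) {
        int slashslash = url.indexOf("//") + 2;
        int end = url.indexOf('/', slashslash);
        return url.substring(slashslash, end >= 0 ? end : url.length());
    }

    // Reference result from java.net.URL (the constructor is deprecated in recent JDKs).
    @SuppressWarnings("deprecation")
    static String urlHost(String url) {
        try {
            return new URL(url).getHost();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Not a valid URL: " + url, e);
        }
    }

    public static void main(String[] args) {
        String[] samples = {
            "http://www.stackoverflow.com/questions",
            "https://www.stackoverflow.com/questions",
            "https://stackoverflow.com",
            "http://example.com:8080/index.html" // port: the two approaches differ here
        };
        for (String s : samples) {
            String fast = fastHost(s);
            System.out.println(s + " -> " + fast
                    + " (same as URL.getHost(): " + fast.equals(urlHost(s)) + ")");
        }
    }
}
```

The two agree on plain http and https URLs, but the substring approach keeps a port such as :8080, which URL.getHost() drops.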

Edit, as requested in the comments

Here are two methods that might be helpful:

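/**
 * Takes a URL such as http://www.stackoverflow.com and returns www.stackoverflow.com.
 * A port, if present (e.g. ":8080"), is dropped.
 *
 * @param url the URL string; may be null or empty
 * @return the host name, or an empty string if url is null or empty
 */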
public static String getHost(String url){
    if(url == null || url.length() == 0)
        return "";

    int doubleslash = url.indexOf("//");
    if(doubleslash == -1)
        doubleslash = 0;
    else
        doubleslash += 2;

    int end = url.indexOf('/', doubleslash);
    end = end >= 0 ? end : url.length();

    int port = url.indexOf(':', doubleslash);
    end = (port > 0 && port < end) ? port : end;

    return url.substring(doubleslash, end);
}
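As written, getHost only looks for '/' and ':' after the scheme, so http://example.com?q=1 comes back as example.com?q=1 and http://user:pw@example.com/x comes back as user. If such URLs can appear in the input, a slightly stricter variant could look like the sketch below. This is an assumption-laden illustration, not the original author's code: the class name HostExtractor is hypothetical, and IPv6 literals such as [::1] are still not handled.

```java
final class HostExtractor {
    private HostExtractor() {}

    /** Returns the host part of url, or "" for null or empty input. */
    static String getHost(String url) {
        if (url == null || url.isEmpty()) {
            return "";
        }
        // Skip past "//" if the URL has a scheme separator.
        int start = url.indexOf("//");
        start = (start == -1) ? 0 : start + 2;

        // The authority ends at the first '/', '?' or '#'.
        int end = url.length();
        for (int i = start; i < url.length(); i++) {
            char c = url.charAt(i);
            if (c == '/' || c == '?' || c == '#') {
                end = i;
                break;
            }
        }

        // Drop "user:password@" if present.
        int at = url.lastIndexOf('@', end - 1);
        if (at >= start) {
            start = at + 1;
        }

        // Drop ":port" if present.
        int colon = url.indexOf(':', start);
        if (colon >= 0 && colon < end) {
            end = colon;
        }
        return url.substring(start, end);
    }
}
```

It still makes a single pass over the string and allocates only the returned substring, so it keeps the spirit of the original method.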


/**
 * Based on: http://grepcode.com/file/repository.grepcode.com/java/ext/com.google.android/android/2.3.3_r1/android/webkit/CookieManager.java#CookieManager.getBaseDomain%28java.lang.String%29
 * Gets the base domain for a given host or URL, e.g. mail.google.com returns google.com.
 *
 * @param url the host or URL
 * @return the last two dot-separated labels of the host, or the host itself if it has fewer
 */
public static String getBaseDomain(String url) {
    String host = getHost(url);

    int startIndex = 0;
    int nextIndex = host.indexOf('.');
    int lastIndex = host.lastIndexOf('.');
    while (nextIndex < lastIndex) {
        startIndex = nextIndex + 1;
        nextIndex = host.indexOf('.', startIndex);
    }
    if (startIndex > 0) {
        return host.substring(startIndex);
    } else {
        return host;
    }
}
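The loop in getBaseDomain ends up keeping the last two dot-separated labels of the host. The sketch below expresses the same rule more compactly so the behaviour is easy to try out; the class name BaseDomainDemo and the method name lastTwoLabels are hypothetical, and it takes a bare host rather than a full URL.

```java
final class BaseDomainDemo {
    private BaseDomainDemo() {}

    // Keeps the last two dot-separated labels of host; returns host unchanged
    // if it has fewer than two dots.
    static String lastTwoLabels(String host) {
        int last = host.lastIndexOf('.');
        int prev = host.lastIndexOf('.', last - 1); // -1 when there is no earlier dot
        return prev >= 0 ? host.substring(prev + 1) : host;
    }

    public static void main(String[] args) {
        String[] hosts = {"mail.google.com", "www.stackoverflow.com", "google.com", "localhost", "www.bbc.co.uk"};
        for (String h : hosts) {
            System.out.println(h + " -> " + lastTwoLabels(h));
        }
    }
}
```

One consequence of the two-label rule is that a host under a multi-part suffix, such as www.bbc.co.uk, is reduced to co.uk. Handling those cases correctly would need a list of public suffixes, which neither method attempts.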
