Using Matcher to extract URL domain name
Question
static String AdrPattern="http://www.([^&]+)\\.com\\.*";
static Pattern WebUrlPattern = Pattern.compile (AdrPattern);
static Matcher WebUrlMatcher;
WebUrlMatcher = WebUrlPattern.matcher ("keyword");
if(WebUrlMatcher.matches()) {
    String extractedPath = WebUrlMatcher.group (1);
}
Given the code above, my aim is to extract the domain name from the URL and discard the rest. There are two problems: first, if the URL has a deeper path, the path is not ignored; second, the pattern does not work for every URL with a .com extension.
For example, if the URL is http://www.lego.com/en-us/technic/?domainredir=technic.lego, the result is not lego but lego.com/en-us/technic/?domainredir=technic.lego.
Recommended answer
Use
static String AdrPattern="http://www\\.([^&]+)\\.com.*";
                                    ^^              ^
You escaped the final dot, so it was treated as a literal dot, and matches(), which has to match the entire string, could not match anything that follows .com. The first dot (after www) must be escaped as well, so that it matches only a literal dot.
Also, to make the regex a bit stricter, you can replace [^&]+ with [^/&]+ so that the captured domain cannot run past the first slash.
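The two fixes above can be tried out with a small self-contained sketch. The class name DomainExtractor and the helper extractDomain are hypothetical names chosen for illustration; only the pattern itself comes from the answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DomainExtractor {
    // Corrected pattern in its stricter form: the dot after "www" is escaped,
    // the trailing ".*" is unescaped, and the captured label excludes '/' and '&'.
    private static final Pattern WEB_URL_PATTERN =
            Pattern.compile("http://www\\.([^/&]+)\\.com.*");

    // Hypothetical helper: returns the label between "www." and ".com",
    // or null when the whole URL does not match the pattern.
    static String extractDomain(String url) {
        Matcher m = WEB_URL_PATTERN.matcher(url);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String url = "http://www.lego.com/en-us/technic/?domainredir=technic.lego";
        System.out.println(extractDomain(url)); // prints "lego"

        // The pattern from the question ends in "\\.*" (zero or more literal dots),
        // so matches() cannot consume the path after ".com".
        System.out.println(Pattern.matches("http://www.([^&]+)\\.com\\.*", url)); // prints "false"
    }
}
```

Because matches() must consume the whole input, the trailing .* is what lets the pattern accept any path after .com while still capturing only the domain label.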
Update:
static String AdrPattern="http://www\\.([^/&]+)\\.com/([^/]+)/([^/]+)/([^/]+).*";
static Pattern WebUrlPattern = Pattern.compile (AdrPattern);
static Matcher WebUrlMatcher = WebUrlPattern.matcher("http://www.lego.com/en-us/technic/?domainredir=technic.lego");
if(WebUrlMatcher.matches()) {
    String extractedPath = WebUrlMatcher.group(1);
    String extractedPart1 = WebUrlMatcher.group(2);
    String extractedPart2 = WebUrlMatcher.group(3);
    String extractedPart3 = WebUrlMatcher.group(4);
}
Or, use \G with find() to pick up the path segments one at a time:
static String AdrPattern="(?:http://www\\.([^/&]+)\\.com/|(?!^)\\G)/?([^/]+)";
static Pattern WebUrlPattern = Pattern.compile (AdrPattern);
static Matcher WebUrlMatcher = WebUrlPattern.matcher("http://www.lego.com/en-us/technic/?domainredir=technic.lego");
int cnt = 0;
while(WebUrlMatcher.find()) {
    if (cnt == 0) {
        String extractedPath = WebUrlMatcher.group(1);
        String extractedPart = WebUrlMatcher.group(2);
        cnt = cnt + 1;
    }
    else {
        String extractedPart = WebUrlMatcher.group(2);
    }
}
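As a runnable sketch of the \G approach, the loop can collect its results into a list instead of local variables. The class name PathSegments and the method split are hypothetical; the sketch also assumes that checking group(1) for null is an acceptable replacement for the cnt counter, since group 1 only takes part in the first match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PathSegments {
    // First alternative anchors on the host and captures the domain label;
    // second alternative (\G, not at the start of input) resumes where the
    // previous match ended, so each find() yields the next path segment.
    private static final Pattern SEGMENT_PATTERN =
            Pattern.compile("(?:http://www\\.([^/&]+)\\.com/|(?!^)\\G)/?([^/]+)");

    // Hypothetical helper: returns the domain label followed by the path segments in order.
    static List<String> split(String url) {
        List<String> parts = new ArrayList<>();
        Matcher m = SEGMENT_PATTERN.matcher(url);
        while (m.find()) {
            // group(1) is non-null only on the match that consumed the host.
            if (m.group(1) != null) {
                parts.add(m.group(1));
            }
            parts.add(m.group(2));
        }
        return parts;
    }

    public static void main(String[] args) {
        String url = "http://www.lego.com/en-us/technic/?domainredir=technic.lego";
        System.out.println(split(url));
        // prints "[lego, en-us, technic, ?domainredir=technic.lego]"
    }
}
```

Unlike the fixed-depth pattern with three path groups, this form works for any number of segments, because each call to find() continues from the end of the previous match.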