Extracting and replacing url within href tag
Question
Hi
Suppose I have the following URLs coming from an HTML document:
<a href="http://mydomain1.com">Domain1</a>
<a href="http://subdomain.domain.com/myfile.anyext">http://subdomain.domain.com/myfile.anyext</a>
<a href="http://subdomain.domain2.com/myfile.anyext">Domain2</a>
Now, what I want is to search for the URL pattern within the href only, and check
whether it contains a particular domain, for instance "domain2.com". If it does,
the URL should be replaced with the following one:
"http://redirectUrl.com/http://subdomain.domain2.com/myfile.anyext"
Can anyone shed light on this?
Thank you
-adnan
Recommended answer
Adnan Siddiqi wrote:
Hi
Suppose I have the following URLs coming from an HTML document:
<a href="http://mydomain1.com">Domain1</a>
<a href="http://subdomain.domain.com/myfile.anyext">http://subdomain.domain.com/myfile.anyext</a>
<a href="http://subdomain.domain2.com/myfile.anyext">Domain2</a>
Now, what I want is to search for the URL pattern within the href only, and check
whether it contains a particular domain, for instance "domain2.com". If it does,
the URL should be replaced with the following one:
"http://redirectUrl.com/http://subdomain.domain2.com/myfile.anyext"
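<script type="text/javascript">
// Prefix every link that points at domain2.com with the redirect URL.
function patchLinks() {
  var len = document.links.length;
  var lnk = null;
  for (var i = 0; i < len; i++) {
    lnk = document.links[i];
    // indexOf() returns -1 when the substring does not occur in the URL
    if (lnk.href.indexOf('domain2.com') != -1) {
      lnk.href = 'http://redirectUrl.com/' + lnk.href;
    }
  }
}
// Run once the document (and therefore document.links) is fully loaded.
window.onload = patchLinks;
</script>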
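Note that a plain substring test also matches hosts such as "notdomain2.com", or a URL that merely mentions domain2.com in its path. A minimal sketch of a stricter check follows; the helper name `hostMatchesDomain` is hypothetical and not part of the thread, and it assumes the host name is already available separately from the rest of the URL.

```javascript
// Hypothetical helper (not from the thread): returns true when `hostname`
// is `domain` itself or one of its subdomains.
function hostMatchesDomain(hostname, domain) {
  var host = String(hostname).toLowerCase();
  var dom = String(domain).toLowerCase();
  if (host === dom) {
    return true;
  }
  // Require a dot in front of the domain so "notdomain2.com" does not match.
  var suffix = '.' + dom;
  return host.length > suffix.length &&
    host.lastIndexOf(suffix) === host.length - suffix.length;
}

console.log(hostMatchesDomain('subdomain.domain2.com', 'domain2.com')); // true
console.log(hostMatchesDomain('notdomain2.com', 'domain2.com'));        // false
```

In a browser, link objects expose a `hostname` property, so the condition in the loop could become `hostMatchesDomain(lnk.hostname, 'domain2.com')`.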
VK wrote:
Adnan Siddiqi wrote:
Now, what I want is to search for the URL pattern within the href only, and check
whether it contains a particular domain, for instance "domain2.com". If it does,
the URL should be replaced with the following one:
"http://redirectUrl.com/http://subdomain.domain2.com/myfile.anyext"
<script type="text/javascript">
function patchLinks() {
  var len = document.links.length;
  var lnk = null;
  for (var i = 0; i < len; i++) {
    lnk = document.links[i];
    if (lnk.href.indexOf('domain2.com') != -1) {
      lnk.href = 'http://redirectUrl.com/' + lnk.href;
    }
  }
}
window.onload = patchLinks;
</script>
Please /think/ before you code.
PointedEars
--
Homer: I have changed the world. Now I know how it feels to be God!
Marge: Do you want turkey sausage or ham?
Homer: Thou shalt send me *two*, one of each kind.
(Santa's Little Helper [dog] and Snowball [cat] run away :))
Adnan Siddiqi wrote:
Suppose I have the following URLs coming from an HTML document:
<a href="http://mydomain1.com">Domain1</a>
<a href="http://subdomain.domain.com/myfile.anyext">http://subdomain.domain.com/myfile.anyext</a>
<a href="http://subdomain.domain2.com/myfile.anyext">Domain2</a>
Now, what I want is to search for the URL pattern within the href only, and check
whether it contains a particular domain, for instance "domain2.com". If it does,
the URL should be replaced with the following one:
"http://redirectUrl.com/http://subdomain.domain2.com/myfile.anyext"
This is not a valid URL/URI. See RFC3986 and below.
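One way to avoid appending the target URL unescaped is to percent-encode it first. The sketch below is only an illustration under stated assumptions: the function name `buildRedirectUrl` and the query parameter name `url` are hypothetical, and whether the redirect service accepts the target in that form is not known from the thread.

```javascript
// Hypothetical helper: appends the percent-encoded target URL to the
// redirect base as a query parameter.
function buildRedirectUrl(redirectBase, targetUrl) {
  // Assumption: the redirect service reads the target from a "url" parameter.
  var separator = redirectBase.indexOf('?') === -1 ? '?' : '&';
  // encodeURIComponent() escapes ":" and "/" so the target cannot be
  // confused with the structure of the outer URL.
  return redirectBase + separator + 'url=' + encodeURIComponent(targetUrl);
}

console.log(buildRedirectUrl(
  'http://redirectUrl.com/',
  'http://subdomain.domain2.com/myfile.anyext'
));
// http://redirectUrl.com/?url=http%3A%2F%2Fsubdomain.domain2.com%2Fmyfile.anyext
```

The receiving side can recover the original address with `decodeURIComponent()`.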
can anyone shed light upon this?
First of all, you want to do this server-side, not client-side.
However, the language used there may be an ECMAScript implementation
as well. The only difference from the solution presented here is that
you will need to determine what a link is differently, and that you
have to parse the source code instead (unless you can make use of an
existing markup parser implementation).
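As a rough illustration of the server-side variant, the following sketch rewrites `href` values in an HTML string instead of walking `document.links`. It is an assumption-laden example, not code from the thread: the name `rewriteHrefs` and its two callback parameters are hypothetical, it only handles double-quoted `href` attributes inside `<a>` tags, and a real markup parser would be more robust.

```javascript
// Hypothetical server-side helper: rewrites double-quoted href values
// of <a> tags in an HTML source string.
function rewriteHrefs(html, shouldRewrite, toRedirect) {
  // Groups: (1) tag text up to the opening quote, (2) the URL, (3) closing quote.
  var hrefPattern = /(<a\b[^>]*?\bhref\s*=\s*")([^"]*)(")/gi;
  return html.replace(hrefPattern, function (match, before, url, after) {
    return shouldRewrite(url) ? before + toRedirect(url) + after : match;
  });
}

var input =
  '<a href="http://mydomain1.com">Domain1</a>\n' +
  '<a\nhref="http://subdomain.domain2.com/myfile.anyext">Domain2</a>';

var output = rewriteHrefs(
  input,
  // Match domain2.com and its subdomains only.
  function (url) {
    return /^https?:\/\/([^\/.]+\.)*domain2\.com(\/|$)/i.test(url);
  },
  // Assumption: the redirect base takes the escaped target as a suffix.
  function (url) {
    return 'http://redirectUrl.com/' + encodeURIComponent(url);
  }
);

console.log(output);
```

Only the attribute value is touched, so link text that happens to contain a URL is left as it is.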
Second, use Regular Expressions.
....
<html>
  <head>
    ...
    <meta http-equiv="Content-Script-Type" content="text/javascript">
    <script type="text/javascript">
      var _global = this;

      /**
       * Patches links referring to specific domains so that their target
       * URL is appended to another URL.
       *
       * @param sDomains: string
       *   Links with domains to be redirected, delimited by <tt>|</tt>.
       * @param sRedirectBase: string
       *   Base URI (prefix) for the redirection.
       */
      function patchLinks(sDomains, sRedirectBase)
      {
        /**
         * Tries hard to escape a string according to the query component
         * specification in RFC3986.
         *
         * @partof
         *   http://pointedears.de/scripts/string.js
         * @param s: string
         * @return type string
         *   <code>s</code> escaped, or unescaped if escaping through
         *   <code>encodeURIComponent()</code> or <code>escape()</code>
         *   is not possible.
         */
        function esc(s)
        {
          /**
           * @author
           *   (C) 2003-2006 Thomas Lahn <ty******@PointedEars.de>
           *   Distributed under the GNU GPL v2.
           * @partof
           *   http://pointedears.de/scripts/types.js
           * @argument s
           *   String to be determined a method type, i.e. "object" for
           *   IE DOM methods, "function" otherwise. The type must have
           *   been retrieved with the `typeof' operator.
           *
           *   Note that in contrast to @link{#isMethod()}, this
           *   method may also return <code>true</code> if the value of
           *   the <code>typeof</code> operand is <code>null</code>; to be
           *   sure that the operand is a method reference, you have to
           *   && (AND)-combine the <code>isMethodType(...)</code>
           *   expression with the method reference identifier.
           *
           *   Use this method instead of <code>isMethod()</code> if
           *   you want to avoid warnings in case the property to be
           *   tested is not defined, or errors in case the property
           *   cannot be read.
           * @return
           *   <code>true</code> if <code>s</code> is a method type,
           *   <code>false</code> otherwise.
           * @type boolean
           * @see #isMethod()
           */
          function isMethodType(s)
          {
            return /\s*(function|object)\s*/.test(s);
          }

          return (isMethodType(typeof encodeURIComponent)
                  && encodeURIComponent
                  ? encodeURIComponent(s)
                  : (isMethodType(typeof escape) && escape
                     ? escape(s)
                     : s));
        }

        for (var links = document.links, i = links && links.length; i--;)
        {
          var
            link = links[i],
            rx = new RegExp(
              "^(ht|f)tps?:\\/\\/([^.]+\\.)*("
              + sDomains.replace(/\./g, "\\.")
              + ")(\\/|
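The listing breaks off at this point in the source, so the rest of the original pattern is not known. As a separate, minimal sketch of the same general idea (building one regular expression from a list of domains), the following may help; the helper name `buildDomainPattern`, the array argument, and the exact pattern are assumptions of this example and not the original author's code.

```javascript
// Hypothetical helper, independent of the truncated listing above:
// builds a case-insensitive pattern that matches http(s)/ftp(s) URLs
// whose host is one of `domains` or a subdomain of one of them.
function buildDomainPattern(domains) {
  var escaped = [];
  for (var i = 0; i < domains.length; i++) {
    // Escape regular-expression metacharacters, not only the dot.
    escaped.push(domains[i].replace(/[.*+?^${}()|[\]\\]/g, '\\$&'));
  }
  return new RegExp(
    '^(?:https?|ftps?):\\/\\/'            // scheme
    + '(?:[^\\/.]+\\.)*'                  // optional subdomain labels
    + '(?:' + escaped.join('|') + ')'     // one of the listed domains
    + '(?:[\\/:?#]|$)',                   // host must end here
    'i'
  );
}

var rx = buildDomainPattern(['domain2.com']);
console.log(rx.test('http://subdomain.domain2.com/myfile.anyext')); // true
console.log(rx.test('http://notdomain2.com/'));                     // false
console.log(rx.test('http://domain2.com.evil.example/'));           // false
```

Anchoring the end of the host keeps look-alike hosts from matching, which an unanchored substring test cannot guarantee.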