Extracting and replacing URL within href tag


Question




Hi,

Suppose I have the following URLs coming from an HTML document:

<a href="http://mydomain1.com">Domain1</a>
<a
href="http://subdomain.domain.com/myfile.anyext">http://subdomain.domain.com/myfile.anyext</a>
<a href="http://subdomain.domain2.com/myfile.anyext">Domain2</a>

Now, I want to search for the URL pattern within the href only and check
whether it contains a particular domain, for instance "domain2.com". If it
does, the URL should be replaced with the following one:

"http://redirectUrl.com/http://subdomain.domain2.com/myfile.anyext"

Can anyone shed light on this?

Thank you

-adnan

Recommended answer



Adnan Siddiqi wrote:

Hi,

Suppose I have the following URLs coming from an HTML document:

<a href="http://mydomain1.com">Domain1</a>
<a
href="http://subdomain.domain.com/myfile.anyext">http://subdomain.domain.com/myfile.anyext</a>
<a href="http://subdomain.domain2.com/myfile.anyext">Domain2</a>

Now, I want to search for the URL pattern within the href only and check
whether it contains a particular domain, for instance "domain2.com". If it
does, the URL should be replaced with the following one:

"http://redirectUrl.com/http://subdomain.domain2.com/myfile.anyext"





<script type="text/javascript">
// Prefix every link that points at domain2.com with the redirect URL.
function patchLinks() {
  var len = document.links.length;
  var lnk = null;
  for (var i = 0; i < len; i++) {
    lnk = document.links[i];
    // Note: this is a plain substring test on the whole URL.
    if (lnk.href.indexOf('domain2.com') != -1) {
      lnk.href = 'http://redirectUrl.com/' + lnk.href;
    }
  }
}

window.onload = patchLinks;
</script>
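
The substring test in the script above also matches URLs that merely mention domain2.com, for example http://notdomain2.com/ or a query string that contains it. A minimal sketch of a stricter check follows, assuming an environment that provides the standard URL constructor (current browsers, Node.js). The names hostMatches and patchHref are hypothetical helpers, not part of the posted script.

```javascript
// Sketch only: decide by hostname instead of by substring.
// hostMatches and patchHref are illustrative names (assumptions).
function hostMatches(hostname, domain) {
  hostname = hostname.toLowerCase();
  domain = domain.toLowerCase();
  // Exact match, or a subdomain such as "subdomain.domain2.com".
  return hostname === domain || hostname.endsWith("." + domain);
}

function patchHref(href, domain, redirectBase) {
  var url;
  try {
    url = new URL(href);
  } catch (e) {
    return href; // not an absolute URL: leave it untouched
  }
  // Same prefixing as in the posted script, but only for matching hosts.
  return hostMatches(url.hostname, domain) ? redirectBase + href : href;
}
```

Inside the loop of the posted script this could be used as lnk.href = patchHref(lnk.href, 'domain2.com', 'http://redirectUrl.com/');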

VK wrote:
Adnan Siddiqi wrote:
Now, I want to search for the URL pattern within the href only and check
whether it contains a particular domain, for instance "domain2.com". If it
does, the URL should be replaced with the following one:

"http://redirectUrl.com/http://subdomain.domain2.com/myfile.anyext"




<script type="text/javascript">
function patchLinks() {
  var len = document.links.length;
  var lnk = null;
  for (var i = 0; i < len; i++) {
    lnk = document.links[i];
    if (lnk.href.indexOf('domain2.com') != -1) {
      lnk.href = 'http://redirectUrl.com/' + lnk.href;
    }
  }
}

window.onload = patchLinks;
</script>





Please /think/ before you code.
PointedEars
--
Homer: I have changed the world. Now I know how it feels to be God!
Marge: Do you want turkey sausage or ham?
Homer: Thou shalt send me *two*, one of each kind.
(Santa's Little Helper [dog] and Snowball [cat] run away :))


Adnan Siddiqi wrote:
Suppose I have the following URLs coming from an HTML document:

<a href="http://mydomain1.com">Domain1</a>
<a
href="http://subdomain.domain.com/myfile.anyext">http://subdomain.domain.com/myfile.anyext</a>
<a href="http://subdomain.domain2.com/myfile.anyext">Domain2</a>

Now, I want to search for the URL pattern within the href only and check
whether it contains a particular domain, for instance "domain2.com". If it
does, the URL should be replaced with the following one:

"http://redirectUrl.com/http://subdomain.domain2.com/myfile.anyext"

This is not a valid URL/URI. See RFC 3986 and below.

Can anyone shed light on this?
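
Regarding the RFC 3986 remark: one common way to avoid any ambiguity about a URL nested inside another URL is to escape the target and pass it as a query value. The following is only a sketch under assumptions: the parameter name "url" and the helper name buildRedirectUrl are made up for illustration, and the redirect script on the server would have to expect exactly that parameter.

```javascript
// Sketch only: pass the original link as an escaped query value.
// The parameter name "url" is an assumption; use whatever the
// redirect script on the server actually expects.
function buildRedirectUrl(redirectBase, target) {
  // encodeURIComponent escapes ":" and "/" so that the target cannot
  // be confused with the structure of the outer URL.
  return redirectBase + "?url=" + encodeURIComponent(target);
}

var example = buildRedirectUrl(
  "http://redirectUrl.com/",
  "http://subdomain.domain2.com/myfile.anyext");
// example is now:
// "http://redirectUrl.com/?url=http%3A%2F%2Fsubdomain.domain2.com%2Fmyfile.anyext"
```

On the server side the value can then be recovered with the matching decoding function of the language in use.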





First of all, you want to do this server-side, not client-side.

However, the language used there may be an ECMAScript implementation
as well. The only difference from the solution presented here is that
you will need to determine what a link is in a different way, and that
you have to parse the source code instead (unless you can make use of
an existing markup parser implementation).
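
To illustrate the server-side variant, here is a rough sketch that rewrites href values in an HTML string with a regular expression. It is not the code from this post: patchHtml is a hypothetical name, the domains are passed as an array (an assumption), and the redirect base is assumed to end in a query parameter such as "?url=". A regular expression is not a full HTML parser, so this only handles double-quoted href attributes with absolute http, https or ftp URLs.

```javascript
// Sketch only: rewrite matching href values in an HTML source string.
function patchHtml(html, domains, redirectBase) {
  // Escape regex metacharacters in each domain, then join as alternatives.
  var alternatives = domains
    .map(function (d) { return d.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"); })
    .join("|");

  // Group 1: the attribute opening, group 2: the URL, group 3: closing quote.
  // The host must end with one of the domains (optionally with subdomains).
  var rx = new RegExp(
    '(\\shref\\s*=\\s*")' +
    '((?:https?|ftp):\\/\\/(?:[^"\\/.]+\\.)*(?:' + alternatives + ')' +
    '(?=[\\/":?#])[^"]*)' +
    '(")',
    "gi");

  return html.replace(rx, function (match, open, url, close) {
    return open + redirectBase + encodeURIComponent(url) + close;
  });
}

var sample =
  '<a href="http://mydomain1.com">Domain1</a>\n' +
  '<a\nhref="http://subdomain.domain.com/myfile.anyext">x</a>\n' +
  '<a href="http://subdomain.domain2.com/myfile.anyext">Domain2</a>';

var patched = patchHtml(sample, ["domain2.com"], "http://redirectUrl.com/?url=");
```

Only the third link in the sample is rewritten; the first two do not have a host ending in domain2.com and are left as they are.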

Second, use Regular Expressions.

....
<html>
<head>
...
<meta http-equiv="Content-Script-Type" content="text/javascript">
<script type="text/javascript">
var _global = this;

/**
* Patches links referring to specific domains so that their target
* URL is appended to another URL.
*
* @param sDomains: string
* Links with domains to be redirected, delimited by <tt>|</tt>.
* @param sRedirectBase: string
* Base URI (prefix) for the redirection.
*/
function patchLinks(sDomains, sRedirectBase)
{
/**
* Tries hard to escape a string according to the query component
* specification in RFC3986.
*
* @partof
* http://pointedears.de/scripts/string.js
* @param s: string
* @return type string
* <code>s</code> escaped, or unescaped if escaping through
* <code>encodeURIComponent()</code> or <code>escape()</code>
* is not possible.
*/
function esc(s)
{
/**
* @author
* (C) 2003-2006 Thomas Lahn <ty******@PointedEars.de>
* Distributed under the GNU GPL v2.
* @partof
* http://pointedears.de/scripts/types.js
* @argument s
* String to be determined a method type, i.e. "object" for
* IE DOM methods, "function" otherwise. The type must have
* been retrieved with the `typeof' operator.
*
* Note that in contrast to @link{#isMethod()}, this
* method may also return <code>true</code> if the value of
* the <code>typeof</code> operand is <code>null</code>; to be
* sure that the operand is a method reference, you have to
* && (AND)-combine the <code>isMethodType(...)</code>
* expression with the method reference identifier.
*
* Use this method instead of <code>isMethod()</code> if
* you want to avoid warnings in case the property to be
* tested is not defined, or errors in case the property
* cannot be read.
* @return
* <code>true</code> if <code>s</code> is a method type,
* <code>false</code> otherwise.
* @type boolean
* @see #isMethod()
*/
function isMethodType(s)
{
return /\s*(function|object)\s*/.test(s);
}

return (isMethodType(typeof encodeURIComponent)
&& encodeURIComponent
? encodeURIComponent(s)
: (isMethodType(typeof escape) && escape
? escape(s)
: s));
}

for (var links = document.links, i = links && links.length; i--;)
{
var
link = links[i],
rx = new RegExp(
"^(ht|f)tps?:\\/\\/([^.]+\\.)*("
+ sDomains.replace(/\./g, "\\.")
+ ")(\\/|

