Using a regular expression to match up to but not including HTML start/end tags

Question

I need to create a regular expression that will match a 5-digit number, a
space, and then anything up to but not including the next closing HTML tag.
Here is an example:

<startTag>55555 any text</aClosingTag>

I need a regex that will get all of the text between the HTML tags above
(the HTML tags are random and I do not know them beforehand). The match
string always starts with at least 5 digits.

Recommended answer

On Oct 11, 6:42 am, "Andy B" <a_bo...@sbcglobal.net> wrote:

I need to create a regular expression that will match a 5 digit number, a
space and then anything up to but not including the next closing html tag..
Here is an example:

<startTag>55555 any text</aClosingTag>

I need a Regex that will get all of the text between the html tags above
(the html tags are random and i do not know them before hand). The match
string always starts with at least 5 digits.



Hi Andy,
There's a nice function at this link that retrieves the text between
the tags:
http://www.4guysfromrolla.com/demos/StripHTML1.asp

The page also lets you test the function online. When you enter the line
<startTag>55555 any text</aClosingTag> in the textbox on the page, it
returns "55555 any text", which is presumably what you want.

You can then use the same function in your own project.

Hope this helps,

Onur Güzel


On Fri, 10 Oct 2008 23:42:10 -0400, "Andy B" <a_*****@sbcglobal.net>
wrote:

>I need to create a regular expression that will match a 5 digit number, a
space and then anything up to but not including the next closing html tag.
Here is an example:

<startTag>55555 any text</aClosingTag>



If you are just interested in a match, try this:

<(\w+)>\d{5} .*</\1>

Note the space above (copy as is).
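
As a rough illustration of how this pattern behaves, here is a minimal Python sketch. Python is used only for the demonstration, and the pattern itself is unchanged. The helper name `has_match` and the `<td>` sample are hypothetical. Because of the `\1` backreference, the closing tag must carry the same name as the opening tag, so the sample line from the question, whose two tag names differ, does not match:

```python
import re

# Pattern from the post above: same-named opening and closing tags around
# five digits, a space, and any trailing text.
PATTERN = re.compile(r"<(\w+)>\d{5} .*</\1>")


def has_match(line: str) -> bool:
    """Return True if the line contains a match for PATTERN."""
    return PATTERN.search(line) is not None


print(has_match("<td>55555 any text</td>"))                 # True
print(has_match("<startTag>55555 any text</aClosingTag>"))  # False
```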


>I need a Regex that will get all of the text between the html tags above
(the html tags are random and i do not know them before hand). The match
string always starts with at least 5 digits.



The above seems to imply you wish to capture the text that has matched
the expression. If this is the case, try this:

<(\w+)>(\d{5} .*)</\1>

Group two will contain the text you are after.
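
A minimal sketch of reading that capture group, in Python for illustration only. `STRICT` is the pattern from this post. `LOOSE` is not from the thread: it is an assumed variant that drops the backreference so differently named tags are accepted, uses `\d{5,}` for "at least 5 digits", and uses `[^<]*` so the capture stops at the next `<`. The function names are hypothetical.

```python
import re
from typing import Optional

# Pattern from the post: group 2 holds the digits and the text.
STRICT = re.compile(r"<(\w+)>(\d{5} .*)</\1>")

# Assumed variant (not from the thread): any tag names, at least five
# digits, and the text may not contain "<".
LOOSE = re.compile(r"<\w+>(\d{5,} [^<]*)</\w+>")


def extract_strict(line: str) -> Optional[str]:
    """Return group 2 of STRICT, or None when there is no match."""
    m = STRICT.search(line)
    return m.group(2) if m else None


def extract_loose(line: str) -> Optional[str]:
    """Return group 1 of LOOSE, or None when there is no match."""
    m = LOOSE.search(line)
    return m.group(1) if m else None


print(extract_strict("<td>55555 any text</td>"))                # 55555 any text
print(extract_loose("<startTag>55555 any text</aClosingTag>"))  # 55555 any text
```

Note that `.*` is greedy, so on a line with several same-named tags the strict pattern captures up to the last matching closing tag. `[^<]*` avoids that, at the cost of rejecting text that contains a literal `<`.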


On Oct 11, 1:22 pm, kimiraikkonen <kimiraikkone...@gmail.com> wrote:

On Oct 11, 6:42 am, "Andy B" <a_bo...@sbcglobal.net> wrote:

I need to create a regular expression that will match a 5 digit number,a
space and then anything up to but not including the next closing html tag.
Here is an example:


<startTag>55555 any text</aClosingTag>


I need a Regex that will get all of the text between the html tags above
(the html tags are random and i do not know them before hand). The match
string always starts with at least 5 digits.



Hi Andy,
There's a nice function at this link that retrieves the text between
the tags: http://www.4guysfromrolla.com/demos/StripHTML1.asp

The page also lets you test the function online. When you enter the line
<startTag>55555 any text</aClosingTag> in the textbox on the page, it
returns "55555 any text", which is presumably what you want.

You can then use the same function in your own project.

Hope this helps,

Onur Güzel



Andy,
I revised the code a bit. Here it is, for getting the text between
HTML tags:

In the sample, strToSearch is the line from your post:

'-----------------------------------------
' Requires "Imports System.Text.RegularExpressions" at the top of the file.

' Your HTML line, including its tags
Dim strToSearch As String = "<startTag>55555 any text</aClosingTag>"

' Initialize the Regex with a pattern that matches any HTML tag
Dim objRegExp As New Regex("<(.|\n)+?>")

' Replace all HTML tag matches with the empty string
Dim strOutput As String = objRegExp.Replace(strToSearch, "")

' Escape any remaining < and > as &lt; and &gt;
strOutput = Replace(strOutput, "<", "&lt;")
strOutput = Replace(strOutput, ">", "&gt;")

' Show the result in a MsgBox
' Displays "55555 any text"
MsgBox(strOutput)

objRegExp = Nothing
'----------------------------------------------

Hope it's better,

Onur Güzel
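
For comparison, the tag-stripping idea in the VB.NET code above can be sketched in Python with the same pattern. This is only an illustration, and the function name `strip_tags` is hypothetical. Note that this approach removes every tag it finds and does not check that the remaining text starts with five digits:

```python
import re

# Same pattern as the VB.NET sample: "<", then as few characters as
# possible (newlines included), then ">".
TAG = re.compile(r"<(.|\n)+?>")


def strip_tags(html_line: str) -> str:
    """Remove all tags, then escape any leftover angle brackets."""
    text = TAG.sub("", html_line)
    return text.replace("<", "&lt;").replace(">", "&gt;")


print(strip_tags("<startTag>55555 any text</aClosingTag>"))  # 55555 any text
```

If the five-digit prefix matters, the stripped result can be checked afterwards, for example with `re.match(r"\d{5,} ", text)`.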

