How to check that all html tags are closed with Regex
Problem description
I mean that each < should have an appropriate >. A string without any < or > should be valid too.
Any idea?
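Read literally, the requirement is only about pairing angle brackets, not about matching tag names. A minimal sketch of that literal reading is shown here; the function name hasBalancedAngleBrackets is hypothetical, and the pattern assumes that a > inside an attribute value or a comment never needs special treatment.

```javascript
// Hypothetical helper: true when every "<" is closed by a ">" before the
// next "<" appears, and no ">" occurs outside such a pair.
function hasBalancedAngleBrackets(text) {
  // Plain text, then any number of "<...>" groups, each followed by plain text.
  return /^[^<>]*(?:<[^<>]*>[^<>]*)*$/.test(text);
}

console.log(hasBalancedAngleBrackets("no brackets at all"));
console.log(hasBalancedAngleBrackets("<b>bold</b>"));
console.log(hasBalancedAngleBrackets("<b>bold</b"));
```

This says nothing about whether <b> is ever closed by </b>; it only checks the brackets themselves.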
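Solution
I once wrote a JavaScript BB-code parser that also had to deal with incorrectly closed tags. The same concept applies to HTML (and to any other markup language that relies on a tree).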
- Define variables:
  var string = ""; var lastIndex = 0; var stack = []; var parsedString = ""; // and some more
- Loop through the string until a < is matched, using string.indexOf("<", lastIndex).
- Select the tag name, and search for the closing > (using an RE: /[^<]+?/). Set lastIndex to the index of this >, plus 1.
- Add this value (tagName) to an array (the stack defined above: var stack = [];).
- If a closing tag is encountered, walk through the stack, from the last element backwards.
  - If the start tag is the last element of stack, use stack.pop() and continue with the loop step.
  - If the start tag isn't the last element of the array:
    - If your closing tag is important, persist until you find the opening tag (</div> should close any <div>, even if you have to throw away 9001 <span> declarations).
    - While you walk through the array, check the status of the tags you encounter: are these "important" elements? (<strong> is less important than <div>, for example.)
    - If you encounter an important tag (such as <div>) while your closing tag was a </em>, ignore the closing tag and go back to the loop step.
。 ,忽略结束标记并返回1。 <1>当1的计算结果为
false
(no <找到)时,将剩余的字符串添加到结果 resultString + = string.substring(lastIndex,string.length);
。
按照以下步骤进行解析一个字符串。
I mean that each <
should have an appropriate >
. A string without any <
or >
should be valid too.
Any idea?
解决方案 Once, I've created a JavaScript BB-code parser, which also dealt with incorrectly closed tags. The same concept also applies to HTML (and any other markup language which rely on a tree).
When the loop step finds no further < (indexOf returns -1), add the remaining string to the result: parsedString += string.substring(lastIndex, string.length);
After following these steps, you've parsed the string.
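The steps above can be sketched roughly as follows. This is an illustration under several assumptions, not the original parser: the name checkTags and the IMPORTANT set are hypothetical, the closing > is located with indexOf instead of the regular expression mentioned in the steps, and the function reports problems instead of building a repaired parsedString. Void elements such as <br>, self-closing syntax, and > characters inside attribute values are not special-cased.

```javascript
// Hypothetical set of "important" tags; the answer only names <div> as an example.
const IMPORTANT = new Set(["div"]);

// Returns a list of problem descriptions; an empty list means all tags were closed.
function checkTags(string) {
  const stack = [];    // names of the tags that are currently open
  const problems = [];
  let lastIndex = 0;

  while (true) {
    const open = string.indexOf("<", lastIndex);
    if (open === -1) break;               // no further "<": scanning is done
    const close = string.indexOf(">", open);
    if (close === -1) {
      problems.push("unterminated tag at index " + open);
      break;
    }
    lastIndex = close + 1;

    // Optional "/", then the tag name; attributes are ignored.
    const match = /^<(\/?)([A-Za-z][A-Za-z0-9]*)/.exec(string.slice(open, close + 1));
    if (!match) continue;                 // comments, doctypes and the like are skipped
    const isClosing = match[1] === "/";
    const tagName = match[2].toLowerCase();

    if (!isClosing) {
      stack.push(tagName);
      continue;
    }

    // Closing tag: walk the stack from the last element backwards.
    let i = stack.length - 1;
    while (i >= 0 && stack[i] !== tagName) {
      // An unimportant closing tag may not cross an important open tag.
      if (IMPORTANT.has(stack[i]) && !IMPORTANT.has(tagName)) {
        i = -1;
        break;
      }
      i--;
    }
    if (i < 0) {
      problems.push("ignored </" + tagName + ">");
      continue;
    }
    // Tags opened after the match were never closed and are thrown away.
    for (const dropped of stack.splice(i + 1)) {
      problems.push("unclosed <" + dropped + ">");
    }
    stack.pop();                          // remove the matched opening tag itself
  }

  for (const left of stack) problems.push("unclosed <" + left + ">");
  return problems;
}

console.log(checkTags("<div><span>text</span></div>"));
console.log(checkTags("<em><div>x</em></div>"));
```

One choice here goes beyond what the steps spell out: an unimportant closing tag that crosses only unimportant open tags (for example </em> after <em><strong>) closes its match and reports the skipped tags as unclosed.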