HTML Tidy fails on script tag in JavaScript string literal
Problem description
I'm using HTML Tidy in PHP and it's producing unexpected results because of a <script> tag in a JavaScript string literal. Here's a sample input:

<html>
<script>
var t='<script><'+'/script>';
</script>
</html>

HTML Tidy's output:

<html>
<script>
//<![CDATA[
var t='<script><'+'/script>';
<\/script>
<\/html>
//]]>
</script>
</html>

It's interpreting </script></html> as part of the script. Then it adds another </script></html> to close the open tags. I tried this on an online version of HTML Tidy (http://www.dirtymarkup.com/) and it produces the same error.

How do I prevent this error from occurring in PHP?

After playing around with it a bit, I discovered that one can use the comment //'<\/script>' to confuse the algorithm in a way that prevents this bug from occurring:

<html>
<script>
var t='<script><'+'/script>'; //'<\/script>'
</script>
</html>

After clean-up:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2//EN">
<html>
<head>
<script>
var t='<script><'+'/script>'; //'<\/script>'
</script>
<title></title>
</head>
<body>
</body>
</html>

My guess is that when the clean-up algorithm scans the code and detects the string <script> twice, it immediately looks for a matching </script> for each one. Separating < from /script> makes the second </script> go undetected, which is why it decided to add another </script> at the end of the code and somehow also closed it with another </html>. (Poor design indeed!)

So I made a second assumption: that there isn't an if-statement in the algorithm to determine whether a </script> is in a comment, and I was right! Having another string <\/script> in a JavaScript comment does make the algorithm think that there are two </script> in total.
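As a side note on the string literal itself: in JavaScript, '<' + '/script>' and '<\/script>' evaluate to the same string, because \/ is just an escaped /. The sketch below only demonstrates that equivalence and shows one possible way to rewrite a snippet before embedding it. The helper name escapeScriptClose is hypothetical (it is not part of HTML Tidy or PHP), and the source does not confirm how HTML Tidy treats the escaped form on its own, so treat this as an illustration rather than a verified fix.

```javascript
// Sketch only. escapeScriptClose is a made-up helper for illustration;
// it is not an HTML Tidy or PHP API.

// The two spellings from the post produce the same runtime string.
const split = '<script><' + '/script>';
const escaped = '<script><\/script>';

// Naively rewrite every "</script" in a piece of JavaScript source
// as "<\/script" so the literal closing tag never appears in the markup.
// Assumption: the input only contains "</script" inside string literals.
function escapeScriptClose(js) {
  return js.replace(/<\/(script)/gi, '<\\/$1');
}

const original = "var t='<script></script>';";
const safe = escapeScriptClose(original);

console.log(split === escaped); // true
console.log(safe);
```

The replacement is purely textual, so it does not understand JavaScript syntax. It is meant for short inline snippets like the one in the question.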