Apply wordwrap to HTML content, excluding HTML attributes
Question
I'm not used to regular expressions, so this may be easy for others but is tricky for me.
Basically, I'm applying wordwrap() to content that contains classic HTML tags: , ...
$text = wordwrap($text, $cutLength, " ", $wordCut);
$text = nl2br(bbcode_parser($text));
return $text;
As you can see, my problem is pretty simple: all I want is to apply wordwrap() to my content while leaving alone whatever is inside HTML attributes such as href and src.
Could someone help me out? Thanks a lot!
Recommended answer
You shouldn't use regex for HTML parsing, of course, but this should separate out the content if you want to go that way. I have limited knowledge of PHP, so this just illustrates the procedure.
$tags =
' <
(?:
/?\w+\s*/?
| \w+\s+ (?:".*?"|\'.*?\'|[^>]*?)+\s*/?
| !(?:DOCTYPE.*?|--.*?--)
)>
';
$scripts =
' <
(?:
(?:script|style) \s*
| (?:script|style) \s+ (?:".*?"|\'.*?\'|[^>]*?)+\s*
)>
.*?
</(?:script|style)\s*>
';
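// PHP has no /g flag (preg_match_all and preg_replace_callback already match globally),
// and "~" is used as the delimiter because the patterns themselves contain "/".
$regex = "~ ($scripts | $tags) | ((?:(?!$tags).)+) ~xs";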
The replacement string is group 1 concatenated with the return value of your word-wrap function, which is passed the content (the group 2 string). So something like: replacement = \1 . textwrap( \2 )
Group 1 holds a tag or a whole script/style block and is copied through unchanged; group 2 holds the text between tags. Inside textwrap you decide what to do with that content.
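The procedure described above could be sketched in PHP roughly as follows. This is a minimal sketch, not a tested production routine: the function name wrap_outside_tags and its parameters are hypothetical, the two patterns are the ones from this answer collapsed onto single lines, and preg_replace_callback() stands in for the "replacement = \1 . textwrap( \2 )" step, with PHP's own wordwrap() playing the role of textwrap.

```php
<?php
// Hypothetical helper: wraps only the text between tags, never the tags themselves.
function wrap_outside_tags(string $html, int $width = 75, string $break = "\n", bool $cut = false): string
{
    // Patterns from the answer, written for x-mode (whitespace is ignored).
    $tags = '<(?: /?\w+\s*/? | \w+\s+ (?:".*?"|\'.*?\'|[^>]*?)+\s*/? | !(?:DOCTYPE.*?|--.*?--) )>';
    $scripts = '<(?: (?:script|style)\s* | (?:script|style)\s+ (?:".*?"|\'.*?\'|[^>]*?)+\s* )> .*? </(?:script|style)\s*>';

    // Group 1: a tag or a script/style block. Group 2: a run of plain content.
    $regex = "~ ($scripts | $tags) | ((?:(?!$tags).)+) ~xs";

    $result = preg_replace_callback(
        $regex,
        function (array $m) use ($width, $break, $cut): string {
            if ($m[1] !== '') {
                return $m[1];                              // markup: pass through untouched
            }
            return wordwrap($m[2], $width, $break, $cut);  // content: wrap it
        },
        $html
    );

    // preg_replace_callback() returns null on a PCRE error (for example when a
    // backtracking limit is hit); fall back to the unmodified input in that case.
    return $result ?? $html;
}

// Usage: the long URL in href is left intact, only the link text is wrapped.
echo wrap_outside_tags(
    '<a href="http://example.com/a/very/long/path">one two three four five</a>',
    10,
    "\n",
    true
), "\n";
```

Because each run of content is wrapped separately, line lengths restart after every tag. The pattern can also backtrack heavily on malformed markup, which is why the null result is handled explicitly.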
The patterns were tested in Perl with the script below (by the way, it's very slow and watered down for clarity):
use strict;
use warnings;
my $tags =
' <
(?:
/?\w+\s*/?
| \w+\s+ (?:".*?"|\'.*?\'|[^>]*?)+\s*/?
| !(?:DOCTYPE.*?|--.*?--)
)>
';
my $scripts =
' <
(?:
(?:script|style) \s*
| (?:script|style) \s+ (?:".*?"|\'.*?\'|[^>]*?)+\s*
)>
.*?
</(?:script|style)\s*>
';
# Reads the HTML to scan from a __DATA__ section placed at the end of this script.
my $html = join '', <DATA>;
while ( $html =~ / ($scripts | $tags) | ((?:(?!$tags).)+) /xsg ) {
if (defined $2 && $2 !~ /^\s+$/) {
print $2,"\n";
}
}