Close tags from a truncated HTML string
Question
I have inherited a site with a news section that displays a summary of each news article. For whatever reason, the creators decided that displaying the first X characters of the article would be fine. Of course, this very quickly led to summaries like:
<p>What a mighty fine <a href="blah">da
<p>What a mighty fine and warm <a href="htt
<p>His name was "Emil&qu
This quite obviously screws up the page, especially when the opening tags themselves aren't even closed.
What I'm after is a way to close all open tags within the truncated string. I really, really don't want to use regex to do it. I'm sure there's a nice parser that can do it easily; I just can't seem to find it right now.
Recommended answer
The best thing is probably to find a better algorithm for generating the excerpt, for example by running strip_tags before the truncation.
Otherwise, how will you handle errors that are hard to detect programmatically, such as
<p>What a mighty fine and warm <a href="htt
or
<p>His name was "Emil&qu
?
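To make the suggested order of operations concrete, here is a minimal sketch. strip_tags is a PHP function, so the site is presumably PHP; the sketch uses Python's standard library only to illustrate the idea, and the names make_excerpt, _TextExtractor, and max_chars are hypothetical, not part of the original site. It assumes the full article HTML is still available: the tags are dropped and entities decoded first, the plain text is truncated second, and the result is re-escaped last, so the cut can never land inside a tag or an entity.

```python
import html
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes only and discards the tags around them."""

    def __init__(self):
        # convert_charrefs=True decodes entities such as &quot; before
        # the text reaches handle_data.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def make_excerpt(article_html, max_chars=40):
    """Return a plain-text, HTML-escaped excerpt of at most max_chars characters
    (plus a trailing "..." when the text was cut)."""
    parser = _TextExtractor()
    parser.feed(article_html)
    parser.close()
    # Join the text nodes and collapse runs of whitespace.
    text = " ".join("".join(parser.parts).split())
    if len(text) > max_chars:
        text = text[:max_chars].rstrip() + "..."
    # Escape last, so the truncation never splits an entity.
    return html.escape(text)


article = ('<p>His name was &quot;Emil&quot; and he liked '
           '<a href="http://example.com">days</a> like this.</p>')
print(make_excerpt(article, 20))  # prints: His name was &quot;Emil&quot;...
```

The trade-off is that the excerpt loses its links and formatting. If the markup must be kept, the truncation has to count only text characters while tracking open tags, which is considerably more work than this sketch.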