Remove HTML Entity if Incomplete

Question

I have an issue where I display up to 400 characters of a string pulled from the database; however, this string is required to contain HTML entities.

By chance, the client has created a string whose 400th character sits right in the middle of a closing P tag, which breaks the tag and causes errors in the code after it.

I would prefer this closing P tag to be removed entirely, as I have a "...read more" link attached to the end which would look cleaner if attached to the existing paragraph.

What would be the best approach to cover all HTML entity issues? Is there a PHP function that will automatically close off or remove any erroneous HTML tags? I don't need a coded answer; just a direction will help greatly.

Thanks.

Recommended answer

Here's a simple way you can do it with DOMDocument; it's not perfect, but it may be of interest:

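<?php
// Let DOMDocument re-parse the fragment so that unclosed or cut-off tags are repaired.
function html_tidy($src){
    // Collect libxml parse warnings internally instead of emitting them.
    libxml_use_internal_errors(true);
    $x = new DOMDocument;
    // The prepended meta tag makes loadHTML() read the input as UTF-8.
    $x->loadHTML('<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />'.$src);
    $x->formatOutput = true;
    // Strip the doctype and the html/head/body wrappers that saveHTML() adds.
    $ret = preg_replace('~<(?:!DOCTYPE|/?(?:html|body|head))[^>]*>\s*~i', '', $x->saveHTML());
    // Remove the helper meta tag again and trim surrounding whitespace.
    return trim(str_replace('<meta http-equiv="Content-Type" content="text/html;charset=utf-8">','',$ret));
}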

$brokenHTML[] = "<p><span>This is some broken html</spa";
$brokenHTML[] = "<poken html</spa";
$brokenHTML[] = "<p><span>This is some broken html</spa</p>";

/*
<p><span>This is some broken html</span></p>
<poken html></poken>
<p><span>This is some broken html</span></p>
*/
foreach($brokenHTML as $test){
    echo html_tidy($test);
}

?> 

Though do take note of Mike 'Pomax' Kamermans's comment.
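
The answer above repairs markup but does not show the 400-character cut itself. As a rough sketch of that step, the hypothetical helper below (truncate_html is not part of the original answer, and the 400 default simply mirrors the question) cuts the string by characters and then drops a tag or entity that the cut left unfinished. It assumes the mbstring extension is available and uses plain regular expressions, so it will not handle every edge case, such as a ">" inside a quoted attribute value.

```php
<?php
// Hypothetical helper, not from the original answer: cut the string to a
// character limit, then drop anything the cut left half-written at the end.
function truncate_html(string $html, int $limit = 400): string
{
    // mb_substr() counts characters rather than bytes (needs ext-mbstring).
    $cut = mb_substr($html, 0, $limit, 'UTF-8');

    // Remove a trailing "<..." that never reached its closing ">".
    $cut = preg_replace('~<[^<>]*$~', '', $cut) ?? $cut;

    // Remove a trailing entity that was cut before its ";", e.g. "&amp".
    // Caveat: a bare "&" followed by letters at the very end is stripped too.
    $cut = preg_replace('~&#?[a-z0-9]*$~i', '', $cut) ?? $cut;

    return $cut;
}

// The cut lands inside the closing P tag, so the dangling "</" is dropped.
echo truncate_html('<p>Hello world</p>', 16), "\n";
```

Because the dangling closing tag is removed rather than completed, a "...read more" link can be appended to the result before passing the whole string through html_tidy() from the answer, which should then close the paragraph after the link.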
