Remove MS Word "HTML" using PHP
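Question
Possible duplicates:
What is the best free way to clean up Word HTML?
PHP to clean-up pasted Microsoft input
I allow clients to enter notes in a rich text editor, and I only recently upgraded to CKEditor 3.x, which by default strips MS Word classes, styles, and comments when users paste into the editor object. So going forward I'm all set.
I recently needed to clean up five years' worth of notes, some of which have MS Word-generated HTML embedded. I need to loop through this body of text and clean it.
I do not need to strip out all span tags, only those identified as written by Microsoft.
I've tried using HTMLCleaner, but it does not remove the MS-generated HTML. http://word2cleanhtml.com does exactly what I want; however, the developers are not currently offering the API for public use (as of July 9, 2012).
I've looked for such a class off and on for the last few weeks without much luck. Has anyone found a useful class they'd like to share?
http://htmlpurifier.org/
This will do what you want.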
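For a rough sense of what "only the tags written by Microsoft" can mean in practice, here is a minimal hand-rolled sketch that uses only PHP's built-in PCRE functions. It is not how HTML Purifier works internally. The function name `cleanWordHtml` is hypothetical, and the patterns are assumptions about typical Word output (`Mso*` classes, `mso-*` style declarations, `<o:p>`-style namespaced tags, conditional comments), not an exhaustive list.

```php
<?php
// Hypothetical helper: strips common MS Word paste artifacts from an HTML fragment.
// The patterns below are assumptions about typical Word markup, not a complete list.
function cleanWordHtml(string $html): string
{
    // 1. Drop HTML comments, including conditional ones like <!--[if gte mso 9]>...<![endif]-->.
    $html = preg_replace('/<!--.*?-->/s', '', $html);

    // 2. Drop downlevel-revealed markers such as <![if !supportLists]> and <![endif]>.
    $html = preg_replace('/<!\[(?:if[^\]]*|endif)\]>/i', '', $html);

    // 3. Drop Office-namespaced tags (<o:p>, <w:...>, <v:...>, <m:...>, <st1:...>); inner text is kept.
    $html = preg_replace('/<\/?(?:o|w|v|m|st1):[^>]*>/i', '', $html);

    // 4. Drop class attributes whose single value starts with "Mso" (MsoNormal, ...).
    $html = preg_replace('/\sclass=(["\']?)Mso[^"\'\s>]*\1/i', '', $html);

    // 5. Remove mso-* declarations from style attributes; drop the attribute if nothing is left.
    //    Not entity-aware: a value containing "&quot;" may be cut at the entity's semicolon.
    $html = preg_replace_callback(
        '/\sstyle=(["\'])(.*?)\1/is',
        function (array $m): string {
            $style = preg_replace('/(?<![\w-])mso-[\w-]+\s*:[^;]*;?\s*/i', '', $m[2]);
            $style = trim($style, " \t\r\n;");
            return $style === '' ? '' : ' style=' . $m[1] . $style . $m[1];
        },
        $html
    );

    // 6. Unwrap attribute-less <span> elements that contain no nested span,
    //    repeating until nothing changes. Spans that still carry attributes are kept.
    do {
        $html = preg_replace(
            '/<span\s*>((?:(?!<\/?span\b).)*)<\/span\s*>/is',
            '$1',
            $html,
            -1,
            $count
        );
    } while ($count > 0);

    return $html;
}

// Example call on a small Word-style fragment.
$dirty = '<p class="MsoNormal" style="margin-left:10px;mso-pagination:none">'
       . '<span style="mso-fareast-language:EN-US">Hello</span><o:p></o:p></p>';
echo cleanWordHtml($dirty), "\n"; // prints <p style="margin-left:10px">Hello</p>
```

To clean stored notes, a function like this could be called on each note's HTML inside whatever loop reads and writes the records. Regular expressions are fragile on malformed markup, so a parser-based library such as the HTML Purifier suggested above is likely the more robust choice for a one-off bulk cleanup; running on a backup copy first is prudent either way.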