Is there a way to strip all unnecessary MS Word formatting from FCKeditor?

Question

I have installed FCKeditor, and when I paste from MS Word it adds a lot of unnecessary formatting. I want to keep certain things like bold, italics, bullets and so forth. I have searched the web, but the solutions I found strip everything away, even the formatting I wanted to keep, such as bold and italics. Is there a way to strip just the unnecessary Word formatting?

Recommended answer

Here's a solution I use to scrub incoming HTML from rich text editors. It's written in VB.NET and I don't have time to convert it to C#, but it's pretty straightforward:

 Imports System.Text.RegularExpressions

 ' Place both functions inside a class of your choice.
 Public Shared Function CleanHtml(ByVal html As String) As String
     ' Cleans all manner of evils from the rich text editors in IE, Firefox, Word, and Excel.
     ' Only returns acceptable HTML, and converts line breaks to <br />.
     ' Acceptable HTML includes HTML-encoded entities.
     html = html.Replace("&nbsp;", " ").Trim()
     ' Does this have HTML tags?
     If html.IndexOf("<") >= 0 Then
         ' Make all tags lowercase (this lowercases attribute values as well)
         html = Regex.Replace(html, "<[^>]+>", AddressOf LowerTag)
         ' Filter out anything except allowed tags
         ' http://stackoverflow.com/questions/307013/how-do-i-filter-all-html-tags-except-a-certain-whitelist
         Dim AcceptableTags As String = "i|b|u|sup|sub|ol|ul|li|br|h2|h3|h4|h5|span|div|p|a|img|blockquote"
         ' The [\s/>] check makes the lookahead match whole tag names only, so <body> is not mistaken for <b>.
         ' The colon in the tag-name class lets namespaced Word tags such as <o:p> be removed too.
         Dim WhiteListPattern As String = "</?(?(?=(?:" & AcceptableTags & ")[\s/>])notag|[a-zA-Z0-9:]+)(?:\s[a-zA-Z0-9\-]+=?(?:([""']?).*?\1?)?)*\s*/?>"
         html = Regex.Replace(html, WhiteListPattern, "", RegexOptions.Compiled)
         ' Strip attributes from the tags that remain
         ' Problem: this strips every attribute, including href from a
         html = Regex.Replace(html, "<(" & AcceptableTags & ")\s[^>]*?(/?)>", "<$1$2>", RegexOptions.Compiled)
         ' Make all BR/br tags look the same, and trim them of whitespace before/after
         html = Regex.Replace(html, "\s*<br[^>]*>\s*", "<br />", RegexOptions.Compiled)
     End If
     ' No CRs
     html = html.Replace(vbCr, "")
     ' Convert remaining LFs to line breaks
     html = html.Replace(vbLf, "<br />")
     ' Trim BRs at the end of the string, and spaces on either side
     Return Regex.Replace(html, "(<br />)+$", "", RegexOptions.Compiled).Trim()
 End Function

 ' Match evaluator used above: returns the matched tag in lowercase
 Public Shared Function LowerTag(ByVal m As Match) As String
     Return m.ToString().ToLower()
 End Function

In your case, you'll want to modify the list of approved HTML tags in AcceptableTags. The code will still strip all the useless attributes (and, unfortunately, the useful ones like href and src; hopefully those aren't important to you).

Of course, this requires a trip to the server. If you don't want that, you'll need to add some sort of "clean up" button to the toolbar that calls JavaScript to rewrite the editor's current text. Unfortunately, pasting is not an event that can be trapped to clean up the markup automatically, and cleaning after every OnChange would make the editor unusable (since changing the markup changes the text cursor position).
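As a rough illustration of what such a "clean up" routine could look like on the client, here is a minimal JavaScript sketch. It is not part of FCKeditor and not a port of the VB.NET code above: the function name cleanWordHtml and the ALLOWED_TAGS list are made up for this example, and the list is only a guess at what "bold, italics, bullets" would need. It works on a plain HTML string, so reading the content from the editor and writing it back is left to the editor's own API.

```javascript
// Hypothetical helper for illustration only; the names here are assumptions, not FCKeditor API.
// Tags to keep. Every other tag is removed, but the text inside it stays.
const ALLOWED_TAGS = new Set(["b", "strong", "i", "em", "u", "ul", "ol", "li", "p", "br"]);

function cleanWordHtml(html) {
  return html
    // Drop HTML comments, including Word's conditional comments
    .replace(/<!--[\s\S]*?-->/g, "")
    // Drop Word's <![if ...]> and <![endif]> markers
    .replace(/<!\[[^>]*>/g, "")
    // Drop <style>, <script> and <xml> blocks together with their content
    .replace(/<(style|script|xml)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, "")
    // Rewrite every remaining tag: allowed names are kept without attributes,
    // anything else (span, font, o:p, ...) is removed
    .replace(/<(\/?)([a-zA-Z][a-zA-Z0-9:]*)\b[^>]*>/g, (match, slash, name) => {
      const tag = name.toLowerCase();
      if (!ALLOWED_TAGS.has(tag)) return "";
      return tag === "br" ? "<br />" : `<${slash}${tag}>`;
    })
    // Replace non-breaking space entities with plain spaces
    .replace(/&nbsp;/g, " ")
    .trim();
}

// Example: Word-style markup reduced to the tags we want to keep
const pasted =
  '<p class="MsoNormal" style="margin:0in"><b style="mso-bidi-font-weight:normal">Bold</b>' +
  ' and <span style="font-family:Arial"><i>italic</i></span><o:p></o:p></p>';
console.log(cleanWordHtml(pasted)); // <p><b>Bold</b> and <i>italic</i></p>
```

Like the server-side version, this drops every attribute, so links and images would lose href and src. Being regex-based, it can also be confused by attribute values that contain a ">" character, so it should be treated as a convenience for tidying pasted text, not as a security filter; untrusted input still needs cleaning on the server.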
