How to convert HTML to BBCode in C#

Question

I need to convert HTML text into BBCode. Where can I find out how to do this? For example, this is how I convert links:

 // Matches simple anchors: group 1 is the href, group 2 is the link text (not used below).
 var regex = new Regex("<a href=\"(.+?)\">(.+?)</a>");
 htmlCode = regex.Replace(htmlCode, "[URL]$1[/URL]");



How can I convert all HTML tags to BBCode (and replace tags that have no BBCode equivalent, such as p, with an empty string)?

Recommended answer

For some HTML tags, a simple string.Replace is enough. BBCode is in many ways a 1:1, tag-for-tag mapping: for example, <b> and </b> map to [b] and [/b] respectively. That is easily accomplished with:

html = html.Replace("<b>", "[b]").Replace("</b>", "[/b]");
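
When several tags map 1:1 like this, the chained calls can be driven from a small table. The following is only a minimal sketch: it assumes lowercase tags with no attributes, and the class name `SimpleTagMapper` and its tag list are illustrative choices rather than part of the original answer.

```csharp
public static class SimpleTagMapper
{
    // Hypothetical list of tags that map 1:1 onto BBCode; extend as needed.
    private static readonly string[] Tags = { "b", "i", "u" };

    public static string ToBBCode(string html)
    {
        foreach (var tag in Tags)
        {
            // Strings are immutable, so the result of Replace must be reassigned.
            html = html.Replace("<" + tag + ">", "[" + tag + "]")
                       .Replace("</" + tag + ">", "[/" + tag + "]");
        }
        return html;
    }
}
```

For example, `SimpleTagMapper.ToBBCode("<b>bold</b> and <i>italic</i>")` returns `[b]bold[/b] and [i]italic[/i]`. Tags written in uppercase or carrying attributes are left untouched, which is one reason this approach only suits very simple input.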

If it's really dead-simple HTML, and you don't mind the performance cost and ugliness of doing this tag by tag, go for it. But beware of cross-site scripting vulnerabilities if you plan to display the resulting BBCode on a web page somewhere; this is nowhere near good enough for sanitization.

But don't even bother trying to use regular expressions to sanitize the HTML and automatically replace every tag. The <img> tag, for instance, looks completely different in HTML and in BBCode. In HTML it's <img src="..."/> (the trailing slash is optional), and in BBCode it's [IMG]...[/IMG]. Doing this with regex is... well, let's just say sub-optimal.

Regular expressions are designed for regular languages, and HTML is not a regular language; it's a context-free language. Consider using an actual HTML parser instead, such as the HTML Agility Pack. Then you can descend the DOM tree, whitelist the elements you want, and map them to BBCode or anything else you like.
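
The tree walk described above might look roughly like the following. This is a sketch under stated assumptions, not a definitive implementation: it assumes the HtmlAgilityPack NuGet package is referenced, and the class name `HtmlToBBCode`, the whitelist contents, and the choice to keep the text of non-whitelisted elements are all illustrative.

```csharp
using System.Collections.Generic;
using System.Text;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is installed

public static class HtmlToBBCode
{
    // Hypothetical whitelist of elements that map 1:1 onto a BBCode tag.
    private static readonly Dictionary<string, string> SimpleTags =
        new Dictionary<string, string>
        {
            { "b", "b" }, { "strong", "b" },
            { "i", "i" }, { "em", "i" },
            { "u", "u" }
        };

    public static string ToBBCode(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        var sb = new StringBuilder();
        Walk(doc.DocumentNode, sb);
        return sb.ToString();
    }

    private static void Walk(HtmlNode node, StringBuilder sb)
    {
        foreach (var child in node.ChildNodes)
        {
            if (child.NodeType == HtmlNodeType.Text)
            {
                // Decode entities such as &amp; into plain text.
                sb.Append(HtmlEntity.DeEntitize(child.InnerText));
            }
            else if (child.NodeType != HtmlNodeType.Element
                     || child.Name == "script" || child.Name == "style")
            {
                // Drop comments and script/style elements together with their content.
            }
            else if (SimpleTags.TryGetValue(child.Name, out var bb))
            {
                sb.Append('[').Append(bb).Append(']');
                Walk(child, sb);
                sb.Append("[/").Append(bb).Append(']');
            }
            else if (child.Name == "a")
            {
                // Note: the href is copied as-is; validate the scheme before trusting it.
                sb.Append("[url=").Append(child.GetAttributeValue("href", "")).Append(']');
                Walk(child, sb);
                sb.Append("[/url]");
            }
            else if (child.Name == "img")
            {
                sb.Append("[img]").Append(child.GetAttributeValue("src", "")).Append("[/img]");
            }
            else
            {
                // Not whitelisted (e.g. <p>, <div>): drop the tag but keep its text.
                Walk(child, sb);
            }
        }
    }
}
```

Calling `HtmlToBBCode.ToBBCode(htmlCode)` then handles `<a>` and `<img>` regardless of attribute order or extra attributes, which is where the regex approach tends to break. Because entities are decoded into plain text, the BBCode still has to be HTML-encoded again by whatever renders it, and URL schemes such as `javascript:` are not filtered in this sketch.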
