What markup language for richly formatted content?
Problem description
What are the advantages and disadvantages of the various markup languages:
- HTML
- Markdown
- BBCode (http://en.wikipedia.org/wiki/BBCode)
- Textile
- MediaWiki markup
- Others
- Easy to parse into multiple forms from one source: PDF, HTML, RTF
- Content is stored as readable plain text (usually much easier to read than raw HTML), so if it is needed at some later date it does not have to be extracted from HTML
- Follows specific, defined rules, whereas HTML can be annoyingly variable and unstructured
- Allows you to enforce a subset of content formatting, which is more appropriate in many cases than simply allowing full HTML
- In addition, enforcing a subset of formatting makes it easy to sanitize input and prevent cross-site scripting problems, etc.
- Keeping the "raw" data in an abstracted format means that at a later date, if you wanted, for instance, to convert your site from HTML 4 to XHTML, you would only need to change the parsing code. With HTML-formatted user input, you are stuck having to convert all the HTML to XHTML individually, which, as HTML Tidy shows, is not always a simple task. Similarly, if a new markup language comes along at some point, or you need to move to an alternative format (RTF, PDF, TeX), an abstracted, restricted subset of text formatting options makes that a much simpler task.
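The points in the list above about restricting formatting, sanitizing input, and rendering one source into several forms can be illustrated with a minimal sketch. It assumes a toy BBCode-like syntax with only `[b]` and `[i]`; the names `_TAGS`, `to_html`, and `to_plain_text` are hypothetical and do not come from any real library, and a production system would more likely use an established parser.

```python
import html
import re

# Hypothetical whitelist: the only formatting this toy markup allows,
# mapped to the HTML element each marker is rendered as.
_TAGS = {"b": "strong", "i": "em"}

# Matches any opening or closing marker from the whitelist, e.g. [b] or [/i].
_MARKER = re.compile(r"\[/?(?:" + "|".join(_TAGS) + r")\]")


def to_html(source: str) -> str:
    """Render the restricted markup as HTML."""
    # Escape everything first, so raw HTML typed by the user becomes inert text.
    out = html.escape(source)
    for tag, element in _TAGS.items():
        pattern = re.compile(rf"\[{tag}\](.*?)\[/{tag}\]", re.DOTALL)
        out = pattern.sub(rf"<{element}>\1</{element}>", out)
    return out


def to_plain_text(source: str) -> str:
    """Render the same stored source as plain text by dropping the markers."""
    return _MARKER.sub("", source)


stored = "[b]Hi[/b] <script>alert(1)</script>"
print(to_html(stored))
print(to_plain_text("[b]Hi[/b] [i]there[/i]"))
```

Because the stored text is never trusted as HTML, switching the output from `strong`/`em` to some other target only means changing the renderer, not every stored post.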
When you are developing a web-based application and you want to allow richly formatted text from the user you have to make a choice about how to allow that input. Many different markup languages have been created because it is arguably more difficult to sanitize HTML.
What are the advantages and disadvantages of the various markup languages, such as HTML, Markdown, BBCode, Textile, and MediaWiki markup?
Or, to put it differently, what factors do you consider when choosing to use a particular markup language?
Markdown, BBCode, Textile, and MediaWiki markup are all basically the same general concept, so I would really just lump these into two categories: HTML and plain text markup.
HTML
The deal with HTML is that the content is already in a "presentable" form for web content. That's great: it saves processing time, and it's a readily parseable language. There are dozens of libraries in pretty much any language to handle HTML content, convert between HTML and other formats, etc. The main downside is that, because of the loose standards of the early web days, HTML can be incredibly variable, and you can't always depend on sane input when accepting HTML from users. As pointed out, tidying or sanitizing HTML is often very difficult, especially because it fails to follow normal markup rules the way XML does (e.g. improperly closed tags are common).
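To make the point about improperly closed tags concrete, here is a rough sketch using Python's standard `html.parser` module, which accepts malformed input without complaint. The class `TagBalanceChecker`, the helper `find_problems`, and the abbreviated `VOID` set are hypothetical names for this illustration. The check is deliberately simplistic: it treats every non-void element as needing an explicit close, although real HTML lets some end tags (such as `</p>` or `</li>`) be omitted.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag in HTML (abbreviated list).
VOID = {"br", "hr", "img", "input", "link", "meta"}


class TagBalanceChecker(HTMLParser):
    """Records tags that are opened but never closed, or closed but never opened."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID:
            return  # e.g. the end half of a self-closing <br/>
        if tag in self.open_tags:
            # Anything opened after the matching tag was left unclosed.
            while self.open_tags:
                top = self.open_tags.pop()
                if top == tag:
                    break
                self.problems.append(f"unclosed <{top}>")
        else:
            self.problems.append(f"stray </{tag}>")


def find_problems(fragment: str) -> list:
    checker = TagBalanceChecker()
    checker.feed(fragment)
    checker.close()
    leftover = [f"unclosed <{t}>" for t in reversed(checker.open_tags)]
    return checker.problems + leftover


print(find_problems("<p>Hello <b>world</p>"))
```

The parser itself raises no error on such input, which is why user-submitted HTML usually needs a separate tidy or sanitize step before it can be trusted.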
Plain Text Markup
This category is frequently used for the reasons given in the bulleted list above.
The bottom line is what the user input is being used for. If you're planning to keep the data around and may need to shuffle formats, etc., then it makes sense to use a carefully chosen abstract format to store the information. If you need to work with the raw data manually for any reason, then bonus points if that format is easily human-readable. If you're only displaying the content in a web page (or an HTML doc for a report, etc.) and you have no concerns about converting it or future-proofing it, then it's a reasonable practice to store it in HTML.