Best way to parse bbcode


Question

I'd like to work on a bbcode filter for a PHP website. (I'm using CakePHP, so it would be a bbcode helper.) I have some requirements.

Bbcodes can be nested, so something like this is valid:

[block]  
    [block]  
    [/block]  
    [block]  
        [block]  
        [/block]  
    [/block]  
[/block]  

Bbcodes can have 0 or more parameters.

For example:

[video: url="url", width="500", height="500"]Title[/video]
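
To make the parameter requirement concrete, here is a minimal sketch of splitting one opening tag into its name and parameters. The function name `parseOpeningTag` is hypothetical, and the sketch assumes the `[name: key="value", ...]` syntax from the example above, with every value wrapped in double quotes.

```php
<?php
// Hypothetical helper, not part of any library: splits one opening tag
// into its name and a map of parameters.
function parseOpeningTag(string $tag): ?array
{
    // Tag name, then an optional ":" followed by the raw parameter list.
    if (!preg_match('/^\[(\w+)(?::\s*(.*))?\]$/s', $tag, $m)) {
        return null; // Not an opening tag in the assumed syntax.
    }

    $params = [];
    if (isset($m[2]) && $m[2] !== '') {
        // Assumption: each parameter looks like name="value".
        // Commas and spaces between parameters are simply skipped.
        preg_match_all('/(\w+)\s*=\s*"([^"]*)"/', $m[2], $pairs, PREG_SET_ORDER);
        foreach ($pairs as $pair) {
            $params[$pair[1]] = $pair[2];
        }
    }

    return ['name' => $m[1], 'params' => $params];
}
```

Handling parameters in a second pass like this keeps the tag-matching pattern simple, which may help avoid the nesting problems that arise when one regex tries to do everything.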

Bbcodes may have more than one behavior.

Let's say [url]text[/url] would be transformed to [url:url="text"]text[/url], or the video bbcode would be able to choose between youtube, dailymotion...

I think that covers most of my needs. I have already done something with regex, but my biggest problem was matching parameters. In fact, I got nested bbcode and bbcode with 0 parameters to work. But when I added a regex match for parameters, it no longer matched nested bbcode correctly.

"\[($tag)(=.*)\"\](.*)\[\/\1\]" // It wasn't .* but the non-greedy matcher

I don't have the complete regex with me right now, but I had something that looked like the one above.

So is there a way to match bbcode efficiently with regex or something else? The only thing I can think of is to use the visitor pattern and split my text on each possible tag. That way I would have a bit more control over my text parsing, and I could probably validate my document, so if the input text doesn't contain valid bbcode, I could notify the user with an error before saving anything.
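
As a rough illustration of the validation idea (this is not the visitor pattern itself, only a simple stack check), the sketch below scans the tags in order and reports nesting errors. The function name `validateBbcode` is hypothetical, and the pattern assumes that parameter values never contain a `]` character.

```php
<?php
// Hypothetical validator: returns a list of error messages,
// which is empty when every tag is closed in the right order.
function validateBbcode(string $text): array
{
    $errors = [];
    $stack = [];

    // Match opening tags like [name] or [name: ...] and closing tags like [/name].
    // Assumption: parameter values contain no "]" character.
    preg_match_all('/\[(\/?)(\w+)(?::[^\]]*)?\]/', $text, $tags, PREG_SET_ORDER);

    foreach ($tags as $tag) {
        $name = strtolower($tag[2]);
        if ($tag[1] !== '/') {
            $stack[] = $name; // Opening tag: remember it until it is closed.
            continue;
        }
        $open = array_pop($stack); // null when nothing is open.
        if ($open === null) {
            $errors[] = sprintf('Unexpected [/%s] with no open tag', $name);
        } elseif ($open !== $name) {
            $errors[] = sprintf('Expected [/%s] but found [/%s]', $open, $name);
        }
    }

    // Anything still on the stack was never closed.
    foreach (array_reverse($stack) as $name) {
        $errors[] = sprintf('Missing [/%s]', $name);
    }

    return $errors;
}
```

An empty result means the text could be saved; otherwise the messages could be shown to the user before anything is stored.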

I would use SableCC to create my text parser. http://sablecc.org/

Any better ideas? Or anything that could lead to an efficient, flexible bbcode parser?

Thank you, and sorry for my bad English...

Recommended answer

There's both a PECL and a PEAR BBCode parsing library. Software's hard enough without reinventing years of work on your own.

If neither of those is an option, I'd concentrate on turning the BBCode into a valid XML string, and then using your favorite XML parsing routine on that. Very, very rough idea here, but:

  1. Run the code through htmlspecialchars to escape any entities that need escaping
  2. Transform all [ and ] characters into < and > respectively
  3. Don't forget to account for the colon in cases like [tagname:

If the BBCode was nested properly, you should be all set to pass this string into an XML parsing object (SimpleXML, DOMDocument, etc.)
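
The rough idea above could be sketched as follows. This is only one possible reading of it: the function name `bbcodeToXml`, the `<root>` wrapper element and the `name="value"` parameter syntax are assumptions, and the SimpleXML extension must be available.

```php
<?php
// Hypothetical helper: BBCode -> XML string -> SimpleXML.
// Returns null when the result is not well-formed XML (e.g. bad nesting).
function bbcodeToXml(string $bbcode): ?SimpleXMLElement
{
    // 1. Escape &, < and > in the user's text. Quotes are left alone so
    //    that name="value" parameters can become XML attributes.
    $xml = htmlspecialchars($bbcode, ENT_NOQUOTES | ENT_XML1, 'UTF-8');

    // 2a. Closing tags: [/name] becomes </name>.
    $xml = preg_replace('/\[\/(\w+)\]/', '</$1>', $xml);

    // 2b and 3. Opening tags: [name] or [name: a="1", b="2"] becomes
    //    <name a="1" b="2">, dropping the colon and the commas.
    $xml = preg_replace_callback(
        '/\[(\w+)(?::([^\]]*))?\]/',
        function (array $m): string {
            $attrs = '';
            if (isset($m[2])) {
                preg_match_all('/(\w+)\s*=\s*"([^"]*)"/', $m[2], $pairs, PREG_SET_ORDER);
                foreach ($pairs as $pair) {
                    $attrs .= sprintf(' %s="%s"', $pair[1], $pair[2]);
                }
            }
            return '<' . $m[1] . $attrs . '>';
        },
        $xml
    );

    // Parse with libxml warnings suppressed; a single root element is
    // required, so the converted text is wrapped in <root>.
    $previous = libxml_use_internal_errors(true);
    $doc = simplexml_load_string('<root>' . $xml . '</root>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $doc === false ? null : $doc;
}
```

A null result doubles as a crude validity check: badly nested input such as `[b][i]x[/b][/i]` fails to parse, so the user can be told about the error before anything is saved.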
