Simple BBparser in PHP that lets you replace content outside tags

Question

I'm trying to parse strings that represent source code, something like this:

[code lang="html"]
  <div>stuff</div>
[/code]
<div>stuff</div>

As you can see from my previous 20 questions, I tried to do it with PHP's regex functions, but ran into many problems, especially when the string is very big...

Do you guys know a BB parser class written in PHP that I can use for this, instead of regexes?

What I need to do is:

  • be able to convert all content from within [code] tags with html entities
  • be able to run some kind of a filter (a callback function of mine) only on content outside of the [code] tags

Thanks

I ended up using this:

  • convert all <pre> and <code> tags to [pre] and [code]:

$content = str_replace(array('<pre>', '</pre>', '<code>', '</code>'), array('[pre]', '[/pre]', '[code]', '[/code]'), $content);

  • get contents from between [code]..[/code] and [pre]...[/pre] and do the html entity conversion

    $content = preg_replace_callback('/(.?)\[(pre|code)\b(.*?)(?:(\/))?\](?:(.+?)\[\/\2\])?(.?)/s', 'self::specialchars', $content);
    

    (I stole this pattern from the WordPress shortcode functions :)

  • store the entity-converted content in a temporary array variable, and replace each block in $content with a unique ID

  • I can now safely run my filter on $content, because there's no code in it, just the IDs (this filter does a strip_tags on the entire text and converts stuff like http://blabla.com to links)

  • replace the unique IDs in $content with the converted code blocks from the array variable
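
Put together, the steps above might look roughly like the sketch below. The function names `filter_outside_code` and `strip_and_linkify`, the `@@…@@` placeholder format, and the simplified pattern are assumptions made for illustration; this is not the WordPress pattern quoted earlier, and it drops tag attributes such as `lang="html"`.

```php
<?php
// Sketch only: function names and the placeholder format are hypothetical.

// Example filter: strip HTML tags, then turn bare http(s) URLs into links.
function strip_and_linkify(string $text): string
{
    $text = strip_tags($text);
    $linked = preg_replace('~\bhttps?://[^\s<]+~i', '<a href="$0">$0</a>', $text);
    return $linked ?? $text; // preg_replace() returns null on a PCRE error
}

function filter_outside_code(string $content, callable $filter): string
{
    // Step 1: normalise <pre>/<code> to bracket tags.
    $content = str_replace(
        ['<pre>', '</pre>', '<code>', '</code>'],
        ['[pre]', '[/pre]', '[code]', '[/code]'],
        $content
    );

    // Step 2: escape each block body and swap the block for a unique ID.
    $blocks = [];
    $nonce  = bin2hex(random_bytes(8)); // makes accidental collisions unlikely
    $swapped = preg_replace_callback(
        '/\[(pre|code)\b([^\]]*)\](.*?)\[\/\1\]/s',
        function (array $m) use (&$blocks, $nonce): string {
            $id = '@@' . $nonce . '_' . count($blocks) . '@@';
            $blocks[$id] = "<{$m[1]}>"
                . htmlspecialchars($m[3], ENT_QUOTES, 'UTF-8')
                . "</{$m[1]}>";
            return $id;
        },
        $content
    );
    if ($swapped === null) {
        // Can happen on very large inputs, e.g. when a PCRE limit is hit.
        throw new RuntimeException('PCRE error code ' . preg_last_error());
    }

    // Step 3: the filter only ever sees text outside the code blocks.
    $filtered = $filter($swapped);

    // Step 4: put the escaped blocks back in place of their IDs.
    return $blocks ? strtr($filtered, $blocks) : $filtered;
}

echo filter_outside_code(
    "[code lang=\"html\"]<div>stuff</div>[/code]\n<div>see http://blabla.com</div>",
    'strip_and_linkify'
);
```

One caveat with this sketch: a placeholder that sits directly after a URL, with no whitespace between them, would be swallowed by the link pattern, so the filter and the placeholder format need to be chosen together.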

    Do you think this is OK?

    Answer

    Do you guys know a BB parser class written in PHP that I can use for this, instead of regexes?

    There's the BBCode PECL extension, but you'd need to compile it.

    There's also PEAR's HTML_BBCodeParser, though I can't vouch for how effective it is.

    There are also a few elsewhere, but I think they're all pretty rigid.

    I don't believe that either of those does what you're looking for with regard to having a callback for tag contents (and @webarto is totally correct that HTMLPurifier is the right tool to use when processing the contents). You might have to write your own here. I've previously written about my experiences doing the same, which you might find helpful.
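
As a rough idea of what "write your own" could mean without regexes, here is a minimal sketch of a scanner built on `stripos`/`strpos`. The function name `transform_bbcode` and its two callbacks are hypothetical; it only recognises `[code]` (optionally with attributes), does not handle nesting, and treats an unterminated tag as ordinary text.

```php
<?php
// Sketch only: transform_bbcode() is a hypothetical helper, not a library API.

/**
 * Apply $inside to the body of every [code]...[/code] block
 * and $outside to every stretch of text between blocks.
 */
function transform_bbcode(string $text, callable $outside, callable $inside): string
{
    $out  = '';
    $pos  = 0; // start of the text not yet emitted
    $scan = 0; // where to look for the next opening tag

    while (($open = stripos($text, '[code', $scan)) !== false) {
        $after   = $text[$open + 5] ?? '';
        $openEnd = strpos($text, ']', $open);
        if (($after !== ']' && $after !== ' ') || $openEnd === false) {
            $scan = $open + 5; // e.g. "[codex]": not a real tag, keep looking
            continue;
        }
        $close = stripos($text, '[/code]', $openEnd + 1);
        if ($close === false) {
            break; // unterminated block: the rest is ordinary text
        }
        $out .= $outside(substr($text, $pos, $open - $pos));
        $out .= $inside(substr($text, $openEnd + 1, $close - $openEnd - 1));
        $pos = $scan = $close + strlen('[/code]');
    }

    // Whatever remains lies outside any code block.
    return $out . $outside(substr($text, $pos));
}

echo transform_bbcode(
    'a <b>x</b> [code lang="html"]<div>stuff</div>[/code] <i>y</i>',
    'strip_tags',
    function (string $code): string {
        return '<pre>' . htmlspecialchars($code, ENT_QUOTES, 'UTF-8') . '</pre>';
    }
);
```

Because the outside callback runs on each segment separately, a filter that needs to see the whole document at once (for example, to strip an HTML tag that opens before a code block and closes after it) would behave differently here than with the placeholder approach from the question.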
