Recursive/subroutine regex to match CSS media queries

Views: 113 | Posted: 2020/7/21 22:14:38 | Tags: php css regex recursion pcre

Problem description

I'm looking for a regular expression (in PHP PCRE) that can match media queries and their contents reliably, including the somewhat odd case where a media query body is empty. Source text might be:

@media only screen {
    p {
        color:red;
    }
}
@media only screen and (max-width: 596px) {
    p {
        color:blue;
    }
    img {
        max-width: 200px;
    }
}
@media only screen {

}
img {
    display: block;
}
@media only screen and (max-width: 240px) {
    p {
        color:green;
    }
}
p {
    font-weight: normal;
}

I want to capture each media query and its CSS body as subpatterns, so I'll end up with a PHP array like:

[['@media only screen {
        p {
            color:red;
        }
    }','p {
            color:red;
        }'],...]

The key thing is that this needs to be a recursive or subroutine pattern in order to balance the braces. The empty query is enough to confuse the pattern in this question, because it can't distinguish the end of a CSS rule from the end of the empty media query:

/@media[^{]+\{([\s\S]+?\})\s*\}/
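
The failure described above can be reproduced with a short script. This is only a sketch: the `$css` string is a shortened version of the sample, and the variable names `$naive` and `$count` are made up for illustration. Because the pattern requires at least one `}` inside the body, the empty query swallows everything up to the end of the next media query.

```php
<?php
// Shortened version of the sample CSS from the question.
$css = <<<'CSS'
@media only screen {
    p {
        color:red;
    }
}
@media only screen {

}
img {
    display: block;
}
@media only screen and (max-width: 240px) {
    p {
        color:green;
    }
}
CSS;

// The non-recursive pattern quoted above.
$naive = '/@media[^{]+\{([\s\S]+?\})\s*\}/';
$count = preg_match_all($naive, $css, $matches);

// Three @media blocks are present, but the empty one is merged
// with the img rule and the media query that follows it.
echo $count, " match(es)\n";
echo $matches[0][1], "\n";
```

With this input the second match starts at the empty query and runs through the `img` rule to the end of the last media query, so only two matches are reported instead of three.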

I've been trying and failing to use the advice in this article to make a pattern of the form (b(?:m|(?1))*e), where b is what begins the construct, m is what can occur in the middle of the construct, and e is what can occur at the end, and none of them can match the same thing.

So, b should be @media[^{]+\{, e should be \}, and m needs to consume CSS rules, perhaps ([^{]+?\{[^}]*?\s*\}), giving me:

/(@media[^{]+\{(?:([^{]+?\{[^}]*?\}\s*)*|(?1))*\})/s

However, that doesn't work so I'm a bit lost. Can anyone suggest an effective pattern?

I've set up a regex test here.

Alternatively, a non-regex parser might work better.
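
As a rough illustration of that non-regex route, the sketch below walks the string and counts braces. The function name `extractMediaBlocks` is hypothetical and not from the original post. It assumes braces never appear inside comments or strings and that the literal text `@media` only occurs at the start of a media query.

```php
<?php
// Hypothetical helper: finds @media blocks by counting braces
// rather than relying on a recursive pattern.
function extractMediaBlocks(string $css): array
{
    $blocks = [];
    $offset = 0;
    $len = strlen($css);
    while (($start = strpos($css, '@media', $offset)) !== false) {
        $open = strpos($css, '{', $start);
        if ($open === false) {
            break; // no body follows, stop scanning
        }
        $depth = 0;
        $end = null;
        for ($i = $open; $i < $len; $i++) {
            if ($css[$i] === '{') {
                $depth++;
            } elseif ($css[$i] === '}') {
                $depth--;
                if ($depth === 0) {
                    $end = $i; // matching close brace of the media query
                    break;
                }
            }
        }
        if ($end === null) {
            break; // unbalanced braces, give up
        }
        $blocks[] = [
            substr($css, $start, $end - $start + 1),         // whole query
            trim(substr($css, $open + 1, $end - $open - 1)), // body only
        ];
        $offset = $end + 1;
    }
    return $blocks;
}

$css = <<<'CSS'
@media only screen {
    p {
        color:red;
    }
}
@media only screen {

}
img {
    display: block;
}
@media only screen and (max-width: 240px) {
    p {
        color:green;
    }
}
CSS;

$blocks = extractMediaBlocks($css);
print_r($blocks);
```

Each entry is a pair of the full media query and its trimmed body, which mirrors the array shape asked for above. The empty query yields an empty string as its body.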

Note that I'm not attempting to validate or match CSS selectors in general (not a job for a regex), just grab the content of the query and its body.

Update: added more sample content and explained what I want to get out.

Recommended answer

If you are sure the blocks you want to match always have balanced braces, you can use a regex with a subroutine call like this:

'~@media\b[^{]*({((?:[^{}]+|(?1))*)})~'

See the regex demo.

Here is an IDEONE demo:

$re = '~@media\b[^{]*({((?:[^{}]+|(?1))*)})~'; 
$str = "@media only screen {\n    p {\n        color:red;\n    }\n}\n@media only screen and (max-width: 596px) {\n    p {\n        color:blue;\n    }\n    img {\n        max-width: 200px;\n    }\n}\n@media only screen {\n\n}\nimg {\n    display: block;\n}\n@media only screen and (max-width: 240px) {\n    p {\n        color:green;\n    }\n}\np {\n    font-weight: normal;\n}"; 
preg_match_all($re, $str, $matches, PREG_PATTERN_ORDER);
print_r($matches[0]);
print_r($matches[2]);

Pattern details:

- @media\b - match @media as a whole word (since \b is a word boundary)
- [^{]* - match 0+ characters other than {
- ({((?:[^{}]+|(?1))*)}) - capturing group #1, which captures a {...} block with a balanced number of { and } (note it is a technical group: we need to recurse this group's subpattern in order to match nested {...} blocks correctly). It matches...
  - { - an opening brace
  - ((?:[^{}]+|(?1))*) - Group 2 (the contents inside the balanced {...}), matching zero or more repetitions of
    - [^{}]+ - 1+ characters other than { and } (because we need to match everything that is not the leading and trailing delimiters)
    - | - or...
    - (?1) - a subroutine call that recurses into the whole Group 1 subpattern
  - } - a closing brace
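
To get the array shape requested in the question (one pair of full query and body per media query), the same pattern can be run with PREG_SET_ORDER. This is only a sketch: the `$css` sample is shortened, and the names `$sets` and `$pairs` and the use of `trim()` on the body are assumptions rather than part of the original answer.

```php
<?php
$css = <<<'CSS'
@media only screen {
    p {
        color:red;
    }
}
@media only screen {

}
img {
    display: block;
}
@media only screen and (max-width: 240px) {
    p {
        color:green;
    }
}
CSS;

// Pattern from the answer: group 1 is the balanced {...} block,
// group 2 is what sits between the outer braces.
$re = '~@media\b[^{]*({((?:[^{}]+|(?1))*)})~';
preg_match_all($re, $css, $sets, PREG_SET_ORDER);

$pairs = [];
foreach ($sets as $set) {
    // $set[0] is the whole media query, $set[2] its body.
    $pairs[] = [$set[0], trim($set[2])];
}
print_r($pairs);
```

The empty media query is matched on its own, and its body comes back as an empty string after trimming.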
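Note that $matches[2] is an array of bodies, so each element (not the array itself) can be processed further with preg_match_all('~\s*(\w+)\s*{\s*([^}]*?)\s*}~', $body, $subblocks), where $body is one element of $matches[2]. Because of (\w+), this captures only a single word before the brace, so it suits simple type selectors such as p or img.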
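
A minimal sketch of that second pass, looping over each body in turn. The `$rules` array and its layout (selector mapped to declaration text) are assumptions made for illustration. Repeated selectors inside one media query would overwrite each other in this layout.

```php
<?php
$re = '~@media\b[^{]*({((?:[^{}]+|(?1))*)})~';
$css = "@media only screen and (max-width: 596px) {\n    p {\n        color:blue;\n    }\n    img {\n        max-width: 200px;\n    }\n}\n@media only screen {\n\n}";
preg_match_all($re, $css, $matches);

$rules = [];
foreach ($matches[2] as $i => $body) {
    // Split one media query body into its selector/declaration pairs.
    preg_match_all('~\s*(\w+)\s*{\s*([^}]*?)\s*}~', $body, $subblocks, PREG_SET_ORDER);
    $rules[$i] = [];
    foreach ($subblocks as $sub) {
        $rules[$i][$sub[1]] = $sub[2]; // selector => declarations
    }
}
print_r($rules);
```

For the empty media query the inner preg_match_all finds nothing, so its entry in `$rules` stays an empty array.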
      