Recursive/subroutine regex to match CSS media queries
Question
I'm looking for a regular expression (in PHP PCRE) that can reliably match media queries and their contents, including the somewhat odd case where a media query body is empty. Source text might be:
@media only screen {
    p {
        color:red;
    }
}
@media only screen and (max-width: 596px) {
    p {
        color:blue;
    }
    img {
        max-width: 200px;
    }
}
@media only screen {
}
img {
    display: block;
}
@media only screen and (max-width: 240px) {
    p {
        color:green;
    }
}
p {
    font-weight: normal;
}
I want to capture each media query and its CSS body as subpatterns, so I'll end up with a PHP array like:
[['@media only screen {
p {
color:red;
}
}','p {
color:red;
}'],...]
The key thing is that this needs to be a recursive or subroutine pattern in order to balance the braces. The empty query is enough to confuse the pattern in this question, because it can't distinguish the end of a CSS rule from the end of the empty media query:
/@media[^{]+\{([\s\S]+?\})\s*\}/
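To make the failure concrete, here is a minimal sketch that runs the pattern above against a trimmed version of the sample. The `$css` excerpt and the variable names are assumptions chosen for illustration. Because the lazy group must end on a `}` that is followed by another `}`, it cannot stop at the empty body and keeps expanding through the unrelated `img` rule into the next media query.

```php
<?php
// Trimmed sample (an assumption): an empty media query, a plain rule,
// then a normal media query.
$css = <<<'CSS'
@media only screen {
}
img {
    display: block;
}
@media only screen and (max-width: 240px) {
    p {
        color:green;
    }
}
CSS;

// The non-recursive pattern quoted in the question.
$naive = '/@media[^{]+\{([\s\S]+?\})\s*\}/';
$count = preg_match_all($naive, $css, $m);

// Both media queries end up inside a single match, together with the
// img rule that sits between them.
var_dump($count);
var_dump(strpos($m[0][0], 'img {') !== false);
```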
I've been trying and failing to use the advice in this article to make a pattern of the form (b(?:m|(?1))*e), where b is what begins the construct, m is what can occur in the middle of the construct, and e is what can occur at the end, and none of them can match the same thing.
So, b should be @media[^{]+\{, e should be \}, and m needs to consume CSS rules, perhaps ([^{]+?\{[^}]*?\s*\}), giving me:
/(@media[^{]+\{(?:([^{]+?\{[^}]*?\}\s*)*|(?1))*\})/s
However, that doesn't work, so I'm a bit lost. Can anyone suggest an effective pattern?
I've set up a regex test here.
Alternatively, a non-regex parser might work better.
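As a rough illustration of what the non-regex route could look like, the sketch below counts braces by hand. The function name `extractMediaBlocks` is hypothetical, and the approach assumes that no `{` or `}` appears inside CSS comments or strings and that `@media` never occurs as part of a longer word; a real CSS parser would handle those cases.

```php
<?php
// Hypothetical helper: returns [[fullBlock, body], ...] for each
// top-level @media block found by counting braces.
function extractMediaBlocks(string $css): array
{
    $blocks = [];
    $offset = 0;
    $len = strlen($css);
    while (($start = strpos($css, '@media', $offset)) !== false) {
        $open = strpos($css, '{', $start);
        if ($open === false) {
            break; // no body follows this @media
        }
        $depth = 0;
        $close = null;
        for ($i = $open; $i < $len; $i++) {
            if ($css[$i] === '{') {
                $depth++;
            } elseif ($css[$i] === '}') {
                $depth--;
                if ($depth === 0) {
                    $close = $i; // brace that balances the opening one
                    break;
                }
            }
        }
        if ($close === null) {
            break; // unbalanced braces: stop rather than guess
        }
        $blocks[] = [
            substr($css, $start, $close - $start + 1),
            trim(substr($css, $open + 1, $close - $open - 1)),
        ];
        $offset = $close + 1;
    }
    return $blocks;
}

$blocks = extractMediaBlocks(
    "@media only screen {\n}\nimg { display: block; }\n@media print { p { color: green; } }"
);
print_r($blocks);
```

An empty media query simply yields an empty body string here, since the scanner never has to guess which closing brace ends the block.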
Note that I'm not attempting to validate or match CSS selectors in general (not a job for a regex), just to grab the content of the query and its body.
Update: added more sample content and explained what I want to get out.
Recommended answer
If you are sure the blocks you want to match always have balanced braces, you can use a regex with a subroutine call like this:
'~@media\b[^{]*({((?:[^{}]+|(?1))*)})~'
See the regex demo.
Here is an IDEONE demo:
$re = '~@media\b[^{]*({((?:[^{}]+|(?1))*)})~';
$str = "@media only screen {\n p {\n color:red;\n }\n}\n@media only screen and (max-width: 596px) {\n p {\n color:blue;\n }\n img {\n max-width: 200px;\n }\n}\n@media only screen {\n\n}\nimg {\n display: block;\n}\n@media only screen and (max-width: 240px) {\n p {\n color:green;\n }\n}\np {\n font-weight: normal;\n}";
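preg_match_all($re, $str, $matches, PREG_PATTERN_ORDER);
print_r($matches[0]); // full matches: each @media block including its query
print_r($matches[2]); // group 2: the contents between the outermost braces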
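The question asks for an array of [full block, body] pairs. One possible way to get that shape from the same pattern is sketched below, using `PREG_SET_ORDER` so that the captures are grouped per match. The shortened `$css` sample and the names `$sets` and `$pairs` are assumptions for illustration, and trimming the body is a choice rather than something the pattern does.

```php
<?php
$re = '~@media\b[^{]*({((?:[^{}]+|(?1))*)})~';
$css = "@media only screen {\n    p {\n        color:red;\n    }\n}\n"
     . "@media only screen {\n}\n"
     . "img {\n    display: block;\n}\n";

// PREG_SET_ORDER groups the captures per match instead of per group.
preg_match_all($re, $css, $sets, PREG_SET_ORDER);

// Keep the full block and its trimmed body (group 2) for each media query.
$pairs = array_map(
    function (array $m): array {
        return [$m[0], trim($m[2])];
    },
    $sets
);
print_r($pairs);
```

With this shape the empty media query is still reported; its body is just an empty string after trimming.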
Pattern details:
- @media\b - matches @media as a whole word (since \b is a word boundary)
- [^{]* - matches 0+ characters other than {
- ({((?:[^{}]+|(?1))*)}) - capturing group #1, which matches a {...} block with a balanced number of { and } (note it is a technical group: we need to recurse this group's subpattern in order to match nested {...}s correctly). It matches:
  - { - an opening brace
  - ((?:[^{}]+|(?1))*) - Group 2 (the contents inside the balanced {...}), matching zero or more repetitions of:
    - [^{}]+ - 1+ characters other than { and } (because we need to match everything that is not the leading and trailing delimiters)
    - | - or...
    - (?1) - the whole Group 1 subpattern
  - } - a closing brace
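The breakdown above can also be written into the pattern itself. The sketch below is the same structure with the `x` (extended) flag, which lets whitespace and `#` comments sit inside the regex. The braces are escaped here, which does not change what they match, and the test string is an assumption for illustration.

```php
<?php
$re = '~
    @media\b            # the at-rule name, as a whole word
    [^{]*               # the query: anything up to the first opening brace
    (                   # group 1: one balanced brace block
        \{
        (               # group 2: the contents of that block
            (?:
                [^{}]+  # a run of characters that are not braces
              |
                (?1)    # or a nested block: call group 1 as a subroutine
            )*
        )
        \}
    )
~x';

$count = preg_match_all($re, '@media screen { p { color: red; } } img { }', $m);
var_dump($count);
var_dump($m[2][0]);
```

Group 2 ends up holding only the outermost contents, because captures set during the `(?1)` call are restored once the call returns.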
Note that each element of $matches[2] can be further processed with preg_match_all('~\s*(\w+)\s*{\s*([^}]*?)\s*}~', $body, $subblocks), where $body stands for one captured media-query body. preg_match_all() expects a string subject, so the $matches[2] array cannot be passed to it directly. Also note that (\w+) only matches simple one-word selectors such as p or img.
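A minimal sketch of that further processing, applied to one body at a time, might look like the following. The helper name `splitRules` is hypothetical, the braces are escaped (which matches the same text as the unescaped form), and the `\w+` selector part means only one-word selectors are picked up.

```php
<?php
// Hypothetical helper: split one media-query body into selector => declarations.
function splitRules(string $body): array
{
    $rules = [];
    preg_match_all('~\s*(\w+)\s*\{\s*([^}]*?)\s*\}~', $body, $sub, PREG_SET_ORDER);
    foreach ($sub as $rule) {
        $rules[$rule[1]] = $rule[2]; // later duplicates of a selector overwrite earlier ones
    }
    return $rules;
}

// One body as it would appear in $matches[2] for the second sample query.
$body = "\n    p {\n        color:blue;\n    }\n    img {\n        max-width: 200px;\n    }\n";
$rules = splitRules($body);
print_r($rules);
```

In practice this would be called in a loop, for example `foreach ($matches[2] as $body) { ... }`.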