Parsing command arguments in PHP
Question
Is there a native "PHP way" to parse command arguments from a string? For example, given the following string:
foo "bar \"baz\"" '\'quux\''
I would like to create the following array:
array(3) {
  [0] =>
  string(3) "foo"
  [1] =>
  string(9) "bar "baz""
  [2] =>
  string(6) "'quux'"
}
I've already tried to leverage token_get_all(), but PHP's variable interpolation syntax (e.g. "foo ${bar} baz") pretty much rained on my parade.
I know full well that I could write my own parser. Command argument syntax is very simple, but if there's an existing native way to do it, I'd much prefer that over rolling my own.
Please note that I am looking to parse the arguments from a string, NOT from the shell/command-line.
Edit #2: Below is a more comprehensive example of the expected input -> output for an argument:
foo -> foo
"foo" -> foo
'foo' -> foo
"foo'foo" -> foo'foo
'foo"foo' -> foo"foo
"foo\"foo" -> foo"foo
'foo\'foo' -> foo'foo
"foo\foo" -> foo\foo
"foo\\foo" -> foo\foo
"foo foo" -> foo foo
'foo foo' -> foo foo
Recommended answer
Regexes are quite powerful: (?s)(?<!\\)("|')(?:[^\\]|\\.)*?\1|\S+
So what does this expression mean?
- (?s): set the s modifier so that the dot also matches newlines.
- (?<!\\): negative lookbehind; checks that no backslash precedes the next token.
- ("|'): match a single or double quote and capture it in group 1.
- (?:[^\\]|\\.)*?: lazily match any character that is not a backslash, or a backslash together with the (escaped) character that follows it.
- \1: match whatever group 1 matched, i.e. the same kind of quote.
- |: or
- \S+: match one or more non-whitespace characters.
The idea is to capture a quote and group it to remember whether it is a single or a double one. The negative lookbehind makes sure we don't start a match at an escaped quote. \1 matches the closing quote, which must be the same character as the opening one. Finally, the alternation matches any run of non-whitespace characters. This solution is handy and applies to almost any language or regex flavor that supports lookbehinds and backreferences. Of course, it expects the quotes to be closed. The results are found in group 0.
Let's implement it in PHP:
$string = <<<INPUT
foo "bar \"baz\"" '\'quux\''
'foo"bar' "baz'boz"
hello "regex
world\""
"escaped escape\\\\"
INPUT;
preg_match_all('#(?<!\\\\)("|\')(?:[^\\\\]|\\\\.)*?\1|\S+#s', $string, $matches);
print_r($matches[0]);
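If you wonder why I used four backslashes, take a look at my previous answer. In short: inside a single-quoted PHP string, \\\\ yields two backslash characters, and the regex engine reads those two as one literal backslash.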
Output
Array
(
[0] => foo
[1] => "bar \"baz\""
[2] => '\'quux\''
[3] => 'foo"bar'
[4] => "baz'boz"
[5] => hello
[6] => "regex
world\""
[7] => "escaped escape\\"
)
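On the four backslashes in the pattern above: the doubling happens twice, once in the PHP string literal and once in the regex engine. The following is a small sketch to make that visible; the variable name $phpLiteral is only illustrative and not part of the original answer.

```php
<?php
// In a single-quoted PHP string, \\ collapses to one backslash,
// so the literal '\\\\' holds exactly two backslash characters.
$phpLiteral = '\\\\';
var_dump(strlen($phpLiteral)); // int(2)

// PCRE then treats those two characters as one escaped, literal backslash.
// The subject '\\' is a PHP string containing a single backslash.
var_dump(preg_match('#^\\\\$#', '\\')); // int(1)
```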
Removing the quotes
Quite simple using named groups and a simple loop:
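preg_match_all('#(?<!\\\\)("|\')(?<escaped>(?:[^\\\\]|\\\\.)*?)\1|(?<unescaped>\S+)#s', $string, $matches, PREG_SET_ORDER);
$results = array();
foreach ($matches as $array) {
    // 'unescaped' is only filled when the bare-word alternative (\S+) matched.
    // Testing it (rather than empty($array['escaped'])) keeps "" and "0" working.
    if (isset($array['unescaped']) && $array['unescaped'] !== '') {
        $results[] = $array['unescaped'];
    } else {
        $results[] = $array['escaped'];
    }
}
print_r($results);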
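Note that the loop above strips the surrounding quotes but leaves the backslash escapes in place (for example bar \"baz\"), whereas the question expects bar "baz". A minimal sketch of one way to resolve them follows. The function name parse_arguments and the group names quote, quoted and bare are hypothetical and not part of the original answer, and the unescaping rule (only \\ and the escaped enclosing quote are collapsed) is an assumption read off the examples in the question.

```php
<?php
// Hypothetical helper (not from the original answer): split a string into
// arguments, strip the surrounding quotes and resolve backslash escapes.
function parse_arguments(string $input): array
{
    // Same pattern as above, with the quote character captured as group 1 ("quote").
    $pattern = '#(?<!\\\\)(?<quote>"|\')(?<quoted>(?:[^\\\\]|\\\\.)*?)\1|(?<bare>\S+)#s';
    preg_match_all($pattern, $input, $matches, PREG_SET_ORDER);

    $results = [];
    foreach ($matches as $match) {
        // A bare word (no quotes) is taken as-is.
        if (isset($match['bare']) && $match['bare'] !== '') {
            $results[] = $match['bare'];
            continue;
        }
        // Collapse \\ and the escaped enclosing quote into the bare character;
        // any other backslash sequence (e.g. \f) is left untouched.
        $escape = '#\\\\(' . preg_quote($match['quote'], '#') . '|\\\\)#';
        $results[] = preg_replace($escape, '$1', $match['quoted']);
    }
    return $results;
}

// Usage: a nowdoc keeps the backslashes of the sample input intact.
$input = <<<'TXT'
foo "bar \"baz\"" '\'quux\''
TXT;
print_r(parse_arguments($input));
```

For the sample string from the question this should print the three arguments foo, bar "baz" and 'quux'. Like the original pattern, the sketch assumes that every quote is closed.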