Regular expression to match key-value pairs where value is in quotes or apostrophes


Question

I'm trying to complete a PHP app in the next 2 weeks and I just can't figure out the regular expression to parse some attribute strings.

I get arbitrary strings in a format like this:

KeyName1="KeyValue1" KeyName2='KeyValue2'

There may be any number of key-value pairs in a single string, and the values can be delimited by either single quotes ' or double quotes " in any combination within one string (but they are always delimited).

The key values can be of any length and contain any character, except that a double quote can't appear inside a double-quoted value and a single quote can't appear inside a single-quoted value. Double quotes can appear inside single quotes, and single quotes can appear inside double quotes.

There can be any number of spaces between the key-value pairs, between the key name and the equals sign, and between the equals sign and the quote character that starts the key value.

I need to turn the string into an array that looks like:

$arrayName["KeyName1"] = "KeyValue1";
$arrayName["KeyName2"] = "KeyValue2";

I'm pretty sure it can be done with regular expressions, but all my attempts have failed. I need some help (actually lots of help :-) to get this done, and I'm hoping some of the amazing people here can provide it or at least get me started.

Recommended answer

Sure, no problem. Let's break it down:

\w+\s*=\s*

matches a keyword made of word characters (letters, digits, and underscores), followed by an equals sign (which might be surrounded by whitespace).
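As a small sketch of what this first piece accepts, the snippet below runs it against two sample inputs. Both inputs are assumptions made up for illustration; the second shows that a key containing a hyphen is not matched as a whole, because `-` is not a word character.

```php
<?php
// Underscores and digits are word characters, and \s* absorbs the spaces around "=".
preg_match('/\w+\s*=\s*/', 'Key_Name1  = "v"', $m1);

// A hyphen is not matched by \w, so the match can only start after it.
preg_match('/\w+\s*=\s*/', 'data-id="v"', $m2);

var_dump($m1[0]); // string(13) "Key_Name1  = "
var_dump($m2[0]); // string(3) "id="
```

If keys with hyphens can occur in your input, the `\w+` part would need a wider character class such as `[\w-]+`.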

"[^"]*"

matches an opening double quote, followed by any number of characters except another double quote, then a (closing) double quote.

'[^']*'

does the same for single-quoted strings.

Combining these using capturing groups ((...)) with a simple alternation (|) gives you

(\w+)\s*=\s*("[^"]*"|'[^']*')

In PHP:

preg_match_all('/(\w+)\s*=\s*("[^"]*"|\'[^\']*\')/', $subject, $result, PREG_SET_ORDER);

fills $result with an array of matches. $result[n] will contain the details of the nth match (counting from 0), where

  • $result[n][0] is the entire match
  • $result[n][1] contains the keyword
  • $result[n][2] contains the value (including quotes)
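As a minimal sketch of how this result could be turned into the array the question asks for, the loop below drops the first and last character of group 2 to remove the quotes. The function name `parseAttributes` and the sample input are assumptions made for illustration, not part of the original answer.

```php
<?php
// Hypothetical helper: builds a key => value array from an attribute string.
function parseAttributes(string $subject): array
{
    $attributes = [];
    preg_match_all('/(\w+)\s*=\s*("[^"]*"|\'[^\']*\')/', $subject, $result, PREG_SET_ORDER);

    foreach ($result as $match) {
        // $match[2] still includes its quotes, so strip one character from each end.
        // The cast keeps an empty value ("") as an empty string on older PHP versions.
        $attributes[$match[1]] = (string) substr($match[2], 1, -1);
    }

    return $attributes;
}

$parsed = parseAttributes('KeyName1="KeyValue1"   KeyName2 = \'KeyValue2\'');
print_r($parsed);
```

If the same key appears twice in the input, the later value overwrites the earlier one in this sketch.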

To match the value part without its quotes, regardless of the kind of quotes that are used, you need a slightly more complicated regex that uses a negative lookahead assertion:

(\w+)\s*=\s*(["'])((?:(?!\2).)*)\2

In PHP:

preg_match_all('/(\w+)\s*=\s*(["\'])((?:(?!\2).)*)\2/', $subject, $result, PREG_SET_ORDER);

Result:

  • $result[n][0]: entire match
  • $result[n][1]: keyword
  • $result[n][2]: quote character
  • $result[n][3]: value
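With the unquoted value available in group 3, building the requested array takes only a short loop. The following is a sketch under assumptions: the helper name `parseQuotedAttributes` and the sample string are invented for illustration.

```php
<?php
// Hypothetical helper: maps each keyword (group 1) to its unquoted value (group 3).
function parseQuotedAttributes(string $subject): array
{
    $pattern = '/(\w+)\s*=\s*(["\'])((?:(?!\2).)*)\2/';
    preg_match_all($pattern, $subject, $result, PREG_SET_ORDER);

    $attributes = [];
    foreach ($result as $match) {
        $attributes[$match[1]] = $match[3];
    }

    return $attributes;
}

// One value contains a single quote inside double quotes, the other the reverse.
$arrayName = parseQuotedAttributes('title = "it\'s here"  note=\'say "hi"\'');
var_dump($arrayName);
```

Here `$arrayName["title"]` holds `it's here` and `$arrayName["note"]` holds `say "hi"`, so the opposite kind of quote survives inside each value.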

Explanation:

(["'])    # Match a quote (--> group 2)
(         # Match and capture --> group 3...
 (?:      # the following regex:
  (?!\2)  # As long as the next character isn't the one in group 2,
  .       # match it (any character)
 )*       # any number of times.
)         # End of capturing group 3
\2        # Then match the corresponding quote character.
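One caveat about the "any character" comment: by default `.` does not match a newline in PCRE, so a value that spans several lines would not be matched. If values may contain line breaks, which is an assumption about the input, adding the `s` (dotall) modifier is one way to handle it. The sample string below is made up for illustration.

```php
<?php
$subject = "text='line one\nline two'";

// Without the s modifier, "." stops at the newline, so the closing quote is never reached.
$withoutDotall = preg_match_all('/(\w+)\s*=\s*(["\'])((?:(?!\2).)*)\2/', $subject, $plain, PREG_SET_ORDER);

// With the s modifier, "." also matches newlines and the whole value is captured.
$withDotall = preg_match_all('/(\w+)\s*=\s*(["\'])((?:(?!\2).)*)\2/s', $subject, $dotall, PREG_SET_ORDER);

var_dump($withoutDotall);   // int(0)
var_dump($withDotall);      // int(1)
var_dump($dotall[0][3]);    // the two-line value, without its quotes
```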
