Extract JSON values using just regex
Question
I have a description field that is embedded within JSON, and I'm unable to use JSON libraries to parse this data.
I use {0,23} in an attempt to extract the first 23 characters of the string. How can I extract the entire value associated with description?
import re
description = "'\description\" : \"this is a tesdt \n another test\" "
re.findall(r'description(?:\w+){0,23}', description, re.IGNORECASE)
The code above only displays ['description'].
Recommended answer
You can try this code:
import re
description = "description\" : \"this is a tesdt \n another test\" "
# DOTALL lets "." match the newline inside the value; ".*?" lazily stops at the closing quote.
result = re.findall(r'(?<=description")(?:\s*\:\s*)(".*?(?=")")', description, re.IGNORECASE | re.DOTALL)[0]
print(result)
The result is:
"this is a tesdt 
 another test"
Which is essentially:
\"this is a tesdt \n another test\"
And this is what you asked for in the comments.
(?<=description") is a positive look-behind that tells the regex to match the text preceded by description".
(?:\s*\:\s*) is a non-capturing group that tells the regex that description" will be followed by zero or more whitespace characters, a colon (:), and again zero or more whitespace characters.
(".*?(?=")") is the actual match desired, which consists of a double quote ("), as few characters as possible up to the next double quote, and a closing double quote ("). A bounded quantifier such as .{0,23}? would cap the value at 23 characters, which is too short for the 30-character sample value.
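As a rough sketch of how the same idea can cope with values of any length and with escaped quotes inside the value, the pattern below matches the quoted key, the colon, and then a string body made of either ordinary characters or backslash escapes. The names DESCRIPTION_RE and extract_description and the sample input are hypothetical, chosen only for illustration, and this remains a workaround under those assumptions, not a full JSON parser.

```python
import re

# Hypothetical pattern: the key in quotes, optional whitespace around the
# colon, then a quoted string whose body is captured in group 1.
# The body is any run of: a character that is neither a quote nor a
# backslash, or a backslash followed by any single character (an escape).
DESCRIPTION_RE = re.compile(
    r'"description"\s*:\s*"((?:[^"\\]|\\.)*)"',
    re.IGNORECASE | re.DOTALL,
)


def extract_description(text):
    """Return the raw (still escaped) value of the first "description" key, or None."""
    match = DESCRIPTION_RE.search(text)
    return match.group(1) if match else None


# Assumed sample input: the value contains escaped quotes and a real newline.
sample = '{"id": 7, "description" : "this is a \\"tesdt\\" \n another test", "x": 1}'
value = extract_description(sample)
print(value)
```

The captured value is returned exactly as it appears in the text, so escape sequences such as \" are not decoded. Using search with an explicit None check also avoids the IndexError that indexing an empty findall result with [0] would raise when the key is missing.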