Extract json values using just regex


Question

I have a description field embedded in JSON, and I'm unable to use JSON libraries to parse this data.

I use {0,23} in an attempt to extract the first 23 characters of the string. How can I extract the entire value associated with description?

    import re

    description = "'\\description\" : \"this is a tesdt \n another test\" "

    re.findall(r'description(?:\w+){0,23}', description, re.IGNORECASE)

The code above only returns ['description'].
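A minimal sketch of why the attempt stops after the keyword, assuming the sample string from the question. The names `first_attempt` and `next_char` are introduced here for illustration only.

```python
import re

# Sample text from the question; the backslash is doubled so Python sees a
# literal backslash without an invalid-escape warning.
description = "'\\description\" : \"this is a tesdt \n another test\" "

# \w matches only letters, digits, and underscore, so (?:\w+){0,23} cannot
# step over the double quote that follows the keyword and repeats zero times.
first_attempt = re.findall(r'description(?:\w+){0,23}', description, re.IGNORECASE)

# The character right after the keyword is the quote that blocks \w.
next_char = description[description.index('description') + len('description')]

print(first_attempt)
print(repr(next_char))
```

Note also that {0,23} here repeats the whole group (?:\w+), not a single character, so it would not limit the match to 23 characters even on plain word text.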

Recommended answer

You can try this code:

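import re

description = "description\" : \"this is a tesdt \n another test\" "

# .*? is used where the pattern originally had .{0,23}?: the sample value is
# 30 characters long, so a 23-character cap never reaches the closing quote,
# findall returns an empty list, and [0] raises IndexError.
result = re.findall(r'(?<=description")(?:\s*\:\s*)(".*?(?=")")', description, re.IGNORECASE | re.DOTALL)[0]

print(result)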

The result is:

"this is a tesdt 
 another test"

Which is essentially:

\"this is a tesdt \n another test\"

This is what you asked for in the comments.

(?<=description") is a positive lookbehind: the match must be immediately preceded by description".
(?:\s*\:\s*) is a non-capturing group: description" is followed by zero or more whitespace characters, a colon (:), and again zero or more whitespace characters.
(".{0,23}?(?=")"), as originally written, is the actual match desired: a double quote ("), zero to twenty-three characters matched lazily, and a closing double quote ("). The {0,23} cap rejects any value longer than 23 characters, and the sample value is 30 characters long, so .*? must take the place of .{0,23}? to capture the entire value. re.DOTALL lets . match the newline inside the value.
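The pattern described above ends the value at the first double quote it meets, so a value containing an escaped quote (\") would be cut short. The following is a minimal sketch of one way to handle that case with regex alone. The helper `extract_value` and the sample strings are hypothetical and not part of the original answer, and the sketch assumes the key appears as a quoted name followed by a colon and a string value.

```python
import re


def extract_value(text, key):
    """Return the raw string value that follows "key" :, or None if absent.

    Hypothetical helper for illustration; escape sequences are returned
    as written in the text and are not decoded.
    """
    pattern = (
        r'"' + re.escape(key) + r'"'    # the quoted key
        + r'\s*:\s*'                     # colon with optional whitespace
        + r'"((?:[^"\\]|\\.)*)"'         # value: ordinary chars or backslash escapes
    )
    match = re.search(pattern, text, re.IGNORECASE | re.DOTALL)
    return match.group(1) if match else None


# Assumed sample inputs, modeled on the string from the question.
sample = '{"description" : "this is a tesdt \n another test", "id": 7}'
escaped = r'{"description": "say \"hi\" now"}'

print(extract_value(sample, "description"))
print(extract_value(escaped, "description"))
print(extract_value(sample, "missing"))
```

The capture group sits inside the quotes, so the surrounding double quotes are not part of the result. This remains a sketch rather than a JSON parser: it does not check nesting, and it would also match the key name if the same quoted text followed by a colon and a string appeared inside another value.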
