Python - Parsing JSON formatted text file with regex

Views: 310 Published: 2021/2/13 20:05:26 Tags: python json regex python-2.7


Question

I have a text file formatted like a JSON file, but everything is on a single line (it could be a MongoDB file). Could someone please point me in the direction of how I could extract values using a Python regex method?

The text looks like this:

{"d":{"__type":"WikiFileNodeContent:http:\/\/samplesite.com.au\/ns\/business\/wiki","author":null,"description":null,"fileAssetId":"034b9317-60d9-45c2-b6d6-0f24b59e1991","filename":"Reports.pdf"},"createdBy":1531,"createdByUsername":"John Cash","icon":"\/Assets10.37.5.0\/pix\/16x16\/page_white_acrobat.png","id":3041,"inheritedPermissions":false,"name":"map","permissions":[23,87,35,49,65],"type":3,"viewLevel":2},{"__type":"WikiNode:http:\/\/samplesite.com.au\/ns\/business\/wiki","children":[],"content":

I want to get the "fileAssetId" and the "filename". I've tried to load the file with Python's json module, but I get an error.

For the fileAssetId I tried this regex:

regex = re.compile(r"([0-9a-f]{8})\S*-\S*([0-9a-f]{4})\S*-\S*([0-9a-f]{4})\S*-\S*([0-9a-f]{4})\S*-\S*([0-9a-f]{12})")

But I get the following: 034b9317, 60d9, 45c2, b6d6, 0f24b59e1991

I'm not too sure how to get the data as it's displayed.

Recommended answer

How about using positive lookahead and lookbehind:

(?<=\"fileAssetId\":\")[a-fA-F0-9-]+?(?=\")

to capture the fileAssetId, and

(?<=\"filename\":\").+?(?=\")

to match the filename.
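As a minimal sketch of how these two patterns might be applied in Python: the `sample` string below is a shortened copy of the text from the question, and the variable names are only illustrative.

```python
import re

# Shortened sample of the single-line text from the question (illustrative only).
sample = (
    '{"d":{"__type":"WikiFileNodeContent","author":null,"description":null,'
    '"fileAssetId":"034b9317-60d9-45c2-b6d6-0f24b59e1991",'
    '"filename":"Reports.pdf"},"createdBy":1531}'
)

# The lookbehind anchors the match right after the key and its opening quote;
# the lookahead stops the match just before the closing quote.
asset_id_pattern = re.compile(r'(?<=\"fileAssetId\":\")[a-fA-F0-9-]+?(?=\")')
filename_pattern = re.compile(r'(?<=\"filename\":\").+?(?=\")')

# search() scans the whole string, unlike match(), which only tries position 0.
asset_id_match = asset_id_pattern.search(sample)
filename_match = filename_pattern.search(sample)

# Guard against a missing key: search() returns None when nothing matches.
asset_id = asset_id_match.group(0) if asset_id_match else None
filename = filename_match.group(0) if filename_match else None

print(asset_id)  # 034b9317-60d9-45c2-b6d6-0f24b59e1991
print(filename)  # Reports.pdf
```

Because the lookarounds are zero-width, the key and the quotes are not part of the match, so `group(0)` is the bare value.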

For a detailed explanation of the regex, have a look at the Regex101 example. (Note: in the example I combined both patterns with the OR operator | to show both matches at once.)

To get a list of all matches, use re.findall or re.finditer instead of re.match.

re.findall(pattern, string) returns a list of matching strings.

re.finditer(pattern, string) returns an iterator that yields match objects.
