Handling lazy JSON in Python - 'Expecting property name'

Problem description

Using Python's (2.7) 'json' module, I'm looking to process various JSON feeds. Unfortunately, some of these feeds do not conform to the JSON standard: specifically, some keys are not wrapped in double quotation marks ("). This causes Python to raise an error.

Before writing an ugly-as-hell piece of code to parse and repair the incoming data, I thought I'd ask: is there any way to have Python either parse this malformed JSON or 'repair' the data so that it becomes valid JSON?

Working example

>>> import json
>>> json.loads('{"key1":1,"key2":2,"key3":3}')
{'key3': 3, 'key2': 2, 'key1': 1}

Broken example

>>> import json
>>> json.loads('{key1:1,key2:2,key3:3}')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\json\__init__.py", line 310, in loads
    return _default_decoder.decode(s)
  File "C:\Python27\lib\json\decoder.py", line 346, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Python27\lib\json\decoder.py", line 362, in raw_decode
    obj, end = self.scan_once(s, idx)
ValueError: Expecting property name: line 1 column 1 (char 1)
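The failure above is an ordinary ValueError (in Python 3 it is json.JSONDecodeError, which subclasses ValueError), so a caller can detect a non-conforming feed without crashing. A minimal sketch follows; the helper name try_parse is hypothetical and not part of the json module.

```python
import json


def try_parse(text):
    """Return (data, None) on success, or (None, error message) on failure.

    try_parse is a hypothetical helper used only for illustration.
    """
    try:
        return json.loads(text), None
    except ValueError as exc:
        # Python 3 raises json.JSONDecodeError, a subclass of ValueError,
        # so this clause covers both Python 2.7 and Python 3.
        return None, str(exc)


data, err = try_parse('{key1:1,key2:2,key3:3}')
if err is not None:
    print("feed is not valid JSON: %s" % err)
```

The exact wording of the error message differs between Python versions, so it is safer to branch on whether an error occurred than on its text.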

I've written a small regex to fix the JSON coming from this particular provider, but I foresee this being an issue in the future. Below is what I came up with.

>>> import re
>>> s = '{key1:1,key2:2,key3:3}'
>>> s = re.sub(r'([{,])([^{:\s"]*):', lambda m: '%s"%s":' % (m.group(1), m.group(2)), s)
>>> s
'{"key1":1,"key2":2,"key3":3}'
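One way to keep such a repair from touching feeds that are already valid is to try the strict parser first and apply the regex only when it fails. The sketch below reuses the same pattern; the function name loads_lenient is hypothetical, and the approach assumes simple, flat objects like the one shown above.

```python
import json
import re

# Same pattern as in the question: an unquoted key that directly follows
# '{' or ',' and runs up to the next ':'.
_UNQUOTED_KEY = re.compile(r'([{,])([^{:\s"]*):')


def loads_lenient(text):
    """Parse strict JSON; on failure, quote bare keys and retry once.

    loads_lenient is a hypothetical name used for illustration.
    """
    try:
        return json.loads(text)
    except ValueError:
        repaired = _UNQUOTED_KEY.sub(r'\1"\2":', text)
        # If the repair was not enough, this second call raises
        # ValueError and the caller sees the failure.
        return json.loads(repaired)


print(loads_lenient('{key1:1,key2:2,key3:3}'))
```

Because the pattern requires the key to follow '{' or ',' immediately, input with whitespace before a bare key (for example '{ key1: 1}') is not repaired and still raises ValueError.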

Recommended answer

You're trying to use a JSON parser to parse something that isn't JSON. Your best bet is to get the creator of the feeds to fix them.

I understand that isn't always possible. You might be able to fix the data using regexes, depending on how broken it is:

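import re

# j holds the raw feed text.
j = re.sub(r"{\s*(\w)", r'{"\1', j)  # opening quote for the first key
j = re.sub(r",\s*(\w)", r',"\1', j)  # opening quote for later keys
j = re.sub(r"(\w):", r'\1":', j)     # closing quote before each colon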
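How broken the data is matters because these substitutions look only at characters, not at JSON structure. The sketch below wraps the three calls in a hypothetical helper, quote_keys, and shows one input they repair and one they corrupt: a string value that contains a colon.

```python
import json
import re


def quote_keys(j):
    """Apply the three substitutions from the answer to the text j.

    quote_keys is a hypothetical wrapper used for illustration.
    """
    j = re.sub(r"{\s*(\w)", r'{"\1', j)
    j = re.sub(r",\s*(\w)", r',"\1', j)
    j = re.sub(r"(\w):", r'\1":', j)
    return j


# Bare keys with numeric values are repaired as intended.
fixed = quote_keys('{key1:1,key2:2,key3:3}')
print(json.loads(fixed))

# A colon inside a string value is mistaken for a key separator, so
# input that was already valid JSON becomes invalid.
damaged = quote_keys('{"t":"12:30"}')
print(damaged)  # {"t":"12":30"}
```

For this reason it is safer to attempt json.loads on the original text first and fall back to the substitutions only when that raises ValueError.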
