Issue parsing multiline JSON file using Python


Question

I am trying to parse a multiline JSON file using the json library in Python 2.7. A simplified sample file is given below:

{
"observations": {
    "notice": [
        {
            "copyright": "Copyright Commonwealth of Australia 2015, Bureau of Meteorology. For more information see: http://www.bom.gov.au/other/copyright.shtml http://www.bom.gov.au/other/disclaimer.shtml",
            "copyright_url": "http://www.bom.gov.au/other/copyright.shtml",
            "disclaimer_url": "http://www.bom.gov.au/other/disclaimer.shtml",
            "feedback_url": "http://www.bom.gov.au/other/feedback"
        }
    ]
}
}

My code is as follows:

import json

with open('test.json', 'r') as jsonFile:
    for jf in jsonFile:
        jf = jf.replace('\n', '')
        jf = jf.strip()
        weatherData = json.loads(jf)
        print weatherData

Nevertheless, I get the error shown below:

Traceback (most recent call last):
  File "test.py", line 8, in <module>
    weatherData = json.loads(jf)
  File "/home/usr/anaconda2/lib/python2.7/json/__init__.py", line 339, in loads
    return _default_decoder.decode(s)
  File "/home/usr/anaconda2/lib/python2.7/json/decoder.py", line 364, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/home/usr/anaconda2/lib/python2.7/json/decoder.py", line 380, in raw_decode
    obj, end = self.scan_once(s, idx)
ValueError: Expecting object: line 1 column 1 (char 0)

As a test, I modified the code so that, after removing newlines and stripping the leading and trailing whitespace from each line, it writes the contents to another file (with the json extension). Surprisingly, when I read that second file back, I get no error and the parsing succeeds. The modified code is as follows:

import json

filewrite = open('out.json', 'w+')

with open('test.json', 'r') as jsonFile:
    for jf in jsonFile:
        jf = jf.replace('\n', '')
        jf = jf.strip()
        filewrite.write(jf)

filewrite.close()

with open('out.json', 'r') as newJsonFile:
    for line in newJsonFile:
        weatherData = json.loads(line)
        print weatherData

The output is as follows:

{u'observations': {u'notice': [{u'copyright_url': u'http://www.bom.gov.au/other/copyright.shtml', u'disclaimer_url': u'http://www.bom.gov.au/other/disclaimer.shtml', u'copyright': u'Copyright Commonwealth of Australia 2015, Bureau of Meteorology. For more information see: http://www.bom.gov.au/other/copyright.shtml http://www.bom.gov.au/other/disclaimer.shtml', u'feedback_url': u'http://www.bom.gov.au/other/feedback'}]}}

Any idea what might be going on when newlines and whitespace are stripped before using the json library?

Recommended answer

You will go crazy if you try to parse a JSON file line by line. The json module has helper functions that read either a file object or a string directly: load and loads. load takes a file object (as shown below) for a file that contains JSON data, while loads takes a string that contains JSON data.

Option 1 (preferred):

import json
with open('test.json', 'r') as jf:
    weatherData = json.load(jf)
    print weatherData

Option 2:

import json
with open('test.json', 'r') as jf:
    weatherData = json.loads(jf.read())
    print weatherData
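
To see why the loop in the question fails while the second script works, it may help to reproduce both cases on a small in-memory string. The sketch below is an illustration only: the sample text is a shortened stand-in for test.json, and the variable names are made up for this example. It uses print() with a single argument so it should behave the same under Python 2.7 and Python 3.

```python
import json

# Shortened stand-in for the multi-line test.json from the question (assumed shape).
text = '{\n"observations": {\n    "notice": []\n}\n}\n'

# Case 1: parse line by line, as the first script does.
# The first line is just "{", which is not a complete JSON document.
first_line = text.splitlines()[0].strip()
try:
    json.loads(first_line)
    line_by_line_ok = True
except ValueError as exc:  # json.JSONDecodeError is a ValueError subclass in Python 3
    line_by_line_ok = False
    print("line-by-line parse failed: %s" % exc)

# Case 2: parse the whole document at once, as json.load / json.loads(jf.read()) do.
whole = json.loads(text)
print(whole)

# Case 3: what the second script effectively does. Writing every stripped
# line without a newline produces a one-line file, so its "for line in file"
# loop runs once and that single line holds the entire document.
joined = "".join(line.strip() for line in text.splitlines())
print(json.loads(joined) == whole)
```

In other words, stripping newlines did not fix anything by itself; the second script only worked because the whole document ended up on a single line before json.loads saw it.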

If you are looking for higher-performance JSON parsing, check out ujson.
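
A minimal sketch of swapping in ujson, assuming it has been installed separately (it is a third-party package, not part of the standard library). The alias fast_json and the sample string are hypothetical names chosen for this example, and the fallback to the standard json module is an added convenience, not something the answer prescribes.

```python
import json

try:
    # Third-party package; assumed to be installed, e.g. with pip.
    import ujson as fast_json
except ImportError:
    # Fall back to the standard library so the example still runs.
    fast_json = json

# Small sample document using one of the URLs from the question's file.
text = '{"observations": {"notice": [{"feedback_url": "http://www.bom.gov.au/other/feedback"}]}}'

# loads is available in both modules, so the calling code does not change.
weather_data = fast_json.loads(text)
print(weather_data["observations"]["notice"][0]["feedback_url"])
```

Whether this is actually faster depends on the data and the installed versions, so it is worth timing both on a representative file before switching.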
