Loading and parsing a JSON file with multiple JSON objects


Question

I am trying to load and parse a JSON file in Python, but I am stuck at the step of loading the file:

import json
json_data = open('file')
data = json.load(json_data)

This yields:

ValueError: Extra data: line 2 column 1 - line 225116 column 1 (char 232 - 160128774)

I looked at section 18.2 of the Python documentation (the json module, JSON encoder and decoder), but it is pretty discouraging to read through this horrible-looking documentation.

First few lines of the file (anonymized with randomized entries):

{"votes": {"funny": 2, "useful": 5, "cool": 1}, "user_id": "harveydennis", "name": "Jasmine Graham", "url": "http://example.org/user_details?userid=harveydennis", "average_stars": 3.5, "review_count": 12, "type": "user"}
{"votes": {"funny": 1, "useful": 2, "cool": 4}, "user_id": "njohnson", "name": "Zachary Ballard", "url": "https://www.example.com/user_details?userid=njohnson", "average_stars": 3.5, "review_count": 12, "type": "user"}
{"votes": {"funny": 1, "useful": 0, "cool": 4}, "user_id": "david06", "name": "Jonathan George", "url": "https://example.com/user_details?userid=david06", "average_stars": 3.5, "review_count": 12, "type": "user"}
{"votes": {"funny": 6, "useful": 5, "cool": 0}, "user_id": "santiagoerika", "name": "Amanda Taylor", "url": "https://www.example.com/user_details?userid=santiagoerika", "average_stars": 3.5, "review_count": 12, "type": "user"}
{"votes": {"funny": 1, "useful": 8, "cool": 2}, "user_id": "rodriguezdennis", "name": "Jennifer Roach", "url": "http://www.example.com/user_details?userid=rodriguezdennis", "average_stars": 3.5, "review_count": 12, "type": "user"}

Recommended answer

You have a JSON Lines format text file: every line is a separate JSON document. You need to parse the file line by line:

import json

data = []
with open('file') as f:
    for line in f:
        data.append(json.loads(line))

Each line contains valid JSON, but the file as a whole is not a valid JSON value, because there is no top-level list or object wrapping the lines. That is why json.load() stops with "Extra data" as soon as it reaches line 2.
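The difference can be reproduced on a tiny in-memory sample. This is only an illustrative sketch: the two-line `text` string is made up here and stands in for the real file.

```python
import json

# Hypothetical two-line sample standing in for the real file contents.
text = '{"type": "user"}\n{"type": "user"}\n'

try:
    json.loads(text)           # the whole text is not one JSON value
    whole_ok = True
except ValueError:             # json.JSONDecodeError subclasses ValueError
    whole_ok = False

# Parsing each line on its own works, because each line is valid JSON.
per_line = [json.loads(line) for line in text.splitlines()]
```

After running this, `whole_ok` is `False` while `per_line` holds two dictionaries.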

Note that because the file contains one JSON document per line, you are spared the headache of parsing it all in one go or figuring out a streaming JSON parser. You can process each line separately before moving on to the next, which saves memory. If your file is really big, you probably don't want to append every result to one list and only then process everything.
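One way to process records one at a time is a generator. The following is a minimal sketch, not part of the original answer: the function name `iter_json_lines`, the UTF-8 encoding, and the choice to skip blank lines are all assumptions.

```python
import json

def iter_json_lines(path):
    """Yield one parsed object per non-empty line of a JSON Lines file."""
    # Assumption: the file is UTF-8 encoded.
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            yield json.loads(line)
```

A caller could then write `for record in iter_json_lines('file'): ...` and keep only one parsed record in memory at a time.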

If you instead have a file containing individual JSON objects with delimiters in between, see the question "How do I use the 'json' module to read in one JSON object at a time?" for a way to parse out individual objects using a buffered method.
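For the simpler case where the objects are separated only by whitespace and the whole text fits in memory, `json.JSONDecoder.raw_decode` can pull values out one after another. This is a rough sketch under those assumptions, not the buffered approach from the linked question; the function name `iter_concatenated_json` is made up for illustration.

```python
import json

def iter_concatenated_json(text):
    """Yield successive JSON values from a string of whitespace-separated values."""
    decoder = json.JSONDecoder()
    pos = 0
    length = len(text)
    while pos < length:
        # raw_decode does not skip leading whitespace, so do it by hand.
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        # Returns the parsed value and the index where parsing stopped.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj
```

Other delimiters (commas, record separators) or files too large to read at once would need the buffered handling described in the linked question.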
