Reading a JSON file with multiple objects in Python

Problem description

I'm fairly new to programming and Python. I know there are many explanations in previous questions about this, but I read all of them carefully and didn't find a solution.
I'm trying to read a JSON file which contains about 1 billion records like this:

334465|{"color":"33ef","age":"55","gender":"m"}
334477|{"color":"3444","age":"56","gender":"f"}
334477|{"color":"3999","age":"70","gender":"m"}

I have been trying to get past the 6-digit number at the beginning of each line, but I don't know how to read multiple JSON objects. Here is my code, but I can't figure out why it isn't working:

import json

T =[]
s = open('simple.json', 'r')
ss = s.read()
for line in ss:
    line = ss[7:]
    T.append(json.loads(line))
s.close()

This is the error I get:

ValueError: Extra Data: line 3 column 1 - line 5 column 48 (char 42 - 138)

Any suggestions would be very helpful!

Recommended answer

There are several problems with the logic of your code.

ss = s.read()

reads the entire file s into a single string. The next line

for line in ss:

iterates over each character in that string, one by one. So on each iteration, line is a single character. In

    line = ss[7:]

you take the entire file contents apart from the first 7 characters (positions 0 through 6, inclusive) and replace the previous content of line with that. And then

T.append(json.loads(line))

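attempts to parse that string as JSON and store the resulting object in the T list. Because the string still holds every remaining line of the file, json.loads finds more text after the first complete object and raises the "Extra data" error.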
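The two behaviours described above can be checked in isolation. This is only a minimal sketch: the string ss below is a hypothetical in-memory stand-in for the contents of simple.json, built from the sample rows in the question.

```python
import json

# Hypothetical stand-in for s.read(): two of the sample rows from the question.
ss = (
    '334465|{"color":"33ef","age":"55","gender":"m"}\n'
    '334477|{"color":"3444","age":"56","gender":"f"}\n'
)

# Iterating over a string yields one character at a time, not one line.
first_three = [ch for ch in ss][:3]

# ss[7:] is everything after the first prefix: one complete JSON object
# followed by the rest of the file, so json.loads rejects it.
try:
    json.loads(ss[7:])
    error_message = None
except ValueError as exc:  # json.JSONDecodeError is a subclass of ValueError
    error_message = str(exc)

print(first_three)
print(error_message)
```

The loop in the question therefore attempts the same failing json.loads call regardless of which character line currently holds.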

Here's some code that does what you want. We don't need to read the entire file into a string with .read, or into a list of lines with .readlines; we can simply put the file handle in a for loop, which iterates over the file line by line.

We use a with statement to open the file, so that it gets closed automatically when we exit the with block, including when an IO error is raised inside it.

import json

table = []
with open('simple.json', 'r') as f:
    for line in f:
        table.append(json.loads(line[7:]))

for row in table:
    print(row)

Output

{'color': '33ef', 'age': '55', 'gender': 'm'}
{'color': '3444', 'age': '56', 'gender': 'f'}
{'color': '3999', 'age': '70', 'gender': 'm'}
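The slice line[7:] relies on every prefix being exactly six digits plus the | separator. If the prefixes could vary in length (an assumption; the question only shows six-digit ones), splitting on the first | is a more forgiving variant. In this sketch, io.StringIO stands in for the open file and the helper name parse_line is made up for illustration.

```python
import io
import json


def parse_line(line):
    """Split 'ID|{json}' on the first '|' and decode the JSON part."""
    prefix, _, payload = line.partition('|')
    return prefix, json.loads(payload)


# Stand-in for open('simple.json'); the rows are sample data from the question.
sample = io.StringIO(
    '334465|{"color":"33ef","age":"55","gender":"m"}\n'
    '334477|{"color":"3444","age":"56","gender":"f"}\n'
)

# Skip blank lines, which would otherwise make json.loads fail.
rows = [parse_line(line) for line in sample if line.strip()]

for prefix, obj in rows:
    print(prefix, obj)
```

This variant also keeps the numeric prefix, which may be useful if it identifies the record.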

We can make this more compact by building the table list with a list comprehension:

import json

with open('simple.json', 'r') as f:
    table = [json.loads(line[7:]) for line in f]

for row in table:
    print(row)
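With roughly a billion lines, holding every decoded object in table may not fit in memory. One option, sketched here under the assumption that the rows can be processed one at a time, is a generator that yields each object as it is read. The function name iter_rows and the gender count are illustrative only, and io.StringIO again stands in for the real file.

```python
import io
import json


def iter_rows(lines):
    """Yield one decoded object per non-empty line, skipping the 7-character prefix."""
    for line in lines:
        if line.strip():
            yield json.loads(line[7:])


# Stand-in for open('simple.json'); the rows are the sample data from the question.
sample = io.StringIO(
    '334465|{"color":"33ef","age":"55","gender":"m"}\n'
    '334477|{"color":"3444","age":"56","gender":"f"}\n'
    '334477|{"color":"3999","age":"70","gender":"m"}\n'
)

# Process rows as they stream by instead of storing them all in a list.
males = sum(1 for row in iter_rows(sample) if row["gender"] == "m")
print(males)
```

With a real file, the open file object can be passed directly, for example iter_rows(f) inside the with block.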
