"/usr/lib/python3.6/json/__init__.py", line 296, in load return loads(fp.read(), MemoryError
Problem description
I have a large JSON file (2.4 GB) that I want to parse in Python. The data looks like this:
[
    {
        "host": "a.com",
        "ip": "1.2.2.3",
        "port": 8
    },
    {
        "host": "b.com",
        "ip": "2.5.0.4",
        "port": 3
    },
    {
        "host": "c.com",
        "ip": "9.17.6.7",
        "port": 4
    }
]
I run this Python script, parser.py, to load the data for parsing:
import json
from pprint import pprint

with open('mydata.json') as f:
    data = json.load(f)  # reads the entire file into memory at once
It fails with:
Traceback (most recent call last):
  File "parser.py", line xx, in <module>
    data = json.load(f)
  File "/usr/lib/python3.6/json/__init__.py", line 296, in load
    return loads(fp.read(),
MemoryError
1) Can you please advise me how to load large files for parsing without such an error?
2) Are there any alternative methods?
Recommended answer
The file is too large to read into memory in one piece. As the traceback shows, json.load calls fp.read() on the whole file before parsing it, so you need to process the file a section at a time instead.
I would recommend using ijson or json-streamer, which parse the JSON file iteratively instead of trying to load the whole file into memory at once.
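To make "iteratively" concrete, here is a minimal sketch that uses only the standard library. It is a hand-rolled illustration, not how ijson or json-streamer work internally. It assumes the file holds one top-level JSON array, as in the question. The function name iter_json_array and the chunk_size default are made up for this example.

```python
import json


def iter_json_array(path, chunk_size=65536):
    """Yield the elements of a top-level JSON array one at a time.

    Only the current chunk plus any unfinished element is kept in memory.
    Assumption: the file contains a single top-level array.
    """
    decoder = json.JSONDecoder()
    with open(path, encoding="utf-8") as f:
        buf = f.read(chunk_size)
        pos = 0
        opened = False  # becomes True once the opening '[' has been consumed
        while True:
            # Skip whitespace and the commas separating elements
            # (lenient: repeated commas are not reported as errors).
            while pos < len(buf) and buf[pos] in " \t\r\n,":
                pos += 1
            if pos < len(buf):
                if not opened:
                    if buf[pos] != "[":
                        raise ValueError("expected a top-level JSON array")
                    opened = True
                    pos += 1
                    continue
                if buf[pos] == "]":
                    return  # end of the array
                try:
                    item, end = decoder.raw_decode(buf, pos)
                except json.JSONDecodeError:
                    end = None  # element is probably cut off at the chunk boundary
                # Accept a value only if something follows it in the buffer,
                # so a number split across two chunks is not yielded early.
                if end is not None and end < len(buf):
                    yield item
                    pos = end
                    continue
            # Need more data: drop what was consumed and append the next chunk.
            chunk = f.read(chunk_size)
            if not chunk:
                raise ValueError("unexpected end of file inside JSON array")
            buf = buf[pos:] + chunk
            pos = 0

# Usage (file name taken from the question):
# for entry in iter_json_array('mydata.json'):
#     print(entry['host'], entry['ip'], entry['port'])
```

A malformed element makes this sketch keep reading until the end of the file before it raises, so a dedicated streaming parser is still the safer choice for real data.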
Here's an example of using ijson:
import ijson

entry = {}  # Keeps track of values for the current JSON item
# Open in binary mode, which is what ijson expects, and close the file when done
with open('mydata.json', 'rb') as f:
    for prefix, event, value in ijson.parse(f):
        if (prefix, event) == ('item', 'start_map'):
            entry = {}  # Start of a new JSON item
        elif prefix.endswith('.host'):
            entry['host'] = value  # Add value to entry
        elif prefix.endswith('.ip'):
            entry['ip'] = value
        elif prefix.endswith('.port'):
            entry['port'] = value
        elif (prefix, event) == ('item', 'end_map'):
            print(entry)  # Do something with the complete entry object
Each prefix is the path to the item currently being iterated over in the JSON; for the data above, the host field of an array element has the prefix item.host. The event is used to detect the start and end of maps or arrays (start_map, end_map, start_array, end_array). The value holds the value of the current item and is None for the start/end events.