Comma separator between JSON objects with json.dump
Question
I am fiddling around with outputting a JSON file containing some attributes of the files within a directory. My problem is that, when appending to the file, there is no separator between the objects. I could just add a comma after each 'f' and delete the last one, but that seems like a sloppy workaround to me.
import os
import os.path
import json

# Create and open file_data.txt in append mode
with open('file_data.txt', 'a') as outfile:
    files = os.listdir(os.curdir)
    for f in files:
        extension = os.path.splitext(f)[1][1:]
        base = os.path.splitext(f)[0]
        name = f
        data = {
            "file_name" : name,
            "extension" : extension,
            "base_name" : base
        }
        json.dump(data, outfile)
This outputs:
{"file_name": "contributors.txt", "base_name": "contributors", "extension": "txt"}{"file_name": "read_files.py", "base_name": "read_files", "extension": "py"}{"file_name": "file_data.txt", "base_name": "file_data", "extension": "txt"}{"file_name": ".git", "base_name": ".git", "extension": ""}
What I would like is actual JSON:
{"file_name": "contributors.txt", "base_name": "contributors", "extension": "txt"},{"file_name": "read_files.py", "base_name": "read_files", "extension": "py"},{"file_name": "file_data.txt", "base_name": "file_data", "extension": "txt"}{"file_name": ".git", "base_name": ".git", "extension": ""}
Recommended answer
What you're getting is not a JSON object, but a stream of separate JSON objects.
What you would like is still not a JSON object, but a stream of separate JSON objects with commas between them. That's not going to be any more parseable.*
* The JSON spec is simple enough to parse by hand, and it should be pretty clear that an object followed by another object with a comma in between doesn't match any valid production.
If you're trying to create a JSON array, you can do that. The obvious way, unless there are memory issues, is to build a list of dicts, then dump that all at once:
output = []
for f in files:
    # ...
    output.append(data)
json.dump(output, outfile)
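Combined with the loop from the question, the list-then-dump approach might look like the sketch below. The function names collect_file_info and dump_file_info are hypothetical, chosen only for illustration. The sketch also opens the output in 'w' mode rather than 'a', on the assumption that the file should hold exactly one array: appending a second array to the same file would again produce something that is not a single JSON document.

```python
import json
import os


def collect_file_info(directory):
    """Return one dict per directory entry, using the question's field names.

    (Hypothetical helper, not part of the original answer.)
    """
    output = []
    for name in sorted(os.listdir(directory)):
        base, ext = os.path.splitext(name)
        output.append({
            "file_name": name,
            "extension": ext[1:],  # drop the leading dot, as in the question
            "base_name": base,
        })
    return output


def dump_file_info(directory, out_path):
    """Write all entries of `directory` to `out_path` as a single JSON array."""
    records = collect_file_info(directory)
    # 'w' rather than 'a': the file should contain exactly one JSON value.
    with open(out_path, "w") as outfile:
        json.dump(records, outfile)
```

Calling dump_file_info(os.curdir, 'file_data.txt') would then produce a file that json.load can read back as a list of dicts.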
If memory is an issue, you have a few choices:
- For a quick-and-dirty solution, you can fake it by writing the "[", ",", and "]" manually. (But note that it is not valid JSON to have an extra trailing comma after the last value, even if some decoders will accept it.)
- You can wrap your loop up in a generator function that yields each data dict, and extend JSONEncoder to convert iterators to arrays. (Note that this is actually used as the example in the docs of why and how to extend JSONEncoder, although you might want to write a more memory-efficient implementation.)
- You can look for a third-party JSON library that has some kind of built-in iterative streaming API.
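The first two options can be sketched roughly as follows. The names write_json_array and IterableEncoder are hypothetical and introduced only for illustration. The first function writes the brackets and commas by hand, placing the separator before every item except the first so that no trailing comma is ever emitted. The encoder subclass follows the pattern shown in the json module documentation for JSONEncoder.default and, as noted above, is not memory-efficient because it builds the whole list first.

```python
import json


def write_json_array(items, outfile):
    """Stream an iterable to `outfile` as one JSON array, one item at a time.

    (Hypothetical helper; only one item is held in memory at once.)
    """
    outfile.write("[")
    first = True
    for item in items:
        if not first:
            # The separator goes before every item but the first,
            # so there is never a trailing comma to remove.
            outfile.write(", ")
        json.dump(item, outfile)
        first = False
    outfile.write("]")


class IterableEncoder(json.JSONEncoder):
    """Encode arbitrary iterables (such as generators) as JSON arrays."""

    def default(self, o):
        try:
            iterable = iter(o)
        except TypeError:
            # Not iterable: let the base class raise its usual TypeError.
            return super().default(o)
        # Simple, but materializes the whole iterable in memory.
        return list(iterable)
```

With a generator function that yields each dict, either write_json_array(gen(), outfile) or json.dump(gen(), outfile, cls=IterableEncoder) should produce a single valid array.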
However, it's worth considering what you're trying to do. Maybe a stream of separate JSON objects actually is the right file format/protocol/API for what you're trying to do. Because JSON is self-delimiting, there's really no reason to add a delimiter between separate values. (And it doesn't even help much with robustness, unless you use a delimiter that isn't going to show up all over the actual JSON, the way "," does.) For example, what you've got is exactly what JSON-RPC is supposed to look like. If you're just asking for something different because you don't know how to parse such a file, that's pretty easy. For example (using a string rather than a file for simplicity):
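import json

def iter_json_values(s):
    # Yield each value from a string of concatenated JSON (name is illustrative).
    i = 0
    d = json.JSONDecoder()
    while True:
        try:
            # raw_decode returns the parsed value and the index where it ended
            obj, i = d.raw_decode(s, i)
        except ValueError:
            # End of input (or malformed data): stop iterating
            return
        yield obj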
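The loop above stops silently on any ValueError and assumes the values are written back to back. A variant sketch, under the assumption that values may also be separated by whitespace (for example one object per line), is shown below. The name iter_concatenated_json is hypothetical. It skips whitespace explicitly because raw_decode does not, and it lets decoding errors propagate rather than treating them as end of input.

```python
import json


def iter_concatenated_json(s):
    """Yield each JSON value from a string of concatenated values.

    (Hypothetical helper; whitespace between values is tolerated,
    anything else raises json.JSONDecodeError, a ValueError subclass.)
    """
    decoder = json.JSONDecoder()
    i = 0
    n = len(s)
    while True:
        # raw_decode does not skip leading whitespace, so do it here.
        while i < n and s[i].isspace():
            i += 1
        if i >= n:
            return  # clean end of input
        obj, i = decoder.raw_decode(s, i)
        yield obj
```

For example, list(iter_concatenated_json('{"a": 1}{"b": 2}')) gives a list of two dicts, while a comma between the objects raises an error, which matches the point that comma-separated objects are not parseable as they stand.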