Splitting / chunking JSON files with jq in Bash or Fish shell?
Question
I have been using the wonderful jq library to parse and extract JSON data so that it can be re-imported. Extracting a single range is easy enough, but I am unsure how to loop over the file in a script and detect when the end of the data has been reached, preferably in a bash or fish shell script.
Given a JSON file whose data is wrapped in a "results" dictionary, how can I detect the end of the file?
From testing, I can see that once the range runs past the end of the data I get an empty array nested in my desired structure, but how can that end-of-file condition be detected in a script? This is the command I use to extract one range:
jq '{ "results": .results[0:500] }' Foo.json > 0000-0500/Foo.json
Thanks!
Recommended answer
I'd recommend using jq to split the array into a stream of the JSON objects you want (one per line), and then using some other tool (e.g. awk) to populate the files. Here's how the first part can be done:
# Emit successive slices of the input array, n items at a time.
# A negative n walks the array from the end instead.
def splitup(n):
  def _split:
    if length == 0 then empty
    else .[0:n], (.[n:] | _split)
    end;
  if n == 0 then empty
  elif n > 0 then _split
  else reverse | splitup(-n)
  end;

# For the sake of illustration:
def data: { results: [range(0; 20)] };

data | .results | { results: splitup(5) }
Invocation:
$ jq -nc -f splitup.jq
{"results":[0,1,2,3,4]}
{"results":[5,6,7,8,9]}
{"results":[10,11,12,13,14]}
{"results":[15,16,17,18,19]}
For the second part, you could (for example) pipe the jq output to:
awk '{ file="file."++n; print > file; close(file); }'
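Putting the two parts together, a minimal end-to-end sketch might look like the following. It assumes jq and awk are installed; the scratch directory, the generated sample Foo.json, and the chunk.N.json output names are hypothetical choices made for illustration, not part of the original answer.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical scratch directory and sample input, for illustration only.
workdir=$(mktemp -d)
cd "$workdir"
jq -nc '{results: [range(0; 20)]}' > Foo.json

# Part 1: emit one {"results": [...]} object per line, 5 elements each.
# Part 2: awk writes each line to its own numbered file.
jq -c '
  def splitup(n):
    def _split:
      if length == 0 then empty
      else .[0:n], (.[n:] | _split)
      end;
    if n == 0 then empty elif n > 0 then _split else reverse | splitup(-n) end;
  .results | {results: splitup(5)}
' Foo.json |
  awk '{ file = "chunk." (++n) ".json"; print > file; close(file) }'

# List the files that were produced.
ls chunk.*.json
```

Because _split stops as soon as the remaining array is empty, no explicit end-of-file check is needed in the shell: the stream simply ends after the last non-empty chunk.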
A variant you might be interested in would have the jq filter emit both the filename and the JSON on alternate lines; the awk script would then read the filename as well.
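One way this variant could be sketched is shown below. The START-END.json naming scheme, the chunk size of 5, and the generated sample input are assumptions made for illustration; the original answer does not specify them.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical scratch directory and sample input, for illustration only.
workdir=$(mktemp -d)
cd "$workdir"
jq -nc '{results: [range(0; 12)]}' > Foo.json

# For each chunk of $n items, print a filename line followed by a JSON line.
# -r prints the filename without quotes; tojson keeps each chunk on one line.
# The name uses the nominal range, so the last file may hold fewer items.
jq -r --argjson n 5 '
  .results as $r
  | range(0; ($r | length); $n) as $i
  | "\($i)-\($i + $n).json",
    ({results: $r[$i:($i + $n)]} | tojson)
' Foo.json |
  awk 'NR % 2 == 1 { file = $0; next }   # odd lines: remember the filename
       { print > file; close(file) }'     # even lines: write the JSON
```

Here range stops at the length of the array, so the loop ends with the last non-empty chunk and no trailing empty "results" array is written.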