Converting JSON with arrays to CSV using jq
Question
I found myself in the world of JSON and I'm trying to convert out of it using jq. I'm trying to convert the following structure to CSV:
{
  "Action": "A1",
  "Group": [
    {
      "Id": "10",
      "Units": [
        "1"
      ]
    }
  ]
}
{
  "Action": "A2",
  "Group": [
    {
      "Id": "11",
      "Units": [
        "2"
      ]
    },
    {
      "Id": "20",
      "Units": []
    }
  ]
}
{
  "Action": "A1",
  "Group": [
    {
      "Id": "26",
      "Units": [
        "1",
        "3"
      ]
    }
  ]
}
{
  "Action": "A3",
  "Group": null
}
where the Ids are in the range 10-99 and the Units in the range 1-5. The expected output would be (quoted or unquoted, comma-separated or not; I used pipe separators for clarity):
Action|Group|Unit1|Unit2|Unit3|Unit4|Unit5
A1|10|1|0|0|0|0
A2|11|0|1|0|0|0
A2|20|0|0|0|0|0
A1|26|1|0|1|0|0
A3|0|0|0|0|0|0
I've played around with this for a while now (history | grep jq | wc -l says 107) but haven't made any real progress in combining the keys with each other; I'm basically just getting lists of keys (jq n00b).
Update:
While testing the solution (sorry, I've been a bit slow) I noticed that the data also has records with "Group": null, i.e.:
{
  "Action": "A3",
  "Group": null
}
(the few lines above were added to the main test data set), which results in the error: jq: error (at file.json:61): Cannot iterate over null (null). The expected output would be:
A3|0|0|0|0|0|0
Is there an easy way out of that one?
Answer
Here is a general solution if the set of unit columns isn't known in advance:
def normalize: [            # convert input to array of flattened objects e.g.
    inputs                  # [{"Action":"A1","Group":"10","Unit1":"1"}, ...]
  | .Action as $a
  | .Group[]
  |   {Action:$a, Group:.Id}
    + reduce .Units[] as $u ({};.["Unit\($u)"]="1")
  ];

def columns:                # compute column names
  [ .[] | keys[] ] | unique ;

def rows($names):           # generate row arrays
    .[] | [ .[$names[]] ] | map( .//"0" );

normalize | columns as $names | $names, rows($names) | join("|")
Sample run (assumes the filter is in filter.jq and the data, without the "Group": null record, in data.json):
$ jq -Mnr -f filter.jq data.json
Action|Group|Unit1|Unit2|Unit3
A1|10|1|0|0
A2|11|0|1|0
A2|20|0|0|0
A1|26|1|0|1
In this specific problem the ordering done by unique matches the column order we want. If that were not the case, columns would be more complicated.
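For readers who want to cross-check the logic outside jq, the three steps above (normalize, columns, rows) can be sketched in Python. This is only an illustration and not part of the original answer: the helper names read_json_stream and to_lines are made up here, and the explicit column ordering (Action, Group, then Unit columns sorted numerically) is one assumed way of handling the case where a plain sort would not give the desired order.

```python
import json

def read_json_stream(text):
    """Yield each top-level JSON value from concatenated JSON text."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # Skip whitespace between values; raw_decode expects to start on a value.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        value, pos = decoder.raw_decode(text, pos)
        yield value

def normalize(records):
    """Flatten each (Action, Group entry) pair into one dict per output row."""
    rows = []
    for rec in records:
        for group in rec["Group"]:  # like .Group[], this fails if Group is null
            row = {"Action": rec["Action"], "Group": group["Id"]}
            for unit in group["Units"]:
                row[f"Unit{unit}"] = "1"
            rows.append(row)
    return rows

def columns(rows):
    """Fixed leading columns, then the Unit columns in numeric order."""
    units = {key for row in rows for key in row if key.startswith("Unit")}
    return ["Action", "Group"] + sorted(units, key=lambda k: int(k[4:]))

def to_lines(rows):
    """Yield the header line, then one pipe-separated line per row."""
    names = columns(rows)
    yield "|".join(names)
    for row in rows:
        yield "|".join(row.get(name, "0") for name in names)

sample = '''
{"Action": "A1", "Group": [{"Id": "10", "Units": ["1"]}]}
{"Action": "A2", "Group": [{"Id": "11", "Units": ["2"]}, {"Id": "20", "Units": []}]}
{"Action": "A1", "Group": [{"Id": "26", "Units": ["1", "3"]}]}
'''
lines = list(to_lines(normalize(read_json_stream(sample))))
print("\n".join(lines))
```

Sorting the Unit columns by their numeric suffix keeps Unit10 after Unit2, which a plain lexical sort would not.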
Much of the complexity comes from not knowing the final set of Unit columns in advance. If the unit set is fixed and reasonably small (e.g. 1-5), a simpler filter can be used:
  ["\(1+range(5))"] as $units
| ["Action", "Group", "Unit\($units[])"]
, ( inputs
  | .Action as $a
  | .Group[]
    # .Units[["3"]] yields the indices at which "3" occurs, so [] means "absent"
  | [$a, .Id, (.Units[$units[]|[.]] | if .!=[] then "1" else "0" end) ]
) | join("|")
Sample run
$ jq -Mnr '["\(1+range(5))"] as $units | ["Action", "Group", "Unit\($units[])"], (inputs | .Action as $a | .Group[] | [$a, .Id, (.Units[$units[]|[.]] | if .!=[] then "1" else "0" end) ] ) | join("|")' data.json
Action|Group|Unit1|Unit2|Unit3|Unit4|Unit5
A1|10|1|0|0|0|0
A2|11|0|1|0|0|0
A2|20|0|0|0|0|0
A1|26|1|0|1|0|0
Try it online at tio.run or jqplay.org.
To handle the case where Group may be null, the easiest way is to use a variation of peak's suggestion, e.g.:
  ["\(1+range(5))"] as $units
| ["Action", "Group", "Unit\($units[])"]
, ( inputs
  | .Action as $a
  | ( .Group // [{Id:"0", Units:[]}] )[]   # <-- supply default group if null
  | [$a, .Id, (.Units[$units[]|[.]] | if .!=[] then "1" else "0" end) ]
) | join("|")
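As a sanity check on the expected table, the same idea (a fixed unit set plus a default group for null) can be sketched in Python. This is an illustrative sketch rather than part of the original answer: the names UNITS, DEFAULT_GROUP and rows are invented here, and the records are assumed to be already parsed into dicts.

```python
import json

UNITS = [str(n) for n in range(1, 6)]        # "1".."5", mirroring $units
DEFAULT_GROUP = [{"Id": "0", "Units": []}]   # stand-in used when Group is null

def rows(records):
    """Yield the header row, then one row per (Action, Group entry) pair."""
    yield ["Action", "Group"] + [f"Unit{u}" for u in UNITS]
    for rec in records:
        # Only None triggers the default, mirroring `//` for the null case;
        # an empty Group list still produces no rows, as in the jq filter.
        groups = rec["Group"] if rec["Group"] is not None else DEFAULT_GROUP
        for group in groups:
            flags = ["1" if u in group["Units"] else "0" for u in UNITS]
            yield [rec["Action"], group["Id"]] + flags

records = [json.loads(s) for s in (
    '{"Action": "A1", "Group": [{"Id": "10", "Units": ["1"]}]}',
    '{"Action": "A2", "Group": [{"Id": "11", "Units": ["2"]}, {"Id": "20", "Units": []}]}',
    '{"Action": "A1", "Group": [{"Id": "26", "Units": ["1", "3"]}]}',
    '{"Action": "A3", "Group": null}',
)]
lines = ["|".join(row) for row in rows(records)]
print("\n".join(lines))
```

With the four sample records, the last line printed is the A3 row with Group 0 and all five unit flags set to 0.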