Converting JSON with arrays to CSV using jq


Question

I've found myself in the world of JSON and I'm trying to convert out of it using jq. I'm trying to convert the following structure to CSV:

{
  "Action": "A1",
  "Group": [
    {
      "Id": "10",
      "Units": [
        "1"
      ]
    }
  ]
}
{
  "Action": "A2",
  "Group": [
    {
      "Id": "11",
      "Units": [
        "2"
      ]
    },
    {
      "Id": "20",
      "Units": []
    }
  ]
}
{
  "Action": "A1",
  "Group": [
    {
      "Id": "26",
      "Units": [
        "1",
        "3"
      ]
    }
  ]
}
{
  "Action": "A3",
  "Group": null
}

where the Ids are between 10 and 99 and the Units are between 1 and 5. The expected output would be (quoted or unquoted, comma-separated or not; I used pipe separators for clarity):

Action|Group|Unit1|Unit2|Unit3|Unit4|Unit5
A1|10|1|0|0|0|0
A2|11|0|1|0|0|0
A2|20|0|0|0|0|0
A1|26|1|0|1|0|0
A3|0|0|0|0|0|0

I've played around with this for a while now (history | grep jq | wc -l says 107) but haven't made any real progress on combining the keys with each other; I'm basically just getting lists of keys (jq n00b).

Update:

While testing the solution (sorry, I've been a bit slow) I noticed that the data also has records with "Group": null, i.e.:

{
  "Action": "A3",
  "Group": null
}

(I've added those few lines to the main test data set above.) Such records result in the error: jq: error (at file.json:61): Cannot iterate over null (null). The expected output would be:

A3|0|0|0|0|0|0

Is there an easy way out of that one?

Recommended answer

Here is a general solution for the case where the set of unit columns isn't known in advance:

def normalize: [            # convert input to array of flattened objects e.g. 
      inputs                # [{"Action":"A1","Group":"10","Unit1":"1"}, ...]
    | .Action as $a
    | .Group[]
    |   {Action:$a, Group:.Id}
      + reduce .Units[] as $u ({};.["Unit\($u)"]="1")
  ];

def columns:                # compute column names
  [ .[] | keys[] ] | unique ;

def rows($names):           # generate row arrays
    .[] | [ .[$names[]] ] | map( .//"0" );

normalize | columns as $names | $names, rows($names) | join("|")

Sample run (assumes the filter is in filter.jq and the data is in data.json; the output shown is for the data without the "Group": null record, since .Group[] cannot iterate over null):

$ jq -Mnr -f filter.jq data.json
Action|Group|Unit1|Unit2|Unit3
A1|10|1|0|0
A2|11|0|1|0
A2|20|0|0|0
A1|26|1|0|1
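
To see what normalize hands on to columns and rows, the flattening step can be run on its own. The following is only a sketch: the shell function name flatten_records is hypothetical, and it streams the flattened objects one per line instead of collecting them into an array as normalize does.

```shell
# Hypothetical helper: emit one flattened object per Group entry.
flatten_records() {
  jq -nc '
    inputs
    | .Action as $a                       # remember the Action of this record
    | .Group[]                            # one result per group
    | {Action: $a, Group: .Id}
      + reduce .Units[] as $u ({}; .["Unit\($u)"] = "1")   # add a UnitN key per unit
  '
}

printf '%s\n' '{"Action":"A1","Group":[{"Id":"26","Units":["1","3"]}]}' | flatten_records
```

Units that are absent simply have no key in the object, which is why rows later fills the gaps with "0".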


In this specific problem the ordering produced by unique happens to match the column order we want. If that were not the case, columns would have to be more complicated.
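
One case where the ordering would break is a unit number of 10 or more: unique compares the names as strings, so "Unit10" would come before "Unit2". A minimal sketch of a numeric ordering follows, assuming every unit column is named Unit followed by a number; the names columns_numeric and unit_columns are hypothetical and not part of the filter above.

```shell
# Hypothetical helper: order "UnitN" columns numerically instead of as strings.
unit_columns() {
  jq -c '
    def columns_numeric:
      [ .[] | keys[] ] | unique
      | map(select(startswith("Unit") | not))      # Action, Group keep their sorted order
        + ( map(select(startswith("Unit")))
            | sort_by(ltrimstr("Unit") | tonumber) );   # Unit2 before Unit10
    columns_numeric
  '
}

echo '[{"Action":"A1","Group":"10","Unit10":"1"},{"Action":"A2","Group":"11","Unit2":"1"}]' | unit_columns
# ["Action","Group","Unit2","Unit10"]
```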

Much of the complexity comes from not knowing the final set of Unit columns. If the unit set is fixed and reasonably small (e.g. 1-5), a simpler filter can be used:

  ["\(1+range(5))"] as $units
| ["Action", "Group", "Unit\($units[])"]
, ( inputs 
  | .Action as $a 
  | .Group[] 
  | [$a, .Id, (.Units[$units[]|[.]] | if .!=[] then "1" else "0" end) ]
) | join("|")

Sample run

$ jq -Mnr '["\(1+range(5))"] as $units | ["Action", "Group", "Unit\($units[])"], (inputs | .Action as $a | .Group[] | [$a, .Id, (.Units[$units[]|[.]] | if .!=[] then "1" else "0" end) ] ) | join("|")' data.json
Action|Group|Unit1|Unit2|Unit3|Unit4|Unit5
A1|10|1|0|0|0|0
A2|11|0|1|0|0|0
A2|20|0|0|0|0|0
A1|26|1|0|1|0|0


To handle the case where Group may be null, the easiest way is to use a variation of peak's suggestion, e.g.:

  ["\(1+range(5))"] as $units
| ["Action", "Group", "Unit\($units[])"]
, ( inputs 
  | .Action as $a 
  | ( .Group // [{Id:"0", Units:[]}] )[]   # <-- supply default group if null
  | [$a, .Id, (.Units[$units[]|[.]] | if .!=[] then "1" else "0" end) ]
) | join("|")
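
The question allows quoted, comma-separated output as well, and jq's @csv format produces that directly. Below is a sketch of the same idea with @csv in place of join("|"). It is a variation, not the filter above verbatim: the shell function name to_csv is hypothetical, and the membership test is written with any(...) instead of the index lookup.

```shell
# Hypothetical helper: fixed Unit1..Unit5 columns, null-safe, real CSV output.
to_csv() {
  jq -nr '
      ["\(1+range(5))"] as $units
    | ["Action", "Group", "Unit\($units[])"]          # header row
    , ( inputs
      | .Action as $a
      | ( .Group // [{Id:"0", Units:[]}] )[]          # default group if Group is null
      | . as $g
      | [ $a, $g.Id,
          ( $units[] as $u
            | if any($g.Units[]; . == $u) then "1" else "0" end ) ]
    ) | @csv
  '
}

printf '%s\n' \
  '{"Action":"A1","Group":[{"Id":"26","Units":["1","3"]}]}' \
  '{"Action":"A3","Group":null}' | to_csv
# "Action","Group","Unit1","Unit2","Unit3","Unit4","Unit5"
# "A1","26","1","0","1","0","0"
# "A3","0","0","0","0","0","0"
```

Every field is a string here, so @csv wraps each one in double quotes.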

