使用jq,展平任意JSON到分隔符分隔的展平字典 [英] Using jq, Flatten Arbitrary JSON to Delimiter-Separated Flat Dictionary

查看:149
本文介绍了使用jq,展平任意JSON到分隔符分隔的展平字典的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我希望使用jq将JSON转换为以分隔符分隔和展平的结构.

I'm looking to transform JSON using jq to a delimiter-separated and flattened structure.

已尝试过此操作.例如,使用jq扁平嵌套JSON .

There have been attempts at this. For example, Flatten nested JSON using jq.

但是,如果JSON包含数组,则该页面上的解决方案将失败.例如,如果JSON是:

However the solutions on that page fail if the JSON contains arrays. For example, if the JSON is:

{"a":{"b":[1]},"x":[{"y":2},{"z":3}]}

上述解决方案无法将以上内容转换为:

The solution above will fail to transform the above to:

{"a.b.0":1,"x.0.y":2,"x.1.z":3}

此外,我正在寻找一种解决方案,该解决方案也应允许使用任意定界符.例如,假设空格字符是定界符.在这种情况下,结果将是:

In addition, I'm looking for a solution that will also allow for an arbitrary delimiter. For example, suppose the space character is the delimiter. In this case, the result would be:

{"a b 0":1,"x 0 y":2,"x 1 z":3}

我希望通过CashOS 7中的Bash(4.2+)函数来访问此功能,如下所示:

I'm looking to have this functionality accessed via a Bash (4.2+) function as is found in CentOS 7, something like this:

flatten_json()
{
    local JSONData="$1"
    # jq command to flatten $JSONData, putting the result to stdout
    jq ... <<<"$JSONData"
}

该解决方案应适用于所有JSON数据类型,包括 null boolean .例如,考虑以下输入:

The solution should work with all JSON data types, including null and boolean. For example, consider the following input:

{"a":{"b":["p q r"]},"w":[{"x":null},{"y":false},{"z":3}]}

它应该产生:

{"a b 0":"p q r","w 0 x":null,"w 1 y":false,"w 2 z":3}

推荐答案

如果输入数据流,则将获得所有叶值的路径和值对.如果不是一对,则在该路径处标记对象/数组定义结束的路径.如您所见,使用leaf_paths只会给您提供真实叶子值的路径,因此您会错过null甚至false值.作为流,您不会遇到此问题.

If you stream the data in, you'll get pairings of paths and values of all leaf values. If not a pair, then a path marking the end of a definition of an object/array at that path. Using leaf_paths as you found would only give you paths to truthy leaf values so you'll miss out on null or even false values. As a stream, you won't get this problem.

可以通过多种方式将其组合到一个对象上,我偏爱在这种情况下使用reduce和赋值.

There are many ways this could be combined to an object, I'm partial to using reduce and assignment in these situations.

$ cat input.json
{"a":{"b":["p q r"]},"w":[{"x":null},{"y":false},{"z":3}]}

$ jq --arg delim '.' 'reduce (tostream|select(length==2)) as $i ({};
    .[[$i[0][]|tostring]|join($delim)] = $i[1]
)' input.json
{
  "a.b.0": "p q r",
  "w.0.x": null,
  "w.1.y": false,
  "w.2.z": 3
}


同一解决方案在这里进行了细分,以留出空间来解释正在发生的事情.


Here's the same solution broken up a bit to allow room for explanation of what's going on.

$ jq --arg delim '.' 'reduce (tostream|select(length==2)) as $i ({};
    [$i[0][]|tostring] as $path_as_strings
        | ($path_as_strings|join($delim)) as $key
        | $i[1] as $value
        | .[$key] = $value
)' input.json

使用tostream将输入转换为流,我们将接收成对/路径的多个值作为过滤器的输入.这样,我们可以将这些多个值传递到reduce中,该reduce旨在接受多个值并对它们执行某些操作.但是在我们这样做之前,我们只想按对(select(length==2))过滤那些对/路径.

Converting the input to a stream with tostream, we'll receive multiple values of pairs/paths as input to our filter. With this, we can pass those multiple values into reduce which is designed to accept multiple values and do something with them. But before we do, we want to filter those pairs/paths by only the pairs (select(length==2)).

然后在reduce调用中,我们从一个干净的对象开始,并使用从路径派生的键和相应的值分配新值.请记住,在reduce调用中产生的每个值都将用于迭代中的下一个值.将值绑定到变量不会更改当前上下文,分配会有效地修改"当前值(初始对象)并将其传递.

Then in the reduce call, we're starting with a clean object and assigning new values using a key derived from the path and the corresponding value. Remember that every value produced in the reduce call is used for the next value in the iteration. Binding values to variables doesn't change the current context and assignments effectively "modify" the current value (the initial object) and passes it along.

$path_as_strings只是路径,它是由字符串和数字组成的数组,仅是字符串. [$i[0][]|tostring]是我要映射的数组不是当前数组时的一种速记方式,可以代替map使用.由于映射是作为单个表达式完成的,因此这更加紧凑.这样不必获得相同的结果:($i[0]|map(tostring)).通常,外部括号不是必需的,但是仍然是两个单独的过滤器表达式,而不是一个(和更多文本).

$path_as_strings is just the path which is an array of strings and numbers to just strings. [$i[0][]|tostring] is a shorthand I use as an alternative to using map when the array I want to map is not the current array. This is more compact since the mapping is done as a single expression. That instead of having to do this to get the same result: ($i[0]|map(tostring)). The outer parentheses might not be necessary in general but, it's still two separate filter expressions vs one (and more text).

然后从那里使用提供的定界符将该字符串数组转换为所需的键.然后将适当的值分配给当前对象.

Then from there we convert that array of strings to the desired key using the provided delimiter. Then assign the appropriate values to the current object.

这篇关于使用jq,展平任意JSON到分隔符分隔的展平字典的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆