How to use jq to unroll a list of objects into denormalized objects?

Question

I have the following JSON Lines example:

{"toplevel_key": "top value 1", "list": [{"key1": "value 1", "key2": "value 2"},{"key1": "value 3", "key2": "value 4"}]}
{"toplevel_key": "top value 2", "list": [{"key1": "value 5", "key2": "value 6"}]}

I want to convert it using jq, unrolling the list into a fixed number of "columns", so that I end up with a list of flat JSON objects in the following format:

{
    "toplevel_key": "top value 1",
    "list_0_key1": "value 1",
    "list_0_key2": "value 2",
    "list_1_key1": "value 3",
    "list_1_key2": "value 4"
}
{
    "toplevel_key": "top value 2",
    "list_0_key1": "value 5",
    "list_0_key2": "value 6",
    "list_1_key1": "",
    "list_1_key2": ""
}

Note: in practice I want one object per line; the output above is formatted for readability.

The only way I was able to get the output I want was by writing out all the columns in my jq expression:

$ cat example.jsonl | jq -c '{toplevel_key, list_0_key1: .list[0].key1, list_0_key2: .list[0].key2, list_1_key1: .list[1].key1, list_1_key2: .list[1].key2}'

This gets me the result I want, but I have to write out ALL the fixed "columns" by hand (and in production there will be many more than that).

I know I could use a script to generate that jq code, but I'm NOT interested in a solution like that. It won't solve my problem, because this is for an application that accepts only jq.

Is there a way to do it in pure jq?

This is what I was able to get so far:

$ cat example.jsonl | jq -c '(.list | to_entries | map({("list_" + (.key | tostring)): .value})) | add'
{"list_0":{"key1":"value 1","key2":"value 2"},"list_1":{"key1":"value 3","key2":"value 4"}}
{"list_0":{"key1":"value 5","key2":"value 6"}}

Recommended answer

As long as you know the names of the specific keys, Jeff's answer is great. Here's an answer that doesn't hardcode the key names, so it works with objects of any structure and any level of nesting:

[leaf_paths as $path | {
    "key": $path | map(tostring) | join("_"),
    "value": getpath($path)
}] | from_entries

An explanation: paths is a builtin function that recursively outputs, for each element of its input, an array representing that element's position: the ordered key names and indexes that lead to it. leaf_paths is a variant that only outputs the paths to "leaf" elements, that is, elements that do not contain other elements.

To clarify, given the input [[1, 2]], paths will output [0], [0, 0], [0, 1] (that is, the paths to [1, 2], 1 and 2, respectively), while leaf_paths will only output [0, 0], [0, 1].
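The difference is easy to check from the command line. This is a minimal sketch that only assumes jq is installed; the variable names are illustrative.

```shell
# Paths to every element of [[1, 2]], including the inner array itself.
all_paths=$(echo '[[1, 2]]' | jq -c 'paths')
echo "$all_paths"
# [0]
# [0,0]
# [0,1]

# Paths to the leaf elements only (the numbers 1 and 2).
leaf_only=$(echo '[[1, 2]]' | jq -c 'leaf_paths')
echo "$leaf_only"
# [0,0]
# [0,1]
```

The jq manual describes leaf_paths as an alias of paths(scalars), so that spelling should be interchangeable here if your jq version flags leaf_paths as deprecated.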

That's the hardest part. After that, we bind each path to $path (of the form ["list", 1, "key2"]), convert each of its elements to a string with map(tostring) (which gives us ["list", "1", "key2"]), and join them with underscores. The result becomes the key of the "entry" in the object we want to create; as the value, we take the value of the original object at the given $path.
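The key-building step can be tried in isolation. A small sketch, using the example path from the paragraph above as a literal and jq's -n flag so that no input file is needed:

```shell
# Turn one path array into a flat column name.
# -n: run without reading input; -r: print the string without JSON quotes.
key=$(jq -n -r '["list", 1, "key2"] | map(tostring) | join("_")')
echo "$key"
# list_1_key2
```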

Lastly, we use from_entries to turn the array of key-value pairs into a JSON object. This gives us output similar to that of Jeff's answer: only keys that actually have values appear.
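To see the first program end to end, the sketch below recreates the two-line example from the question in a temporary file (the file and variable names are only illustrative) and runs the filter once per input line:

```shell
# Recreate the example input from the question in a temporary file.
input=$(mktemp)
cat > "$input" <<'EOF'
{"toplevel_key": "top value 1", "list": [{"key1": "value 1", "key2": "value 2"},{"key1": "value 3", "key2": "value 4"}]}
{"toplevel_key": "top value 2", "list": [{"key1": "value 5", "key2": "value 6"}]}
EOF

# Flatten each object independently: one output line per input line.
flat=$(jq -c '[leaf_paths as $path | {
    "key": $path | map(tostring) | join("_"),
    "value": getpath($path)
}] | from_entries' "$input")
echo "$flat"

rm -f "$input"
```

The second output object has no list_1_key1 or list_1_key2 keys at all, because each object is processed without knowledge of the others.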

However, your original question asked for keys that appear in any of the input objects to appear in all of the outputs, with the value set to an empty string when it is missing from the input. Here's a jq program that does this. As Jeff says in his answer, you need to slurp (-s) all the inputs for this to be possible:

(map(leaf_paths) | unique) as $paths |
map([$paths[] as $path | {
    "key": $path | map(tostring) | join("_"),
    "value": (getpath($path) // "")
}] | from_entries)[]

You'll notice that it's pretty similar to the first program. The main difference is that we collect all unique paths across the slurped objects as $paths, and for each object we iterate over those instead of over that object's own paths. We also use the alternative operator (//) to set missing values to empty strings.
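A sketch of running the slurped version on the question's example follows. The temporary file and variable names are illustrative; the only change from the earlier invocation is the -s flag, which makes jq read all input lines into a single array before the filter runs.

```shell
# Recreate the example input from the question in a temporary file.
input=$(mktemp)
cat > "$input" <<'EOF'
{"toplevel_key": "top value 1", "list": [{"key1": "value 1", "key2": "value 2"},{"key1": "value 3", "key2": "value 4"}]}
{"toplevel_key": "top value 2", "list": [{"key1": "value 5", "key2": "value 6"}]}
EOF

# -s slurps every line into one array, so the set of paths can be
# computed across all objects before any of them is flattened.
flat_all=$(jq -c -s '(map(leaf_paths) | unique) as $paths |
map([$paths[] as $path | {
    "key": $path | map(tostring) | join("_"),
    "value": (getpath($path) // "")
}] | from_entries)[]' "$input")
echo "$flat_all"

rm -f "$input"
```

Every output object now carries the same five keys, and the second one has empty strings for list_1_key1 and list_1_key2. One caveat: // falls back to its right-hand side whenever the left-hand side is null or false, so a leaf that genuinely holds null or false in the input would also come out as an empty string.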

Hope this helps!
