JQ: count number of objects per group, for a subset of input


Question

I need to count the number of objects in each group with jq, but only for the N most recent objects.

Sample input, for N=3:

{"modified":"Mon Sep 25 14:20:00 +0000 2018","object_id":1,"group_id":"C"}
{"modified":"Mon Sep 25 14:23:00 +0000 2018","object_id":2,"group_id":"A"}
{"modified":"Mon Sep 25 14:21:00 +0000 2018","object_id":3,"group_id":"B"}
{"modified":"Mon Sep 25 14:22:00 +0000 2018","object_id":4,"group_id":"A"}

Expected output:

{"A",2}
{"B",1}


I can't even select a date-based subset that preserves the structure of the objects. This is the best I have managed so far:

 [
   .modified |= strptime("%a %b %d %H:%M:%S %z %Y") |
   .modified |= mktime |
   .modified |= strftime("%Y-%m-%d %H:%M:%S")
 ]  |
 sort_by(.modified) |
 .[] |
 {modified, object_id, group_id}

For some reason, the results are still unsorted.

I also can't convert such a list into an array so that I can select only the N most recent entries.

After that, I will still need to count the number of objects per group in some way.

Overall, it looks like I need a very intuitive explanation of how arrays and lists of objects convert to each other, how to modify some of their fields, and how to then extract only the fields required. Unfortunately, the tutorials I've found so far did not help.

Recommended answer

Assuming your input file is:

cat file
{"modified":"Mon Sep 25 14:20:00 +0000 2018","object_id":1,"class_id":"C"}
{"modified":"Mon Sep 25 14:23:00 +0000 2018","object_id":2,"class_id":"A"}
{"modified":"Mon Sep 25 14:21:00 +0000 2018","object_id":3,"class_id":"B"}
{"modified":"Mon Sep 25 14:22:00 +0000 2018","object_id":4,"class_id":"A"}

You can try the following:

<file jq -s '
   [ .[] | 
     (.modified |= (strptime("%a %b %d %H:%M:%S +0000 %Y") | mktime)) 
   ] | 
   sort_by(.modified) |              # sort using converted time
   .[-3:] |                          # take the last 3
   group_by(.class_id) |             # group ids together
   .[] |                             
   {(.[0].class_id): length}'        # build an object keyed by the class id, with the group size as its value
{
   "A": 2
}
{
  "B": 1
}
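
On why the attempt in the question stays unsorted: without -s (--slurp), jq runs the filter once for each input object, so wrapping the expression in [ ... ] only ever builds a one-element array and sort_by has nothing to reorder. With -s, all input objects are read into a single array first, and .[] turns that array back into a stream of objects. A minimal sketch of the difference, using a made-up two-object input (the object_id values here are illustrative, not the sample data):

```shell
# Hypothetical two-line input, deliberately out of order.
input='{"object_id":2}
{"object_id":1}'

# Without -s the filter runs once per object, so each
# one-element array is "sorted" on its own.
per_object=$(printf '%s\n' "$input" | jq -c '[.] | sort_by(.object_id)')

# With -s all objects are collected into one array before the filter runs,
# so sort_by sees them together.
slurped=$(printf '%s\n' "$input" | jq -s -c 'sort_by(.object_id)')

# .[] converts the sorted array back into a stream of objects.
stream=$(printf '%s\n' "$input" | jq -s -c 'sort_by(.object_id) | .[]')

echo "$per_object"
echo "$slurped"
echo "$stream"
```

The first command prints the two objects in their original order, each in its own array; the second prints one sorted array; the third prints the objects one per line in sorted order.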

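Note that on my system the %z option of strptime does not work, so I replaced it with the literal +0000. All timestamps in the sample use that offset, so nothing is lost in the conversion. Also note that this sample names the grouping field class_id, whereas the question uses group_id; substitute the field name that matches your data.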
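
As a possible variation, N can be passed in from the shell instead of being hard-coded, and the per-group counts can be merged into a single object. This is only a sketch under a few assumptions: the variable name n is made up for illustration, n is assumed to be at least 1, the field is called class_id as in the sample above, and the +0000 workaround is kept as is.

```shell
# Sample input from the answer, one JSON object per line.
input='{"modified":"Mon Sep 25 14:20:00 +0000 2018","object_id":1,"class_id":"C"}
{"modified":"Mon Sep 25 14:23:00 +0000 2018","object_id":2,"class_id":"A"}
{"modified":"Mon Sep 25 14:21:00 +0000 2018","object_id":3,"class_id":"B"}
{"modified":"Mon Sep 25 14:22:00 +0000 2018","object_id":4,"class_id":"A"}'

# --argjson passes n as a number (hypothetical name; assumed >= 1).
# -s reads all objects into one array, -c prints compact output.
result=$(printf '%s\n' "$input" | jq -s -c --argjson n 3 '
  map(.modified |= (strptime("%a %b %d %H:%M:%S +0000 %Y") | mktime))
  | sort_by(.modified)                  # oldest first
  | .[(-$n):]                           # keep the n most recent
  | group_by(.class_id)                 # one sub-array per class id
  | map({(.[0].class_id): length})      # count the objects in each group
  | add                                 # merge the counts into one object
')

echo "$result"
```

Merging with add yields one object instead of a stream of single-key objects, which may be easier to consume downstream. Drop the last step to get the same shape of output as the command above.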
