How to group a JSON by a key and sort by its count?
Problem description
I start from a JSON Lines file similar to this:
{ "kw": "foo", "age": 1}
{ "kw": "foo", "age": 1}
{ "kw": "foo", "age": 1}
{ "kw": "bar", "age": 1}
{ "kw": "bar", "age": 1}
Please note that each line is valid JSON, but the file as a whole is not.
The output I'm seeking is a list of keywords ordered by number of occurrences, most frequent first. Like this:
[
{"kw": "foo", "count": 3},
{"kw": "bar", "count": 2}
]
I can group and count using the slurp option:
jq --slurp '. | group_by(.kw) | .[] | {kw: .[0].kw, count: . | length }'
Output:
{"kw":"bar","count":2}
{"kw":"foo","count":3}
But:
- It is not sorted
- It is not a valid JSON array
A very stupid solution I've found is to pass the data through jq twice :)
jq --slurp --compact-output '. | group_by(.kw) | .[] | {kw: .[0].kw, count: . | length }' sample.json \
| jq --slurp --compact-output '. | sort_by(.count)'
But I'm pretty sure someone smarter than me can find a more elegant solution.
Recommended answer
"It is not sorted"
That is not quite correct: group_by(.foo) returns its groups sorted by the value of .foo (in effect a sort_by(.foo)), so the results are ordered by that field rather than by count. See the jq Manual, group_by(path_expression).
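The ordering that group_by produces can be checked directly. The following is a minimal sketch that feeds a few of the question's records inline, deliberately out of key order; the `grouped` variable name is only illustrative.

```shell
# group_by(.kw) emits its groups ordered by the value of .kw,
# so the "bar" group comes before the "foo" group even though
# a "foo" record appears first in the input.
grouped=$(printf '%s\n' \
  '{"kw": "foo", "age": 1}' \
  '{"kw": "bar", "age": 1}' \
  '{"kw": "foo", "age": 1}' \
  | jq --slurp --compact-output 'group_by(.kw) | map(.[0].kw)')
echo "$grouped"
```

This prints ["bar","foo"], which matches the order seen in the question's output.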
"It is not a valid JSON array"
Just enclose the whole operation in [ .. ] so the results are collected into an array. The leading . is also optional. So just do:
jq --slurp --compact-output '[ group_by(.kw)[] | {kw: .[0].kw, count: length } ]'
If you mean sorting by .count, you can do an ascending sort and then reverse it:
jq --slurp --compact-output '[ group_by(.kw)[] | {kw: .[0].kw, count: length }] | sort_by(.count) | reverse'
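For an end-to-end check, here is a sketch that writes the question's sample data to a temporary file (the mktemp path and the `sample` and `result` variable names are assumptions for illustration) and runs a single jq pass. It uses map(...) in place of the [ ...[] | ... ] form and sorts on the negated count, which is one possible way to get descending order without a separate reverse. When several keywords share the same count, the two forms may order those ties differently.

```shell
# Write the question's sample data to a temporary JSON Lines file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
{ "kw": "foo", "age": 1}
{ "kw": "foo", "age": 1}
{ "kw": "foo", "age": 1}
{ "kw": "bar", "age": 1}
{ "kw": "bar", "age": 1}
EOF

# Group by .kw, count each group, then sort by descending count.
# Negating the numeric key gives descending order without `reverse`.
result=$(jq --slurp --compact-output \
  'group_by(.kw) | map({kw: .[0].kw, count: length}) | sort_by(-.count)' \
  "$sample")
echo "$result"

# Clean up the temporary file.
rm -f "$sample"
```

With the sample data this prints [{"kw":"foo","count":3},{"kw":"bar","count":2}], which is the array the question asks for.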