Convert json using jq based on specific constraints

Problem description

I have a JSON file, 'OpenEnded_mscoco_val2014.json'. The file contains 121,512 questions.
Here is a sample:

"questions": [
{
  "question": "What is the table made of?",
  "image_id": 350623,
  "question_id": 3506232
},
{
  "question": "Is the food napping on the table?",
  "image_id": 350623,
  "question_id": 3506230
},
{
  "question": "What has been upcycled to make lights?",
  "image_id": 350623,
  "question_id": 3506231
},
{
  "question": "Is this an Spanish town?",
  "image_id": 8647,
  "question_id": 86472
}

]

I used the following command to convert the JSON to CSV:

jq -r '.questions | [map(.question), map(.image_id), map(.question_id)] | @csv' OpenEnded_mscoco_val2014_questions.json >> temp.csv

However, the CSV output lists all the questions followed by all the image_ids, which is what the code above does.
The expected output is:

"What is table made of",350623,3506232
"Is the food napping on the table?",350623,3506230

Also, is it possible to keep only the results with image_id <= 10000, and to group questions that share the same image_id? For example, results 1, 2 and 3 in the JSON above could be combined into one entry with 3 questions, 1 image_id and 3 question_ids.

EDIT: The first problem is solved by the possible duplicate question. I would like to know whether it is possible to use a comparison operator on the jq command line when converting the JSON file; in this case, to get all fields from the JSON only where image_id <= 10000.

Recommended answer

1) Given your input (suitably elaborated to make it valid JSON), the following query generates the CSV output as shown:

$ jq -r '.questions[] | [.question, .image_id, .question_id] | @csv'

"What is the table made of?",350623,3506232
"Is the food napping on the table?",350623,3506230
"What has been upcycled to make lights?",350623,3506231
"Is this an Spanish town?",8647,86472

The key thing to remember here is that @csv requires a flat array, but as with all jq filters, you can feed it a stream.
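To make the "flat array" point concrete, the sketch below compares the intermediate value built by the question's filter with the one built by the streaming filter. The file name sample.json and its two-entry contents are assumptions, trimmed from the question's sample; -c only asks jq for compact one-line output.

```shell
# Hypothetical sample file, trimmed from the data in the question.
cat > sample.json <<'EOF'
{"questions": [
  {"question": "What is the table made of?", "image_id": 350623, "question_id": 3506232},
  {"question": "Is this an Spanish town?", "image_id": 8647, "question_id": 86472}
]}
EOF

# The question's filter builds ONE array that contains three arrays
# (all questions, all image_ids, all question_ids). That is nested, not flat.
jq -c '.questions | [map(.question), map(.image_id), map(.question_id)]' sample.json

# Iterating with .questions[] produces a stream of flat arrays, one per question.
# Each of these is a valid row for @csv.
jq -c '.questions[] | [.question, .image_id, .question_id]' sample.json
```

The first command prints a single nested array; the second prints one three-element array per question, which is the shape @csv expects.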

2) To filter using the criterion .image_id <= 10000, just interpose the appropriate select/1 filter:

.questions[]
| select(.image_id <= 10000)
| [.question, .image_id, .question_id]
| @csv
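As a complete command line, the filter above might be run as follows. This is a minimal sketch: the file name sample.json and its contents are assumptions, trimmed from the question's sample.

```shell
# Hypothetical sample file, trimmed from the data in the question.
cat > sample.json <<'EOF'
{"questions": [
  {"question": "What is the table made of?", "image_id": 350623, "question_id": 3506232},
  {"question": "Is this an Spanish town?", "image_id": 8647, "question_id": 86472}
]}
EOF

# select/1 passes an entry through only when the comparison is true,
# so entries with image_id greater than 10000 are dropped before @csv runs.
jq -r '.questions[]
  | select(.image_id <= 10000)
  | [.question, .image_id, .question_id]
  | @csv' sample.json
```

With this sample, only the row for image_id 8647 is printed: "Is this an Spanish town?",8647,86472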

3) To sort by image_id, use sort_by(.image_id):

.questions
| sort_by(.image_id)
|.[]
| [.question, .image_id, .question_id]
| @csv

4) To group by .image_id you would pipe the output of the following pipeline into your own pipeline:

.questions | group_by(.image_id)

You will, however, have to decide exactly how you want to combine the grouped objects.
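One possible way to combine each group, shown only as an illustration, is to emit one CSV row per image_id with the questions and question_ids joined into single fields. The separators (" | " and ";"), the column order, and the file name sample.json are assumptions made for this sketch, not something specified in the question.

```shell
# Hypothetical sample file, trimmed from the data in the question.
cat > sample.json <<'EOF'
{"questions": [
  {"question": "What is the table made of?", "image_id": 350623, "question_id": 3506232},
  {"question": "Is the food napping on the table?", "image_id": 350623, "question_id": 3506230},
  {"question": "Is this an Spanish town?", "image_id": 8647, "question_id": 86472}
]}
EOF

# group_by(.image_id) yields an array of groups, ordered by image_id.
# For each group: take the shared image_id from the first member,
# join the questions with " | ", and join the ids (as strings) with ";".
jq -r '.questions
  | group_by(.image_id)[]
  | [ .[0].image_id,
      (map(.question) | join(" | ")),
      (map(.question_id | tostring) | join(";")) ]
  | @csv' sample.json
```

With this sample, two rows are printed: one for image_id 8647 with a single question, and one for image_id 350623 with both of its questions in the second field. A select(.image_id <= 10000) step could be placed before group_by if only the smaller ids are wanted.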
