While unnesting an object in jq, how can I avoid restating labels at each stage in the pipeline?


Problem description

Summary:

I have successfully worked out how to unnest objects in jq; however, the working code I have written requires a lot of repetition. I feel it's likely there is a cleaner or less verbose way to achieve this same result and I would like to know what it is.

Example:

With the following nested structure of companies, suppose the goal is to extract the name, ID, company and site for each person listed. (We can ignore the address.)

Input:

{
  "company": "Initrode",
  "sites": [
    {
      "name": "HQ",
      "address": "123 Main Street",
      "personnel": [
        {
          "name": "John Smith",
          "UID": 12345
        },
        {
          "name": "Jane Doe",
          "UID": 23456
        }
      ]
    },
    {
      "name": "Branch Office",
      "address": "Spodunk, Nowhereville",
      "personnel": [
        {
          "name": "Fred Anderson",
          "UID": 56789
        },
        {
          "name": "Bill Jones",
          "UID": 34567
        }
      ]
    }
  ]
}
{
  "company": "Inittech",
  "sites": [
    {
      "name": "Main Office",
      "address": "5678 Avenue Blvd",
      "personnel": [
        {
          "name": "Fred Johnson",
          "UID": 6543
        },
        {
          "name": "James Fredson",
          "UID": 9876
        }
      ]
    },
    {
      "name": "Testing Station",
      "address": "Alaskan Wilderness",
      "personnel": [
        {
          "name": "Sally May",
          "UID": 5432
        },
        {
          "name": "Jack James",
          "UID": 8765
        }
      ]
    }
  ]
}

Working code:

jq '{company,site: .sites[]}|
{company,site: .site.name,personnel: .site.personnel[]}|
{name: .personnel.name,id: .personnel.UID,company,site}' sample.json

Correct output:

{
  "name": "John Smith",
  "id": 12345,
  "company": "Initrode",
  "site": "HQ"
}
{
  "name": "Jane Doe",
  "id": 23456,
  "company": "Initrode",
  "site": "HQ"
}
{
  "name": "Fred Anderson",
  "id": 56789,
  "company": "Initrode",
  "site": "Branch Office"
}
{
  "name": "Bill Jones",
  "id": 34567,
  "company": "Initrode",
  "site": "Branch Office"
}
{
  "name": "Fred Johnson",
  "id": 6543,
  "company": "Inittech",
  "site": "Main Office"
}
{
  "name": "James Fredson",
  "id": 9876,
  "company": "Inittech",
  "site": "Main Office"
}
{
  "name": "Sally May",
  "id": 5432,
  "company": "Inittech",
  "site": "Testing Station"
}
{
  "name": "Jack James",
  "id": 8765,
  "company": "Inittech",
  "site": "Testing Station"
}

The problem:

There is a lot of repetition involved here. Aside from repeating the outer labels at each stage of the pipeline, there's also the repetition of .site and .personnel in the second and third parts of the pipeline respectively.

My real data is much more complicated, so this repetition is even worse and is much harder to read.

Incidentally, here is some NON-WORKING code that I tried earlier for the same goal above:

jq '{company,site: .sites[].name,name: .sites[].personnel[].name,id: .sites[].personnel[].UID}' sample.json

That involves much less repetition, but unfortunately it pairs every person's name with every ID and every site at their company - incorrect results, like a database "cross join" instead of an "inner join."
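
The cross-join behaviour can be reproduced on a trimmed-down sample. The sketch below assumes jq 1.5 or later is on the PATH; the one-line input and the shell variable names are made up for illustration. Each .sites[] inside a single object construction iterates independently, so jq emits every combination, whereas iterating once and building the object inside that iteration keeps each person with their own site.

```shell
# Hypothetical trimmed sample: one company, two sites, one person per site.
input='{"company":"Initrode","sites":[{"name":"HQ","personnel":[{"name":"John Smith","UID":12345}]},{"name":"Branch Office","personnel":[{"name":"Fred Anderson","UID":56789}]}]}'

# Two independent .sites[] iterations in one object: 2 x 2 = 4 combinations.
cross=$(printf '%s' "$input" | jq -c '{site: .sites[].name, name: .sites[].personnel[].name}')

# One shared .sites[] iteration: each person stays with their own site.
nested=$(printf '%s' "$input" | jq -c '.sites[] | {site: .name, name: .personnel[].name}')

printf 'cross join:\n%s\n\nshared iteration:\n%s\n' "$cross" "$nested"
```

The first filter prints four objects for two people; the second prints one object per person.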

I don't quite know how to describe in words what's needed here, but hopefully the above sample helps make it clear.

One way to describe it is that I'm trying to merge multiple name-value pairs from arrays of sub-objects into the top-level object, without returning together any combinations of name-value pairs taken from different sub-objects within the same array value. But that's not exactly easy to follow even for me; hence the above example input/output.


Just for interest, here is the real working code I have, with attribute names obfuscated:

jq '.pears[]
| {pear: .name, file: .somepath,
   toBeFiltered: (.appletypes[] | select(.name == "orange") | .bananas[] | {banana: .name, apples: .apples[]})}
| {pear, file,
   banana: .toBeFiltered.banana,
   applestem: .toBeFiltered.apples.applestem,
   orangecomment: (.toBeFiltered.apples.peaches[] | select(.akey == "string") | .avalue.value),
   linenumber: (.toBeFiltered.apples.peaches[] | select(.akey == "string") | .line)}' realfile.json

Solution

Perhaps the thing you're missing is the utility of jq variables:

.company as $company
| .sites[]
| .name as $site
| .personnel[]
| { name, id: .UID, $company, $site }

({$x} is shorthand for { x: $x }.)
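
As a quick check of both points, the sketch below runs the variable-based filter on a trimmed, hypothetical one-company sample and then compares the two object spellings directly. It assumes jq 1.5 or later on the PATH; the input string and the shell variable names are illustrative only.

```shell
# Hypothetical trimmed sample: one company, two sites, one person per site.
input='{"company":"Initrode","sites":[{"name":"HQ","personnel":[{"name":"John Smith","UID":12345}]},{"name":"Branch Office","personnel":[{"name":"Fred Anderson","UID":56789}]}]}'

# Bind the outer values once, then descend; no labels are restated.
out=$(printf '%s' "$input" | jq -c '
  .company as $company
  | .sites[]
  | .name as $site
  | .personnel[]
  | { name, id: .UID, $company, $site }')
printf '%s\n' "$out"

# {$company} and {company: $company} build the same object.
same=$(jq -n '"Initrode" as $company | {$company} == {company: $company}')
printf '%s\n' "$same"
```

Because $company and $site stay in scope for the rest of the pipeline, deeper nesting only needs one more "as" binding per level rather than another intermediate object.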

However, it's also possible to avoid variables by using parentheses with care. If you don't mind the keys being in a slightly different order, you could write:

(.sites[] | ((.personnel[] | {name, id: .UID}) + {site: .name})) + {company}

If the keys must be in the order shown in the question, you could simply pipe the result of the above into the following filter:

{name, id, company, site}
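
Put together, the variable-free version with the reordering filter appended might look like the sketch below. It assumes jq is on the PATH and uses a trimmed, hypothetical one-company sample; the shell variable names are illustrative only. Since the pipe binds more loosely than +, no extra parentheses are needed around the first expression.

```shell
# Hypothetical trimmed sample: one company, two sites, one person per site.
input='{"company":"Initrode","sites":[{"name":"HQ","personnel":[{"name":"John Smith","UID":12345}]},{"name":"Branch Office","personnel":[{"name":"Fred Anderson","UID":56789}]}]}'

# Merge inner objects outward with +, then re-select the keys in the wanted order.
out=$(printf '%s' "$input" | jq -c '
  (.sites[] | ((.personnel[] | {name, id: .UID}) + {site: .name})) + {company}
  | {name, id, company, site}')
printf '%s\n' "$out"
```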
