Parsing JSON with awk/sed in bash to get key-value pairs

Question

I have read many existing questions on SO, but none of them answers what I am looking for. I know it is difficult to parse JSON in bash using sed/awk, but I only need a few key-value pairs per record out of the whole list of key-value pairs in each record. I want to do this because it will be faster, as the main JSON is pretty big, with millions of records.

The JSON format is like the following:

{
    "documents":
    [
        {
            "title":"a",   //needed
            "description":"b",  //needed
            "id":"c",  //needed
            ....(some more:not useful)....
            "conversation":
            [
                {
                    "message":"",
                    "id":"d",   //not needed
                    .....(some more)....
                    "createDate":"e",   //not needed
                },
                ...(some more messages)....
            ],
            "createDate":"f",  //needed
            ....(many more labels).....
        }
    ],
    ....(some more global attributes)....
}

For this I need the attributes marked as needed, but keys such as id and createDate also occur inside the nested conversation entries, which makes them hard to pick out with simple sed/awk. Could anyone suggest whether this can be done with sed/awk? Any help would be appreciated.

P.S.: I know about jsawk but I do not want to introduce any dependency, so if possible please suggest usage of sed/awk.

Edit: I need multiple entries of the format given below (as documents is a list):

"title":"a",
"description":"b"
"id":"c"
"createDate":"f"

Edit: The JSON contains no whitespace. It has been formatted here for readability.

Recommended answer

I would advise that you use 'jq', or a real JSON parser. You can't "parse" JSON with arbitrary regular expressions. You could hack something with awk, but that will break easily if your input has a form you didn't anticipate.
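
To illustrate how a pattern-based approach can go wrong on input shaped like the question's, here is a minimal sketch in Python. The `sample` string is a hypothetical compact record made up for illustration (it mirrors the keys in the question, with the elided parts left out), and the regular expression stands in for the kind of match a sed/awk one-liner would perform.

```python
import re

# Hypothetical compact sample mirroring the structure in the question.
sample = (
    '{"documents":[{"title":"a","description":"b","id":"c",'
    '"conversation":[{"message":"","id":"d","createDate":"e"}],'
    '"createDate":"f"}]}'
)

# Naive pattern: collect every "id" value, with no notion of nesting depth.
ids = re.findall(r'"id":"([^"]*)"', sample)

# Both the document id and the nested conversation id are matched.
print(ids)  # ['c', 'd']
```

The pattern cannot tell the needed top-level `id` from the one inside `conversation`, which is the "common key" problem described in the question.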

So, the answer is, introduce a cheap dependency (jq, or similar tool), and script around that. Unless you're running this script in a router or an embedded computer, chances are you can easily install jq.
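
As one possible shape for the "real JSON parser" route, here is a minimal sketch using Python's standard `json` module, assuming Python 3 is available on the machine. The `WANTED` tuple, the `extract` helper, and the `sample` string are names invented for this illustration, and the sample omits the parts elided in the question.

```python
import json

# Keys marked as "needed" in the question.
WANTED = ("title", "description", "id", "createDate")


def extract(text):
    """Return one dict per document, keeping only the wanted top-level keys."""
    data = json.loads(text)
    # Only keys directly on each document are read, so the "id" and
    # "createDate" nested under "conversation" are never picked up.
    return [{key: doc.get(key) for key in WANTED} for doc in data["documents"]]


# Hypothetical compact sample mirroring the structure in the question.
sample = (
    '{"documents":[{"title":"a","description":"b","id":"c",'
    '"conversation":[{"message":"","id":"d","createDate":"e"}],'
    '"createDate":"f"}]}'
)

for record in extract(sample):
    # Compact separators keep the output free of spaces, like the input.
    print(json.dumps(record, separators=(",", ":")))
```

Note that `json.loads` reads the whole document into memory; for a file with millions of records, a streaming parser may be a better fit, which this sketch does not attempt.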
