Parsing JSON with awk/sed in bash to get key-value pairs
Question
I have read many existing questions on SO, but none of them answers what I am looking for. I know it is difficult to parse JSON in bash using sed/awk, but I only need a few key-value pairs from each record, out of the whole list of key-value pairs that each record contains. I want to do it this way because it should be faster: the main JSON is quite big, with millions of records.
The JSON format is as follows:
{
"documents":
[
{
"title":"a", //needed
"description":"b", //needed
"id":"c", //needed
....(some more:not useful)....
"conversation":
[
{
"message":"",
"id":"d", //not needed
.....(some more)....
"createDate":"e", //not needed
},
...(some more messages)....
],
"createDate":"f", //needed
....(many more labels).....
}
],
....(some more global attributes)....
}
For this I need the attributes marked as needed, but the keys they share with the nested conversation entries (id, createDate) make them hard to extract with simple sed/awk. Could anyone suggest whether this can be done with sed/awk? Any help in achieving this would be appreciated.
P.S.: I know about jsawk, but I do not want to introduce any dependency, so if possible please suggest a solution using sed/awk.
Edit: I expect multiple entries of the format given below (as "documents" holds a list):
"title":"a",
"description":"b"
"id":"c"
"createDate":"f"
Edit: The actual JSON contains no whitespace. It has been formatted here for readability.
Recommended answer
I would advise that you use jq or another real JSON parser. You can't "parse" JSON with arbitrary regular expressions. You could hack something together with awk, but it will break easily if your input takes a form you didn't anticipate.
So the answer is: introduce a cheap dependency (jq or a similar tool) and script around it. Unless you're running this script on a router or an embedded computer, chances are you can easily install jq.
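As a rough illustration of the "real JSON parser" route, here is a minimal sketch in Python using the standard-library json module. Python is a swapped-in choice for illustration, not something the answer names, and the names NEEDED_KEYS, extract_needed and SAMPLE are hypothetical. The sample input is an assumed, valid version of the question's JSON with the comments and elided fields removed.

```python
import json

# Fields wanted from each top-level document (taken from the question).
NEEDED_KEYS = ("title", "description", "id", "createDate")


def extract_needed(json_text):
    """Return one dict per entry in "documents", keeping only NEEDED_KEYS."""
    data = json.loads(json_text)
    return [
        {key: doc[key] for key in NEEDED_KEYS if key in doc}
        for doc in data.get("documents", [])
    ]


# Hypothetical sample input, shaped like the question's JSON but valid
# (no comments, no elided fields, no whitespace).
SAMPLE = (
    '{"documents":[{"title":"a","description":"b","id":"c",'
    '"conversation":[{"message":"","id":"d","createDate":"e"}],'
    '"createDate":"f"}]}'
)

if __name__ == "__main__":
    for record in extract_needed(SAMPLE):
        # Compact separators keep the output free of extra spaces.
        print(json.dumps(record, separators=(",", ":")))
```

Because the parser keeps the nested conversation objects separate from their parent document, the duplicate id and createDate keys inside conversation never collide with the document-level ones. Note that json.loads reads the whole input into memory, so for a file with millions of records a streaming parser may be a better fit.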