Loading contents of a JSON array in Redshift
Problem description
I'm setting up Redshift and importing data from MongoDB. I have successfully used a JSONPaths file for a simple document, but I now need to import from a document that contains an array.
{
"id":123,
"things":[
{
"foo":321,
"bar":654
},
{
"foo":987,
"bar":567
}
]
}
How do I load the above into a table like this:
select * from things;
 id  | foo | bar
-----+-----+-----
 123 | 321 | 654
 123 | 987 | 567
Or is there another way?
I can't just store the JSON array in a varchar(max) column, because the content of things can exceed 64K.
Recommended answer
Given
db.baz.insert({
"myid":123,
"things":[
{
"foo":321,
"bar":654
},
{
"foo":987,
"bar":567
}
]
});
The following will display the fields you want:
db.baz.find({},{"things.foo":1,"things.bar":1})
To flatten the result set, use aggregation like so:
db.baz.aggregate([
// Emit one document per element of the "things" array.
{ "$unwind" : "$things" },
// Lift the nested fields to the top level so each result is one flat row.
{
$project : {
_id : 0,
id : "$myid",
foo : "$things.foo",
bar : "$things.bar"
}
}
]);
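The same flattening can also be done outside the database, for example in an export script, before the data is handed to Redshift. The following is only a sketch in plain JavaScript under a few assumptions: `flattenThings` is a hypothetical helper name, the document shape is the sample from this answer, and the output is written as one JSON object per line, which is a layout Redshift's COPY command can typically load one row at a time from a JSON source.

```javascript
// Hypothetical helper (not part of MongoDB or Redshift): turn one document
// into one flat row per element of its "things" array.
function flattenThings(doc) {
  // A document without a "things" array produces no rows.
  return (doc.things || []).map(function (thing) {
    return { id: doc.myid, foo: thing.foo, bar: thing.bar };
  });
}

// Sample document, copied from the insert statement in this answer.
const doc = {
  myid: 123,
  things: [
    { foo: 321, bar: 654 },
    { foo: 987, bar: 567 }
  ]
};

// Serialize each row as its own line of JSON.
const rows = flattenThings(doc);
const lines = rows.map(function (row) {
  return JSON.stringify(row);
});

console.log(lines.join("\n"));
// prints:
// {"id":123,"foo":321,"bar":654}
// {"id":123,"foo":987,"bar":567}
```

Because every array element becomes its own row, no single value has to hold the whole array, so the 64K limit on varchar(max) mentioned in the question should no longer apply to the things content as a whole.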