Spark SQL JSON dataset: querying nested data structures
Question
I have a simple JSON dataset, shown below. How do I query all parts.lock values for id = 1?
JSON:
{
"id": 1,
"name": "A green door",
"price": 12.50,
"tags": ["home", "green"],
"parts" : [
{
"lock" : "One lock",
"key" : "single key"
},
{
"lock" : "2 lock",
"key" : "2 key"
}
]
}
Query:
select id, name, price, parts.lock from product where id=1
The point is that if I use parts[0].lock, it returns a single row, as below:
{u'price': 12.5, u'id': 1, u'.lock': {u'lock': u'One lock', u'key': u'single key'}, u'name': u'A green door'}
But I want to return all the locks in the parts structure. That will produce multiple rows, which is what I am looking for. It is this kind of relational join that I want to accomplish.
Please help me.
Recommended answer
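import org.apache.spark.sql.functions.explode  // assumes spark-shell, where df is the loaded JSON and $"..." is in scope
// explode emits one output row per element of the parts array
df.select($"id", $"name", $"price", explode($"parts").alias("elem"))
  .where("id = 1")
  .select("id", "name", "price", "elem.lock", "elem.key").show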
+---+------------+-----+--------+----------+
| id|        name|price|    lock|       key|
+---+------------+-----+--------+----------+
|  1|A green door| 12.5|One lock|single key|
|  1|A green door| 12.5|  2 lock|     2 key|
+---+------------+-----+--------+----------+
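Since the question was phrased as a SQL query, the same explode can probably also be written in Spark SQL with LATERAL VIEW. The following is a minimal sketch and not part of the original answer. It assumes Spark 2.2 or later (for reading JSON from a Dataset[String]) and a local master. The object name NestedJsonQuery and the helper locksFor are hypothetical, and the view name product is taken from the question's query.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object NestedJsonQuery {
  // The question's record on a single line, so each string holds exactly one JSON document.
  val productJson: String =
    """{"id": 1, "name": "A green door", "price": 12.50, "tags": ["home", "green"], "parts": [{"lock": "One lock", "key": "single key"}, {"lock": "2 lock", "key": "2 key"}]}"""

  // Hypothetical helper: returns one row per element of `parts` for the given id.
  def locksFor(spark: SparkSession, id: Long): DataFrame = {
    import spark.implicits._ // needed for Seq(...).toDS()
    spark.read.json(Seq(productJson).toDS()).createOrReplaceTempView("product")
    // LATERAL VIEW explode(parts) pairs each product row with every element of its parts array.
    // Backticks guard against `lock` and `key` being read as SQL keywords.
    spark.sql(
      s"""SELECT id, name, price, part.`lock`, part.`key`
         |FROM product
         |LATERAL VIEW explode(parts) exploded AS part
         |WHERE id = $id""".stripMargin)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local session for demonstration; in spark-shell an existing session is reused.
    val spark = SparkSession.builder().master("local[*]").appName("nested-json-demo").getOrCreate()
    locksFor(spark, 1L).show()
    spark.stop()
  }
}
```

Under these assumptions, locksFor(spark, 1L) is expected to yield the same two rows as the DataFrame version above, one per entry in parts.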