How to aggregate columns into a JSON array?
Problem description
How can I transform the data below so that it can be stored in Elasticsearch?
Here is a dataset of beans that I want to aggregate by product into a JSON array.
List<Bean> data = new ArrayList<Bean>();
data.add(new Bean("book","John",59));
data.add(new Bean("book","Björn",61));
data.add(new Bean("tv","Roger",36));
Dataset<Row> ds = spark.createDataFrame(data, Bean.class);
ds.show(false);
+------+-------+---------+
|amount|product|purchaser|
+------+-------+---------+
|59    |book   |John     |
|61    |book   |Björn    |
|36    |tv     |Roger    |
+------+-------+---------+
ds = ds.groupBy(col("product")).agg(collect_list(map(ds.col("purchaser"),ds.col("amount")).as("map")));
ds.show(false);
+-------+---------------------------------------------+
|product|collect_list(map(purchaser, amount) AS `map`)|
+-------+---------------------------------------------+
|tv     |[[Roger -> 36]]                              |
|book   |[[John -> 59], [Björn -> 61]]                |
+-------+---------------------------------------------+
This is what I want to transform it into:
+-------+------------------------------------------------------------------+
|product|json                                                              |
+-------+------------------------------------------------------------------+
|tv     |[{purchaser: "Roger", amount:36}]                                 |
|book   |[{purchaser: "John", amount:59}, {purchaser: "Björn", amount:61}] |
+-------+------------------------------------------------------------------+
Recommended answer
Solution:
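// Assumes: import static org.apache.spark.sql.functions.*;
// alias() is applied to the aggregate so that the result column is named "json".
ds.groupBy(col("product"))
  .agg(collect_list(to_json(struct(col("purchaser"), col("amount")))).alias("json"));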
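Note that this aggregation yields an array of JSON strings, one per input row, rather than a single JSON document. The plain-Java sketch below uses no Spark at all. It mirrors the same group-by-product step on the sample rows, which may help when checking the expected values by hand or in a unit test. The `JsonAggregation` class, the `Purchase` class and the `aggregate` helper are hypothetical names introduced for illustration. The hand-rolled JSON formatting assumes values without quotes or backslashes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class JsonAggregation {

    // Hypothetical stand-in for the Bean class from the question.
    static final class Purchase {
        final String product;
        final String purchaser;
        final int amount;

        Purchase(String product, String purchaser, int amount) {
            this.product = product;
            this.purchaser = purchaser;
            this.amount = amount;
        }
    }

    // Groups rows by product and renders each row as a JSON object string.
    // Assumption: purchaser names need no escaping (no quotes or backslashes).
    static Map<String, List<String>> aggregate(List<Purchase> rows) {
        Map<String, List<String>> byProduct = new LinkedHashMap<>();
        for (Purchase p : rows) {
            String json = "{\"purchaser\":\"" + p.purchaser + "\",\"amount\":" + p.amount + "}";
            byProduct.computeIfAbsent(p.product, k -> new ArrayList<>()).add(json);
        }
        return byProduct;
    }

    public static void main(String[] args) {
        List<Purchase> data = Arrays.asList(
            new Purchase("book", "John", 59),
            new Purchase("book", "Bj\u00f6rn", 61), // \u00f6 is the o-umlaut in the sample name
            new Purchase("tv", "Roger", 36));
        // Prints one line per product followed by its list of JSON strings.
        aggregate(data).forEach((product, json) -> System.out.println(product + " " + json));
    }
}
```

The `LinkedHashMap` keeps products in first-seen order, whereas Spark's `groupBy` gives no ordering guarantee for the resulting rows, so any comparison against Spark output should ignore row order.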