Importing a JSON array into Hive

Views: 81 Published: 2020/11/22 2:41:04 arrays json hadoop hive

Problem description

I'm trying to import the following JSON into Hive:

[{"time":1521115600,"latitude":44.3959,"longitude":26.1025,"altitude":53,"pm1":21.70905,"pm25":16.5,"pm10":14.60085,"gas1":0,"gas2":0.12,"gas3":0,"gas4":0,"temperature":null,"pressure":0,"humidity":0,"noise":0},{"time":1521115659,"latitude":44.3959,"longitude":26.1025,"altitude":53,"pm1":24.34045,"pm25":18.5,"pm10":16.37065,"gas1":0,"gas2":0.08,"gas3":0,"gas4":0,"temperature":null,"pressure":0,"humidity":0,"noise":0},{"time":1521115720,"latitude":44.3959,"longitude":26.1025,"altitude":53,"pm1":23.6826,"pm25":18,"pm10":15.9282,"gas1":0,"gas2":0,"gas3":0,"gas4":0,"temperature":null,"pressure":0,"humidity":0,"noise":0},{"time":1521115779,"latitude":44.3959,"longitude":26.1025,"altitude":53,"pm1":25.65615,"pm25":19.5,"pm10":17.25555,"gas1":0,"gas2":0.04,"gas3":0,"gas4":0,"temperature":null,"pressure":0,"humidity":0,"noise":0}]

CREATE TABLE json_serde (
 s array<struct<time: timestamp, latitude: string, longitude: string, pm1: string>>)
 ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
 WITH SERDEPROPERTIES (
     'mapping.value' = 'value'
 )
 STORED AS TEXTFILE
location '/user/hduser';

The import works, but if I run

Select * from json_serde;

it returns only the first element of each file stored under /user/hduser on Hadoop.

Is there any good documentation on working with JSON arrays?

Recommended answer

If you cannot change your input file format, you can load the data directly in Spark, work with it there, and write it back to a Hive table once the data is finalized.

scala> val myjs = spark.read.format("json").option("path","file:///root/tmp/test5").load()
myjs: org.apache.spark.sql.DataFrame = [altitude: bigint, gas1: bigint ... 13 more fields]

scala> myjs.show()
+--------+----+----+----+----+--------+--------+---------+-----+--------+--------+----+--------+-----------+----------+
|altitude|gas1|gas2|gas3|gas4|humidity|latitude|longitude|noise|     pm1|    pm10|pm25|pressure|temperature|      time|
+--------+----+----+----+----+--------+--------+---------+-----+--------+--------+----+--------+-----------+----------+
|      53|   0|0.12|   0|   0|       0| 44.3959|  26.1025|    0|21.70905|14.60085|16.5|       0|       null|1521115600|
|      53|   0|0.08|   0|   0|       0| 44.3959|  26.1025|    0|24.34045|16.37065|18.5|       0|       null|1521115659|
|      53|   0| 0.0|   0|   0|       0| 44.3959|  26.1025|    0| 23.6826| 15.9282|18.0|       0|       null|1521115720|
|      53|   0|0.04|   0|   0|       0| 44.3959|  26.1025|    0|25.65615|17.25555|19.5|       0|       null|1521115779|
+--------+----+----+----+----+--------+--------+---------+-----+--------+--------+----+--------+-----------+----------+


scala> myjs.write.json("file:///root/tmp/test_output")

Alternatively, you can save the data directly as a Hive table:

scala> myjs.createOrReplaceTempView("myjs")

scala> spark.sql("select * from myjs").show()

scala> spark.sql("create table tax.myjs_hive as select * from myjs")
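
If the input file format can be changed after all, one common option is to rewrite each file as newline-delimited JSON, because Hive's text-based JSON SerDes read one record per line. The following is only a minimal sketch in Python, not part of the original answer. It assumes each file holds a single top-level array that fits in memory; the function name json_array_to_lines and the file names in the usage note are hypothetical.

```python
import json
from pathlib import Path


def json_array_to_lines(src: Path, dst: Path) -> int:
    """Rewrite a file holding one top-level JSON array as one object per line.

    Returns the number of records written. Assumes the whole array fits
    in memory (an assumption of this sketch, not a Hive requirement).
    """
    records = json.loads(src.read_text(encoding="utf-8"))
    if not isinstance(records, list):
        raise ValueError(f"{src} does not contain a top-level JSON array")

    with dst.open("w", encoding="utf-8") as out:
        for record in records:
            # Without an indent argument, json.dumps emits no line breaks,
            # so each record occupies exactly one physical line.
            out.write(json.dumps(record, separators=(",", ":")) + "\n")
    return len(records)
```

Calling json_array_to_lines(Path("test5"), Path("test5.jsonl")) would produce one line per measurement. A table over such files would then likely declare time, latitude, longitude and so on as top-level columns instead of a single array<struct<...>> column.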
