Where is the syntax error in this simple Hive query with STRUCT?

Published: 2021/12/25 20:28:19. Tags: hadoop, twitter, hive, hql

Question

Let's import a simple table in Hive:

hive> CREATE EXTERNAL TABLE tweets (id BIGINT, id_str STRING, user STRUCT<id:BIGINT, screen_name:STRING>)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.JsonSerde'
LOCATION '/projets/tweets';

OK
Time taken: 2.253 seconds

hive> describe tweets.user;

OK
id                      bigint                  from deserializer
screen_name             string                  from deserializer
Time taken: 1.151 seconds, Fetched: 2 row(s)

I cannot figure out where the syntax error is here:

hive> select user.id from tweets limit 5;
OK
Failed with exception java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating user.id
Time taken: 0.699 seconds

I am using Hive version 1.2.1.

Recommended answer

I finally found the answer. The problem is not the query syntax but, apparently, the JAR used to serialize and deserialize the JSON. The default (Apache) SerDe does not handle my data well.

I tried all of these typical JARs (in parentheses, the class used for 'ROW FORMAT SERDE'):

hive-json-serde-0.2.jar (org.apache.hadoop.hive.contrib.serde2.JsonSerde)
hive-serdes-1.0-SNAPSHOT.jar (com.cloudera.hive.serde.JSONSerDe)
hive-serde-1.2.1.jar (org.apache.hadoop.hive.serde2.DelimitedJSONSerDe)
hive-serde-1.2.1.jar (org.apache.hadoop.hive.serde2.avro.AvroSerDe)

Each of them gave me a different kind of error. I list the errors here so the next person can search for them:

Failed with exception java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating user.id
java.lang.ClassCastException: org.json.JSONObject cannot be cast to [Ljava.lang.Object;
Failed with exception java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.Long
Failed with exception java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException: DelimitedJSONSerDe cannot deserialize.
Failed with exception java.io.IOException:org.apache.hadoop.hive.serde2.avro.AvroSerdeException: Expecting a AvroGenericRecordWritable

Finally, the JAR that works is json-serde-1.3-jar-with-dependencies.jar, which can be found here. It works with STRUCT and can even ignore some malformed JSON. I also had to use this class when creating the table:

 ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
 WITH SERDEPROPERTIES ("ignore.malformed.json" = "true")
 LOCATION ...
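
For reference, the fragment above can be combined with the table definition from the question into one complete session. This is only a sketch: the JAR path is a hypothetical placeholder, and the DROP TABLE step assumes the table from the question already exists and should be recreated with the new SerDe.

```sql
-- Hypothetical local path; point it at wherever the JAR was downloaded.
ADD JAR /path/to/json-serde-1.3-jar-with-dependencies.jar;

-- The table is EXTERNAL, so dropping it removes only the metadata,
-- not the files under /projets/tweets.
DROP TABLE IF EXISTS tweets;

-- Same columns and location as in the question; only the SerDe changes.
CREATE EXTERNAL TABLE tweets (
  id BIGINT,
  id_str STRING,
  user STRUCT<id:BIGINT, screen_name:STRING>
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES ("ignore.malformed.json" = "true")
LOCATION '/projets/tweets';

-- The query from the question, unchanged.
SELECT user.id FROM tweets LIMIT 5;
```

The ADD JAR statement registers the JAR for the current session only, so it has to be repeated in each new session unless the JAR is installed on Hive's auxiliary classpath.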

If needed, the JAR can be recompiled from here or here. I tried the first repository, and it compiled fine for me after I added the necessary libraries. That repository has also been updated recently.
