Hive: using SERDEPROPERTIES gives an error


Question

I am trying to create a Hive table so that the files in HDFS are UTF-8 encoded. The problem is that the query gives an error, and I am not sure what I am doing wrong.

DROP TABLE IF EXISTS output_2057565014;
CREATE TABLE temp.output_2057565014
ROW FORMAT DELIMITED
FIELDS TERMINATED BY 'ธ'
COLLECTION ITEMS TERMINATED BY '|'
MAP KEYS TERMINATED BY '$'
with serdeproperties('serialization.encoding'='UTF-8') 
LOCATION '/tmp/test-2057565014' 
AS
SELECT * from temp.abc

Recommended answer

"The query gives an error": yes, but what kind? Reading that error message would probably help. Without it, this is just guesswork.
So, let's guess.


The ROW FORMAT DELIMITED clause implicitly assumes that delimiter characters are single ASCII-7 characters, either defined explicitly (when printable) or by their octal code.

Therefore FIELDS TERMINATED BY 'ธ' is not valid.

You can try different workarounds: changing the delimiter in the upstream file creation process; changing the delimiter in situ before loading to HDFS (e.g. with a good old sed command); or trying a hard-coded column mapping with RegexSerDe (cf. the Hive Language Manual, DDL / CREATE TABLE, under "Row Formats & SerDe").
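
As a rough illustration of the second workaround (changing the delimiter before the file is loaded to HDFS), here is a minimal Python sketch, used in place of the sed command mentioned above. The function names are hypothetical, and the choice of Ctrl-A (`\x01`, written '\001' in Hive, where it is the default field delimiter) is an assumption. The sketch also assumes the input is UTF-8 text in which `\x01` does not already occur.

```python
import io

OLD_DELIM = "\u0e18"  # 'ธ', the multi-byte delimiter from the question
NEW_DELIM = "\x01"    # Ctrl-A; assumed not to appear in the data


def replace_delimiter(src, dst, old=OLD_DELIM, new=NEW_DELIM):
    """Copy text lines from src to dst, swapping the field delimiter.

    Returns the number of lines copied. Raises ValueError if the new
    delimiter already occurs in the input, since that would silently
    change the column layout.
    """
    count = 0
    for line in src:
        if new in line:
            raise ValueError(f"line {count + 1} already contains the new delimiter")
        dst.write(line.replace(old, new))
        count += 1
    return count


def convert_file(in_path, out_path):
    """Hypothetical helper: rewrite a local UTF-8 file before uploading it."""
    # newline="" keeps the original line endings untouched.
    with open(in_path, encoding="utf-8", newline="") as src, \
         open(out_path, "w", encoding="utf-8", newline="") as dst:
        return replace_delimiter(src, dst)


if __name__ == "__main__":
    # Small in-memory demonstration with two made-up rows.
    sample = io.StringIO("a\u0e18b\u0e18c\n1\u0e182\u0e183\n")
    result = io.StringIO()
    replace_delimiter(sample, result)
    print(repr(result.getvalue()))
```

With the files rewritten this way, the table definition could declare FIELDS TERMINATED BY '\001', which fits the single-ASCII-character constraint described above.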
