Hive: using SERDEPROPERTIES gives an error
Problem description
I am trying to create a Hive table whose files on HDFS are stored in UTF-8 format. The query fails with an error, and I am not sure what I am doing wrong.
DROP TABLE IF EXISTS output_2057565014;
CREATE TABLE temp.output_2057565014
ROW FORMAT DELIMITED
FIELDS TERMINATED BY 'ธ'
COLLECTION ITEMS TERMINATED BY '|'
MAP KEYS TERMINATED BY '$'
with serdeproperties('serialization.encoding'='UTF-8')
LOCATION '/tmp/test-2057565014'
AS
SELECT * from temp.abc
Recommended answer
"The query is giving error": yes, but what kind of error? Reading the error message would help. Without it, this is just guesswork.
So, let's guess.
The ROW FORMAT DELIMITED clause implicitly assumes that delimiter characters are single ASCII-7 characters, either defined explicitly (when printable) or by their octal code.
So FIELDS TERMINATED BY 'ธ' is not valid.
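To make the single-character ASCII-7 constraint concrete, here is a minimal Python sketch that checks each delimiter from the question against it. The helper name is_single_ascii7 is hypothetical and only illustrates the rule described above; it is not part of Hive.

```python
def is_single_ascii7(delim: str) -> bool:
    """Return True if delim is exactly one character in the 7-bit ASCII range."""
    return len(delim) == 1 and ord(delim) < 128


if __name__ == "__main__":
    # Delimiters from the question, plus Ctrl-A as an ASCII control character.
    for d in ["ธ", "|", "$", "\x01"]:
        encoded = d.encode("utf-8")
        print(repr(d), is_single_ascii7(d), len(encoded), encoded.hex())
```

The Thai character 'ธ' (U+0E18) takes three bytes in UTF-8, whereas '|' and '$' take one byte each, which is consistent with the first delimiter being the one that causes trouble.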
You can try different workarounds: changing the delimiter in the upstream file creation process; changing the delimiter in situ before loading to HDFS (e.g. with a good old sed command); or trying a hard-coded column mapping with RegexSerDe (cf. Language Manual DDL / CREATE TABLE, under "Row Formats & SerDe").
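As a rough illustration of the "change the delimiter before loading" workaround, the sketch below does in Python what a sed substitution would do. The function names and file paths are hypothetical, and the choice of Ctrl-A as the replacement delimiter is an assumption; any single ASCII character that never occurs in the data would do.

```python
OLD_DELIM = "ธ"     # multi-byte delimiter produced by the upstream process
NEW_DELIM = "\x01"  # assumption: Ctrl-A, written as '\001' in HiveQL


def replace_delimiter(line: str, old: str = OLD_DELIM, new: str = NEW_DELIM) -> str:
    """Swap the field delimiter in one line, refusing if the new one already occurs."""
    if new in line:
        raise ValueError("replacement delimiter already present in data")
    return line.replace(old, new)


def convert_file(src: str, dst: str, old: str = OLD_DELIM, new: str = NEW_DELIM) -> int:
    """Rewrite a UTF-8 text file with the new delimiter; return the number of lines."""
    count = 0
    # newline="" keeps the original line endings untouched.
    with open(src, encoding="utf-8", newline="") as fin, \
         open(dst, "w", encoding="utf-8", newline="") as fout:
        for line in fin:
            fout.write(replace_delimiter(line, old, new))
            count += 1
    return count


if __name__ == "__main__":
    # Hypothetical local paths; the converted file would then be copied to HDFS.
    print(convert_file("input_thai_delim.txt", "input_ctrl_a.txt"))
```

With the file converted this way, the table definition could then declare FIELDS TERMINATED BY '\001' instead of the Thai character.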