Hive UTF-8 encoding: how many characters are supported?
Question
Hi, the problem is as follows: the data I want to insert into a Hive table contains Latin words and is in UTF-8 encoded format, but Hive still does not display it properly.
Actual data:
Data as inserted in Hive:
I also changed the encoding of the table to UTF-8, but the issue is the same. The Hive DDL and commands are below.
CREATE TABLE IF NOT EXISTS test6
(
CONTACT_RECORD_ID string,
ACCOUNT string,
CUST string,
NUMBER string,
NUMBER1 string,
NUMBER2 string,
NUMBER3 string,
NUMBER4 string,
NUMBER5 string,
NUMBER6 string,
NUMBER7 string,
LIST string
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|';
ALTER TABLE test6 SET serdeproperties ('serialization.encoding'='UTF-8');
Does Hive support only the first 128 characters of UTF-8? Please advise.
Recommended answer
This may not be the ideal solution, but it works. Hive somehow does not seem to treat the data as UTF-8. Try creating the table with the following parameters:
CREATE TABLE testjoins.yt_sample_mapping_1(
`col1` string,
`col2` string,
`col3` string)
ROW FORMAT SERDE "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe"
WITH SERDEPROPERTIES ( "separatorChar" = ",",
"quoteChar" = "\"",
"escapeChar" = "\\",
"serialization.encoding"='ISO-8859-1')
TBLPROPERTIES ( 'store.charset'='ISO-8859-1',
'retrieve.charset'='ISO-8859-1');
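Applied to the pipe-delimited table from the question, the same idea might look like the sketch below. It assumes the source file is really encoded as ISO-8859-1 (Latin-1) rather than UTF-8, which the question does not confirm. The table name test6_latin1 and the shortened column list are hypothetical, used only for illustration. The sketch relies on 'field.delim' and 'serialization.encoding', the properties LazySimpleSerDe reads for the delimiter and the character set (the latter in Hive 0.14 and later).

```sql
-- Sketch only: assumes the input file is ISO-8859-1, not UTF-8.
-- test6_latin1 is a hypothetical name; the column list is abbreviated from test6.
CREATE TABLE IF NOT EXISTS test6_latin1
(
  CONTACT_RECORD_ID string,
  ACCOUNT string,
  CUST string
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES (
  'field.delim' = '|',                      -- same delimiter as the original DDL
  'serialization.encoding' = 'ISO-8859-1'   -- decode the file's bytes as Latin-1
);

-- Alternatively, change the declared encoding on the existing table.
ALTER TABLE test6 SET SERDEPROPERTIES ('serialization.encoding'='ISO-8859-1');
```

If the file turns out to be valid UTF-8 after all, this setting would garble the accented characters instead, so it is worth checking the file's actual encoding before choosing a value.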