Hive UTF-8 encoding: how many characters are supported?

Problem description

Hi, the problem is as follows: the data I want to insert into a Hive table contains Latin words and is in UTF-8 encoded format, but Hive still does not display it properly.

Actual data:

Data inserted in Hive:

I also changed the encoding of the table to UTF-8, but the issue remains. Below are the Hive DDL and commands:

CREATE TABLE IF NOT EXISTS test6
(
CONTACT_RECORD_ID    string,
ACCOUNT    string,
CUST    string,
NUMBER    string,
NUMBER1    string,
NUMBER2    string,
NUMBER3    string,
NUMBER4    string,
NUMBER5    string,
NUMBER6    string,
NUMBER7    string,
LIST    string
)
ROW FORMAT DELIMITED 
FIELDS TERMINATED BY '|';
ALTER TABLE test6 SET serdeproperties ('serialization.encoding'='UTF-8');

Does Hive support only the first 128 characters of UTF-8? Please advise.

Recommended answer

This may not be the ideal solution, but it works. Hive somehow does not seem to treat the data as UTF-8. Try creating the table with the following parameters:

CREATE TABLE testjoins.yt_sample_mapping_1(
   `col1` string,
   `col2` string,
   `col3` string)
   ROW FORMAT SERDE "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe"
   WITH SERDEPROPERTIES ( "separatorChar" = ",", 
    "quoteChar" = "\"", 
    "escapeChar" = "\\", 
    "serialization.encoding"='ISO-8859-1') 
    TBLPROPERTIES ( 'store.charset'='ISO-8859-1', 
    'retrieve.charset'='ISO-8859-1');
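
One plausible reason the ISO-8859-1 setting helps (an assumption, since the question does not show the file's raw bytes) is that the input file is actually stored as ISO-8859-1 rather than UTF-8. The first 128 code points are the same single bytes in both encodings, so plain ASCII text looks fine either way, while accented Latin letters differ. The Python sketch below illustrates this with a hypothetical sample row; the string and the helper name `decode_as` are made up for illustration and do not come from the original post.

```python
# Hypothetical sample row; the real data from the question is not shown.
sample = "café|señor"

latin1_bytes = sample.encode("iso-8859-1")  # one byte per character
utf8_bytes = sample.encode("utf-8")         # accented letters take two bytes


def decode_as(raw: bytes, encoding: str) -> str:
    """Decode raw bytes, substituting U+FFFD for invalid sequences."""
    return raw.decode(encoding, errors="replace")


# ASCII-only text is byte-identical in both encodings.
ascii_only = "cafe|senor"
same_ascii = ascii_only.encode("utf-8") == ascii_only.encode("iso-8859-1")

# ISO-8859-1 bytes read as UTF-8: the accented letters are invalid
# sequences and turn into replacement characters.
misread = decode_as(latin1_bytes, "utf-8")

# ISO-8859-1 bytes read as ISO-8859-1: the text round-trips correctly.
correct = decode_as(latin1_bytes, "iso-8859-1")

print(same_ascii)
print(misread)
print(correct)
```

If the file really is UTF-8, this mismatch would not apply, so inspecting the file's bytes (for example with a hex dump) is a reasonable first check before changing the table's encoding.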
