Why do all columns get created as string when I use OpenCSVSerde in Hive?

Problem description

I am trying to create a table with some integer and date columns using the OpenCSVSerde, but the columns get converted to String. Is this the expected outcome? As a workaround, I do an explicit type cast after this step, which makes the complete run slower.

hive> create external table if not exists response (response_id int, lead_id int, creat_date date)
    > ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
    > WITH SERDEPROPERTIES ('quoteChar' = '"', 'separatorChar' = '\,', 'serialization.encoding' = 'UTF-8', 'escapeChar' = '~')
    > location '/prod/hive/db/response'
    > TBLPROPERTIES ("serialization.null.format" = "");
OK
Time taken: 0.396 seconds
hive> describe formatted response;
OK
# col_name              data_type               comment

response_id             string                  from deserializer
lead_id                 string                  from deserializer
creat_date              string                  from deserializer

Source code that explains the change of data type to String.

Recommended answer

This is a known limitation of the CSVSerDe. It treats all columns as type String. Even if you create a table with non-string column types using this SerDe, the DESCRIBE TABLE output shows string column types, because the type information is retrieved from the SerDe. To convert columns to the desired types, you can create a view over the table that CASTs each column to the desired type.
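The view mentioned above could look roughly like the following. This is a minimal sketch, not a tested definition: the view name response_typed is hypothetical, and it assumes creat_date is stored as yyyy-MM-dd text, since Hive's CAST to DATE yields NULL for strings it cannot parse.

```sql
-- Sketch only: response_typed is a hypothetical view name.
-- The base table and column names are taken from the question.
CREATE VIEW IF NOT EXISTS response_typed AS
SELECT
  CAST(response_id AS INT) AS response_id,  -- unparseable or empty strings become NULL
  CAST(lead_id AS INT)     AS lead_id,
  CAST(creat_date AS DATE) AS creat_date    -- assumes yyyy-MM-dd formatted text
FROM response;
```

Queries would then read from response_typed instead of response. The cast is still evaluated at query time, so this mostly tidies up the workaround from the question; if the cost matters, one option is to write the cast result once into a table stored with a typed format.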

See here: CSVSerde. That Confluence page is about CSVSerDe, but it uses Open-CSV.

Also see here: https://docs.aws.amazon.com/athena/latest/ug/csv.html

And here: Hive OpenCSVSerde changing table definition
