Why do all columns get created as string when I use OpenCSVSerde in Hive?


Problem description

I am trying to create a table using the OpenCSVSerde with some integer and date columns, but the columns get converted to string. Is this the expected outcome? As a workaround, I do an explicit type cast after this step, which makes the complete run slower.

hive> create external table if not exists response(
    >   response_id int,
    >   lead_id int,
    >   creat_date date)
    > ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
    > WITH SERDEPROPERTIES (
    >   'quoteChar' = '"',
    >   'separatorChar' = ',',
    >   'serialization.encoding' = 'UTF-8',
    >   'escapeChar' = '~')
    > location '/prod/hive/db/response'
    > TBLPROPERTIES ("serialization.null.format"="");
OK
Time taken: 0.396 seconds
hive> describe formatted response;
OK
# col_name              data_type               comment

response_id             string                  from deserializer
lead_id                 string                  from deserializer
creat_date              string                  from deserializer

Source code that explains the change of data type to string.

Recommended answer

This is a known limitation of the CSVSerDe. It treats all columns as type string. Even if you create a table with non-string column types using this SerDe, the DESCRIBE TABLE output shows string column types, because the type information is retrieved from the SerDe. To convert the columns to the desired types, you can create a view over the table that CASTs each column to the desired type.
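The view-based workaround could look roughly like the sketch below. It reuses the `response` table and column names from the question; the view name `response_typed` is hypothetical, and the DATE cast assumes the CSV stores dates as `yyyy-MM-dd` (values that cannot be parsed become NULL).

```sql
-- Sketch only: `response_typed` is a hypothetical view name.
-- The base table `response` keeps its string columns; the view exposes typed ones.
CREATE VIEW IF NOT EXISTS response_typed AS
SELECT
  CAST(response_id AS INT)  AS response_id,
  CAST(lead_id     AS INT)  AS lead_id,
  CAST(creat_date  AS DATE) AS creat_date  -- assumes yyyy-MM-dd formatted strings
FROM response;
```

Running `describe response_typed;` should then report int, int, and date for the three columns. The casts are still evaluated each time the view is queried, so this tidies up the schema for consumers rather than removing the conversion cost.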

See here: CSVSerde. This Confluence page is about CSVSerDe, but that SerDe uses Open-CSV.

See also: https://docs.aws.amazon.com/athena/latest/ug/csv.html

And here: Hive "OpenCSVSerde" changes your table definition
