How to build a Hive table on data which is separated by '^P' delimiter
Problem description
My query is:
CREATE EXTERNAL TABLE gateway_staging (
poll int,
total int,
transaction_id int,
create_time timestamp,
update_time timestamp
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '^P';
(I am not sure whether '^P' can be used as a delimiter, but I tried it anyway.)
When I load the data into the Hive table, every field shows as 'none'.
The data looks like:
4307421698^P200^P138193920770^P2017-03-08 02:46:18.021204^P2017-03-08 02:46:18.021204
Please help me out.
Solution
Here are the options:
... fields terminated by '\020' (Octal)
... fields terminated by '16' (Decimal)
... fields terminated by '\u0010' (Hexadecimal)
Please note that there was a bug related to Unicode literals ('\u0010') that is supposed to be fixed in version 2.1, so the third option won't work on earlier versions.
https://issues.apache.org/jira/browse/HIVE-13434
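Putting the octal form into the full statement from the question gives something like the sketch below. It assumes the ^P in the file is the single control character Ctrl-P (ASCII 16), not a literal caret followed by the letter P. The BIGINT columns are also an assumption that is not part of the original query: the sample values 4307421698 and 138193920770 are larger than a 32-bit INT can hold, so those columns would likely still come back as NULL if declared as int.

```sql
-- Sketch only: same table as in the question, with the delimiter
-- written as an octal escape for Ctrl-P (ASCII 16).
CREATE EXTERNAL TABLE gateway_staging (
  poll BIGINT,            -- assumption: widened from int, sample value 4307421698
  total INT,
  transaction_id BIGINT,  -- assumption: widened from int, sample value 138193920770
  create_time TIMESTAMP,
  update_time TIMESTAMP
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\020';
```

If the delimiter is picked up correctly, running SELECT * FROM gateway_staging LIMIT 1 after loading should show five separate values, not a single row of NULLs.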