How to build a Hive table on data separated by the '^P' delimiter

Problem description

My query is:

CREATE EXTERNAL TABLE gateway_staging (
  poll int,
  total int,
  transaction_id int,
  create_time timestamp,
  update_time timestamp
  )
  ROW FORMAT DELIMITED FIELDS TERMINATED BY '^P';

(I am not sure whether '^P' can be used as a delimiter but tried it out)

When I load the data into the Hive table, all fields show as 'none'.

The data looks like:

4307421698^P200^P138193920770^P2017-03-08 02:46:18.021204^P2017-03-08 02:46:18.021204

Please help me out.

Solution

Here are the options:

  • ... fields terminated by '\020' (Octal)
  • ... fields terminated by '16' (Decimal)
  • ... fields terminated by '\u0010' (Hexadecimal)

Please note that there was a bug related to Unicode literals ('\u0010') that is supposed to be fixed in version 2.1, so the third option won't work on earlier versions. https://issues.apache.org/jira/browse/HIVE-13434
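
The three options above are different spellings of the same character: Ctrl-P, ASCII code 16. As a quick sanity check outside Hive, the following Python sketch confirms that the notations agree and splits the sample row on that character. It assumes the data file contains the real control character rather than the two visible characters `^` and `P`; the helper name `split_row` is made up for illustration.

```python
# Ctrl-P (displayed as ^P by many editors) is ASCII 16, a control character.
CTRL_P = chr(16)

# The three spellings from the list of options name the same character.
assert CTRL_P == "\020"      # octal escape
assert CTRL_P == "\u0010"    # Unicode (hexadecimal) escape
assert ord(CTRL_P) == 16     # decimal code


def split_row(line):
    """Split one raw line on the Ctrl-P control character (hypothetical helper)."""
    return line.rstrip("\n").split(CTRL_P)


# Sample row from the question, rebuilt with the real control character.
sample = CTRL_P.join([
    "4307421698",
    "200",
    "138193920770",
    "2017-03-08 02:46:18.021204",
    "2017-03-08 02:46:18.021204",
])

fields = split_row(sample)
print(fields)

# A literal caret followed by "P" is two ordinary characters, not the
# control character, so it does not split.
print(split_row("a^Pb"))
```

If the second call leaves the row unsplit while the first yields five fields, the delimiter in the data is the single control character, which would explain why the two-character literal '^P' in the table definition does not match it.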
