Unable to work with unixtimestamp column type in string data type

Question


I have a Hive table for loading JSON data. There are two values in my JSON, both with data type string. If I keep the corresponding columns as bigint, then a select on this table gives the error below:

Failed with exception java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException: org.codehaus.jackson.JsonParseException: Current token (VALUE_STRING) not numeric, can not use numeric value accessors
 at [Source: java.io.ByteArrayInputStream@3b6c740b; line: 1, column: 21]

If I change them both to string, then it works OK.

Now, because these columns are strings, I am not able to use the from_unixtime function on them.

If I try to alter these columns' data type from string to bigint, I get the error below:

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table. The following columns have types incompatible with the existing columns in their respective positions : uploadtimestamp
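
For reference, an alter statement of this shape triggers the error above; this is a hypothetical reconstruction, not the exact command:

-- hypothetical: change the column type in place
-- (Hive blocks incompatible type changes such as string -> bigint by default,
--  via hive.metastore.disallow.incompatible.col.type.changes)
alter table ABC change column uploadTimeStamp uploadTimeStamp bigint;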

Below is my create table statement:

create table ABC
(
    uploadTimeStamp bigint
   ,PDID            string

   ,data            array
                    <
                        struct
                        <
                            Data:struct
                            <
                                unit:string
                               ,value:string
                               ,heading:string
                               ,loc:string
                               ,loc1:string
                               ,loc2:string
                               ,loc3:string
                               ,speed:string
                               ,xvalue:string
                               ,yvalue:string
                               ,zvalue:string
                            >
                           ,Event:string
                           ,PDID:string
                           ,`Timestamp`:string
                           ,Timezone:string
                           ,Version:string
                           ,pii:struct<dummy:string>
                        >
                    >
)
row format serde 'org.apache.hive.hcatalog.data.JsonSerDe' 
stored as textfile;

My JSON:

{"uploadTimeStamp":"1488793268598","PDID":"123","data":[{"Data":{"unit":"rpm","value":"100"},"EventID":"E1","PDID":"123","Timestamp":1488793268598,"Timezone":330,"Version":"1.0","pii":{}},{"Data":{"heading":"N","loc":"false","loc1":"16.032425","loc2":"80.770587","loc3":"false","speed":"10"},"EventID":"Location","PDID":"skga06031430gedvcl1pdid2367","Timestamp":1488793268598,"Timezone":330,"Version":"1.1","pii":{}},{"Data":{"xvalue":"1.1","yvalue":"1.2","zvalue":"2.2"},"EventID":"AccelerometerInfo","PDID":"skga06031430gedvcl1pdid2367","Timestamp":1488793268598,"Timezone":330,"Version":"1.0","pii":{}},{"EventID":"FuelLevel","Data":{"value":"50","unit":"percentage"},"Version":"1.0","Timestamp":1488793268598,"PDID":"skga06031430gedvcl1pdid2367","Timezone":330},{"Data":{"unit":"kmph","value":"70"},"EventID":"VehicleSpeed","PDID":"skga06031430gedvcl1pdid2367","Timestamp":1488793268598,"Timezone":330,"Version":"1.0","pii":{}}]}

Is there any way I can convert this string unix timestamp to standard time, or work with bigint for these columns?

Solution

  1. If you are talking about Timestamp and Timezone, then you can define them as int/bigint types.
    If you look at their definition you'll see that there are no qualifiers (") around the values, therefore they are of numeric types within the JSON doc:

    "Timestamp":1488793268598,"Timezone":330


create external table myjson
(
    uploadTimeStamp string
   ,PDID            string

   ,data            array
                    <
                        struct
                        <
                            Data:struct
                            <
                                unit:string
                               ,value:string
                               ,heading:string
                               ,loc3:string
                               ,loc:string
                               ,loc1:string
                               ,loc4:string
                               ,speed:string
                               ,x:string
                               ,y:string
                               ,z:string
                            >
                           ,EventID:string
                           ,PDID:string
                           ,`Timestamp`:bigint
                           ,Timezone:smallint
                           ,Version:string
                           ,pii:struct<dummy:string>
                        >
                    >
)
row format serde 'org.apache.hive.hcatalog.data.JsonSerDe' 
stored as textfile
location '/tmp/myjson'
;
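
After placing a sample JSON document under the table's location, a plain select returns the parsed rows shown below; a minimal sketch (the local file path is hypothetical):

-- copy a sample document into the table's location (local path is hypothetical)
load data local inpath '/tmp/sample.json' into table myjson;

-- the JsonSerDe now parses `Timestamp` as bigint and Timezone as smallint
select * from myjson;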


+------------------------+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| myjson.uploadtimestamp | myjson.pdid |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         myjson.data                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
+------------------------+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|          1486631318873 |         123 | [{"data":{"unit":"rpm","value":"0","heading":null,"loc3":null,"loc":null,"loc1":null,"loc4":null,"speed":null,"x":null,"y":null,"z":null},"eventid":"E1","pdid":"123","timestamp":1486631318873,"timezone":330,"version":"1.0","pii":{"dummy":null}},{"data":{"unit":null,"value":null,"heading":"N","loc3":"false","loc":"14.022425","loc1":"78.760587","loc4":"false","speed":"10","x":null,"y":null,"z":null},"eventid":"E2","pdid":"123","timestamp":1486631318873,"timezone":330,"version":"1.1","pii":{"dummy":null}},{"data":{"unit":null,"value":null,"heading":null,"loc3":null,"loc":null,"loc1":null,"loc4":null,"speed":null,"x":"1.1","y":"1.2","z":"2.2"},"eventid":"E3","pdid":"123","timestamp":1486631318873,"timezone":330,"version":"1.0","pii":{"dummy":null}},{"data":{"unit":"percentage","value":"50","heading":null,"loc3":null,"loc":null,"loc1":null,"loc4":null,"speed":null,"x":null,"y":null,"z":null},"eventid":"E4","pdid":"123","timestamp":1486631318873,"timezone":330,"version":"1.0","pii":null},{"data":{"unit":"kmph","value":"70","heading":null,"loc3":null,"loc":null,"loc1":null,"loc4":null,"speed":null,"x":null,"y":null,"z":null},"eventid":"E5","pdid":"123","timestamp":1486631318873,"timezone":330,"version":"1.0","pii":{"dummy":null}}] |
+------------------------+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+


  2. Even if you have defined Timestamp as a string, you can still cast it to a bigint before using it in a function that requires a bigint.

    cast (`Timestamp` as bigint)


hive> with t as (select '0' as `timestamp`) select from_unixtime(`timestamp`) from t;

FAILED: SemanticException [Error 10014]: Line 1:45 Wrong arguments 'timestamp': No matching method for class org.apache.hadoop.hive.ql.udf.UDFFromUnixTime with (string). Possible choices: FUNC(bigint) FUNC(bigint, string) FUNC(int) FUNC(int, string)

hive> with t as (select '0' as `timestamp`) select from_unixtime(cast(`timestamp` as bigint)) from t;
OK
1970-01-01 00:00:00
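
Note that sample values such as 1488793268598 appear to be epoch milliseconds, while from_unixtime expects seconds, so you may need to divide by 1000 first; a sketch:

-- assuming the values are epoch milliseconds, truncate to seconds first
hive> select from_unixtime(cast(1488793268598/1000 as bigint));

The result is rendered in the session's time zone.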
