Error when querying avro-backed hive table: java.lang.IllegalArgumentException


Problem description

I am trying to create a Hive table on Azure HDInsight from an Avro file exported from raw Google Analytics data in BigQuery.

It seems to work. I can create the table, and there are no errors when I run DESCRIBE. But when I try to select results, even if I select only two non-nested columns, I get an error: "java.lang.IllegalArgumentException".

Here's how I created the table:

DROP TABLE IF EXISTS ga_sessions_20150106;
CREATE EXTERNAL TABLE IF NOT EXISTS ga_sessions_20150106
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION '/upload/ga_sessions'
TBLPROPERTIES ('avro.schema.url'='/upload/ga_sessions.avsc');
describe ga_sessions_20150106;

Here's the avro schema:

{"type":"record","name":"root","fields":[{"name":"visitorId","type":["long","null"]},{"name":"visitNumber","type":["long","null"]},{"name":"visitId","type":["long","null"]},{"name":"visitStartTime","type":["long","null"]},{"name":"date","type":["string","null"]},{"name":"totals","type":[{"type":"record","name":"totals","fields":[{"name":"visits","type":["long","null"]},{"name":"hits","type":["long","null"]},{"name":"pageviews","type":["long","null"]},{"name":"timeOnSite","type":["long","null"]},{"name":"bounces","type":["long","null"]},{"name":"transactions","type":["long","null"]},{"name":"transactionRevenue","type":["long","null"]},{"name":"newVisits","type":["long","null"]},{"name":"screenviews","type":["long","null"]},{"name":"uniqueScreenviews","type":["long","null"]},{"name":"timeOnScreen","type":["long","null"]},{"name":"totalTransactionRevenue","type":["long","null"]}]},"null"]},{"name":"trafficSource","type":[{"type":"record","name":"trafficSource","fields":[{"name":"referralPath","type":["string","null"]},{"name":"campaign","type":["string","null"]},{"name":"source","type":["string","null"]},{"name":"medium","type":["string","null"]},{"name":"keyword","type":["string","null"]},{"name":"adContent","type":["string","null"]},{"name":"adwordsClickInfo","type":[{"type":"record","name":"adwordsClickInfo","fields":[{"name":"campaignId","type":["long","null"]},{"name":"adGroupId","type":["long","null"]},{"name":"creativeId","type":["long","null"]},{"name":"criteriaId","type":["long","null"]},{"name":"page","type":["long","null"]},{"name":"slot","type":["string","null"]},{"name":"criteriaParameters","type":["string","null"]},{"name":"gclId","type":["string","null"]},{"name":"customerId","type":["long","null"]},{"name":"adNetworkType","type":["string","null"]},{"name":"targetingCriteria","type":[{"type":"record","name":"targetingCriteria","fields":[{"name":"boomUserlistId","type":["long","null"]}]},"null"]}]},"null"]}]},"null"]},{"name":"device","type":[{"type":"record","name":"device","fields":[{"name":"browser","type":["string","null"]},{"name":"browserVersion","type":["string","null"]},{"name":"operatingSystem","type":["string","null"]},{"name":"operatingSystemVersion","type":["string","null"]},{"name":"isMobile","type":["boolean","null"]},{"name":"mobileDeviceBranding","type":["string","null"]},{"name":"flashVersion","type":["string","null"]},{"name":"javaEnabled","type":["boolean","null"]},{"name":"language","type":["string","null"]},{"name":"screenColors","type":["string","null"]},{"name":"screenResolution","type":["string","null"]},{"name":"deviceCategory","type":["string","null"]}]},"null"]},{"name":"geoNetwork","type":[{"type":"record","name":"geoNetwork","fields":[{"name":"continent","type":["string","null"]},{"name":"subContinent","type":["string","null"]},{"name":"country","type":["string","null"]},{"name":"region","type":["string","null"]},{"name":"metro","type":["string","null"]}]},"null"]},{"name":"customDimensions","type":{"type":"array","items":{"type":"record","name":"customDimensions","fields":[{"name":"index","type":["long","null"]},{"name":"value","type":["string","null"]}]}}},{"name":"hits","type":{"type":"array","items":{"type":"record","name":"hits","fields":[{"name":"hitNumber","type":["long","null"]},{"name":"time","type":["long","null"]},{"name":"hour","type":["long","null"]},{"name":"minute","type":["long","null"]},{"name":"isSecure","type":["boolean","null"]},{"name":"isInteraction","type":["boolean","null"]},{"name":"isEntrance","type":["boolean","null"]},{"name":"isExit","type":["boolean","null"]},{"name":"referer","type":["string","null"]},{"name":"page","type":[{"type":"record","name":"page","fields":[{"name":"pagePath","type":["string","null"]},{"name":"hostname","type":["string","null"]},{"name":"pageTitle","type":["string","null"]},{"name":"searchKeyword","type":["string","null"]},{"name":"searchCategory","type":["string","null"]}]},"null"]},{"name":"transaction","type":[{"type":"record","name":"transaction","fields":[{"name":"transactionId","type":["string","null"]},{"name":"transactionRevenue","type":["long","null"]},{"name":"transactionTax","type":["long","null"]},{"name":"transactionShipping","type":["long","null"]},{"name":"affiliation","type":["string","null"]},{"name":"currencyCode","type":["string","null"]},{"name":"localTransactionRevenue","type":["long","null"]},{"name":"localTransactionTax","type":["long","null"]},{"name":"localTransactionShipping","type":["long","null"]},{"name":"transactionCoupon","type":["string","null"]}]},"null"]},{"name":"item","type":[{"type":"record","name":"item","fields":[{"name":"transactionId","type":["string","null"]},{"name":"productName","type":["string","null"]},{"name":"productCategory","type":["string","null"]},{"name":"productSku","type":["string","null"]},{"name":"itemQuantity","type":["long","null"]},{"name":"itemRevenue","type":["long","null"]},{"name":"currencyCode","type":["string","null"]},{"name":"localItemRevenue","type":["long","null"]}]},"null"]},{"name":"contentInfo","type":[{"type":"record","name":"contentInfo","fields":[{"name":"contentDescription","type":["string","null"]}]},"null"]},{"name":"appInfo","type":[{"type":"record","name":"appInfo","fields":[{"name":"name","type":["string","null"]},{"name":"version","type":["string","null"]},{"name":"id","type":["string","null"]},{"name":"installerId","type":["string","null"]},{"name":"appInstallerId","type":["string","null"]},{"name":"appName","type":["string","null"]},{"name":"appVersion","type":["string","null"]},{"name":"appId","type":["string","null"]},{"name":"screenName","type":["string","null"]},{"name":"landingScreenName","type":["string","null"]},{"name":"exitScreenName","type":["string","null"]},{"name":"screenDepth","type":["string","null"]}]},"null"]},{"name":"exceptionInfo","type":[{"type":"record","name":"exceptionInfo","fields":[{"name":"description","type":["string","null"]},{"name":"isFatal","type":["boolean","null"]}]},"null"]},{"name":"eventInfo","type":[{"type":"record","name":"eventInfo","fields":[{"name":"eventCategory","type":["string","null"]},{"name":"eventAction","type":["string","null"]},{"name":"eventLabel","type":["string","null"]},{"name":"eventValue","type":["long","null"]}]},"null"]},{"name":"product","type":{"type":"array","items":{"type":"record","name":"product","fields":[{"name":"productSKU","type":["string","null"]},{"name":"v2ProductName","type":["string","null"]},{"name":"v2ProductCategory","type":["string","null"]},{"name":"productVariant","type":["string","null"]},{"name":"productBrand","type":["string","null"]},{"name":"productRevenue","type":["long","null"]},{"name":"localProductRevenue","type":["long","null"]},{"name":"productPrice","type":["long","null"]},{"name":"localProductPrice","type":["long","null"]},{"name":"productQuantity","type":["long","null"]},{"name":"productRefundAmount","type":["long","null"]},{"name":"localProductRefundAmount","type":["long","null"]},{"name":"isImpression","type":["boolean","null"]},{"name":"customDimensions","type":{"type":"array","items":"customDimensions"}},{"name":"customMetrics","type":{"type":"array","items":{"type":"record","name":"customMetrics","fields":[{"name":"index","type":["long","null"]},{"name":"value","type":["long","null"]}]}}}]}}},{"name":"promotion","type":{"type":"array","items":{"type":"record","name":"promotion","fields":[{"name":"promoId","type":["string","null"]},{"name":"promoName","type":["string","null"]},{"name":"promoCreative","type":["string","null"]},{"name":"promoPosition","type":["string","null"]}]}}},{"name":"promotionActionInfo","type":[{"type":"record","name":"promotionActionInfo","fields":[{"name":"promoIsView","type":["boolean","null"]},{"name":"promoIsClick","type":["boolean","null"]}]},"null"]},{"name":"refund","type":[{"type":"record","name":"refund","fields":[{"name":"refundAmount","type":["long","null"]},{"name":"localRefundAmount","type":["long","null"]}]},"null"]},{"name":"eCommerceAction","type":[{"type":"record","name":"eCommerceAction","fields":[{"name":"action_type","type":["string","null"]},{"name":"step","type":["long","null"]},{"name":"option","type":["string","null"]}]},"null"]},{"name":"experiment","type":{"type":"array","items":{"type":"record","name":"experiment","fields":[{"name":"experimentId","type":["string","null"]},{"name":"combination","type":["string","null"]}]}}},{"name":"customVariables","type":{"type":"array","items":{"type":"record","name":"customVariables","fields":[{"name":"index","type":["long","null"]},{"name":"customVarName","type":["string","null"]},{"name":"customVarValue","type":["string","null"]}]}}},{"name":"customDimensions","type":{"type":"array","items":"customDimensions"}},{"name":"customMetrics","type":{"type":"array","items":"customMetrics"}},{"name":"type","type":["string","null"]},{"name":"social","type":[{"type":"record","name":"social","fields":[{"name":"socialInteractionNetwork","type":["string","null"]},{"name":"socialInteractionAction","type":["string","null"]}]},"null"]}]}}},{"name":"fullVisitorId","type":["string","null"]},{"name":"userId","type":["string","null"]}]}

Here's what comes back with DESCRIBE:

visitorid               bigint                  from deserializer   
visitnumber             bigint                  from deserializer   
visitid                 bigint                  from deserializer   
visitstarttime          bigint                  from deserializer   
date                    string                  from deserializer   
totals                  struct<visits:bigint,hits:bigint,pageviews:bigint,timeonsite:bigint,bounces:bigint,transactions:bigint,transactionrevenue:bigint,newvisits:bigint,screenviews:bigint,uniquescreenviews:bigint,timeonscreen:bigint,totaltransactionrevenue:bigint>   from deserializer   
trafficsource           struct<referralpath:string,campaign:string,source:string,medium:string,keyword:string,adcontent:string,adwordsclickinfo:struct<campaignid:bigint,adgroupid:bigint,creativeid:bigint,criteriaid:bigint,page:bigint,slot:string,criteriaparameters:string,gclid:string,customerid:bigint,adnetworktype:string,targetingcriteria:struct<boomuserlistid:bigint>>>   from deserializer   
device                  struct<browser:string,browserversion:string,operatingsystem:string,operatingsystemversion:string,ismobile:boolean,mobiledevicebranding:string,flashversion:string,javaenabled:boolean,language:string,screencolors:string,screenresolution:string,devicecategory:string>    from deserializer   
geonetwork              struct<continent:string,subcontinent:string,country:string,region:string,metro:string>  from deserializer   
customdimensions        array<struct<index:bigint,value:string>>    from deserializer   
hits                    array<struct<hitnumber:bigint,time:bigint,hour:bigint,minute:bigint,issecure:boolean,isinteraction:boolean,isentrance:boolean,isexit:boolean,referer:string,page:struct<pagepath:string,hostname:string,pagetitle:string,searchkeyword:string,searchcategory:string>,transaction:struct<transactionid:string,transactionrevenue:bigint,transactiontax:bigint,transactionshipping:bigint,affiliation:string,currencycode:string,localtransactionrevenue:bigint,localtransactiontax:bigint,localtransactionshipping:bigint,transactioncoupon:string>,item:struct<transactionid:string,productname:string,productcategory:string,productsku:string,itemquantity:bigint,itemrevenue:bigint,currencycode:string,localitemrevenue:bigint>,contentinfo:struct<contentdescription:string>,appinfo:struct<name:string,version:string,id:string,installerid:string,appinstallerid:string,appname:string,appversion:string,appid:string,screenname:string,landingscreenname:string,exitscreenname:string,screendepth:string>,exceptioninfo:struct<description:string,isfatal:boolean>,eventinfo:struct<eventcategory:string,eventaction:string,eventlabel:string,eventvalue:bigint>,product:array<struct<productsku:string,v2productname:string,v2productcategory:string,productvariant:string,productbrand:string,productrevenue:bigint,localproductrevenue:bigint,productprice:bigint,localproductprice:bigint,productquantity:bigint,productrefundamount:bigint,localproductrefundamount:bigint,isimpression:boolean,customdimensions:array<struct<index:bigint,value:string>>,custommetrics:array<struct<index:bigint,value:bigint>>>>,promotion:array<struct<promoid:string,promoname:string,promocreative:string,promoposition:string>>,promotionactioninfo:struct<promoisview:boolean,promoisclick:boolean>,refund:struct<refundamount:bigint,localrefundamount:bigint>,ecommerceaction:struct<action_type:string,step:bigint,option:string>,experiment:array<struct<experimentid:string,combination:string>>,customvariables:array<struct<index:bigint,customvarname:string,customvarvalue:string>>,customdimensions:array<struct<index:bigint,value:string>>,custommetrics:array<struct<index:bigint,value:bigint>>,type:string,social:struct<socialinteractionnetwork:string,socialinteractionaction:string>>>   from deserializer   
fullvisitorid           string                  from deserializer   
userid                  string                  from deserializer   

Here's the error. (I can post more of the log if desired. There are no further details after "15 more", but you can see what happens before that.)

Caused by: java.lang.IllegalArgumentException
    at java.nio.ByteBuffer.allocate(ByteBuffer.java:330)
    at org.apache.avro.io.BinaryDecoder.readBytes(BinaryDecoder.java:288)
    at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:112)
    at org.apache.avro.file.DataFileReader.<init>(DataFileReader.java:97)
    at org.apache.hadoop.hive.ql.io.avro.AvroGenericRecordReader.<init>(AvroGenericRecordReader.java:81)
    at org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat.getRecordReader(AvroContainerInputFormat.java:51)
    at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:498)
    at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:588)
    ... 15 more

Solution

OK -- this issue is resolved.

The issue was that when I downloaded the file from Google Cloud Storage using the Python client, I wrote it to a file in text mode (the default) when I needed to use binary mode.
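
For anyone wondering why a text-mode write shows up as this particular exception, here is one plausible mechanism, offered as a sketch and not a confirmed diagnosis of my file. Avro stores lengths as zig-zag varints, and the stack trace ends in ByteBuffer.allocate, which throws IllegalArgumentException when asked for a negative capacity. If text mode turned a "\n" byte into "\r\n" (as Python 2 does on Windows), a length prefix can decode to a negative number. The helper name read_zigzag_long and the sample bytes are made up for illustration.

```python
def read_zigzag_long(buf, pos=0):
    """Decode one Avro-style long (zig-zag varint) from buf, starting at pos."""
    shift = 0
    raw = 0
    while True:
        byte = buf[pos]
        pos += 1
        raw |= (byte & 0x7F) << shift   # low 7 bits carry the payload
        if not byte & 0x80:             # high bit clear: last byte of the varint
            break
        shift += 7
    return (raw >> 1) ^ -(raw & 1), pos  # undo the zig-zag mapping


# Hypothetical sample: a length prefix of 5 (encoded as 0x0A) plus 5 payload bytes.
clean = bytearray(b"\x0ahello")
# Simulate the newline translation a Windows text-mode write would apply.
mangled = bytearray(bytes(clean).replace(b"\n", b"\r\n"))

clean_len, _ = read_zigzag_long(clean)
mangled_len, _ = read_zigzag_long(mangled)
print(clean_len, mangled_len)  # prints: 5 -7
```

The length 5 happens to encode as the newline byte 0x0A, so after translation the reader sees 0x0D first and decodes -7. Any length or count in the file header that contains such a byte could be hit the same way.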

I re-downloaded it, re-uploaded it, and it worked.
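
A minimal sketch of the fix, assuming the download arrives as an iterable of byte chunks. The names save_binary, md5_hex and starts_with_avro_magic are made up for this example, and chunks stands in for whatever the storage client actually returns; this is not the client's API. The one fact relied on is that an Avro object container file starts with the four bytes "Obj" followed by 0x01.

```python
import hashlib
import os
import tempfile

AVRO_MAGIC = b"Obj\x01"  # first four bytes of an Avro object container file


def save_binary(chunks, path):
    """Write byte chunks to path in binary mode so no newline translation occurs."""
    with open(path, "wb") as out:  # "wb", not the default text mode
        for chunk in chunks:
            out.write(chunk)


def md5_hex(path, block_size=1 << 20):
    """Return the hex MD5 of a file, read in binary mode block by block."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def starts_with_avro_magic(path):
    """Cheap sanity check: does the file begin with the Avro container magic?"""
    with open(path, "rb") as f:
        return f.read(4) == AVRO_MAGIC


# Demo with made-up bytes (not a real Avro file), split into two chunks.
payload = AVRO_MAGIC + b"\x0a\r\n\x00 made-up bytes"
local_path = os.path.join(tempfile.mkdtemp(), "sample.avro")
save_binary([payload[:6], payload[6:]], local_path)

print(starts_with_avro_magic(local_path))
print(md5_hex(local_path) == hashlib.md5(payload).hexdigest())
```

The magic check alone would not have caught my problem, since the damage can sit further into the header. Comparing the local file's size or checksum against the source object before uploading is the stronger test.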
