Maximum Number of Columns in Hive External Tables

Question

I'm trying to set up Hive on Amazon's EMR to pull data from a DynamoDB table and dump it to S3. I've followed the instructions found here, and had success with most of our tables. With one DynamoDB table, however, I get an error (shown below).

The table in question has a lot of columns (>100), and cutting the mapping down to only a subset of them allows the script to run, so I'm assuming that this is the problem, but I can't find any documentation around this.

Is there some sort of hard limit on the number of columns I can define? Or is there some other limit that I'm likely to be hitting here? Is there a way to work around this?

The error I'm getting is as follows:

FAILED: Error in metadata: javax.jdo.JDODataStoreException: Put request failed : INSERT INTO `TABLE_PARAMS` (`PARAM_VALUE`,`TBL_ID`,`PARAM_KEY`) VALUES (?,?,?)
NestedThrowables:
org.datanucleus.store.mapped.exceptions.MappedDatastoreException: INSERT INTO `TABLE_PARAMS` (`PARAM_VALUE`,`TBL_ID`,`PARAM_KEY`) VALUES (?,?,?)
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask

The script I'm trying to run looks like:

CREATE EXTERNAL TABLE hive_WSOP_DEV_STATS_input (col1 string, col2 string...)
    STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler'
    TBLPROPERTIES ( "dynamodb.table.name" = "DYNAMO_TABLE_NAME",
        "dynamodb.column.mapping" = "col1:col1,col2:col2...");

Recommended answer

I ran into a similar problem a couple of years ago. If I recall correctly, the issue is that Hive places a limit on the length of the text in the query that it writes into the database. If you look at the call stack, you can probably find out whether that limit is configurable and, if not, where to edit the code.
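
To check whether a given column list is likely to hit such a length limit, the following minimal sketch builds a "dynamodb.column.mapping" value in the same "col1:col1,col2:col2" form as the script above and compares its length to a limit. The limit of 4000 characters is an assumption made for illustration (it is not stated in the question or the answer), and the column names and helper functions are hypothetical; check the actual width of the PARAM_VALUE column in your own metastore before relying on it.

```python
# ASSUMPTION: 4000 is an illustrative limit on the stored property value,
# not a number taken from the question; verify it against your metastore.
ASSUMED_PARAM_VALUE_LIMIT = 4000


def build_column_mapping(columns):
    """Build a dynamodb.column.mapping value that maps each Hive column
    to a DynamoDB attribute of the same name."""
    return ",".join(f"{name}:{name}" for name in columns)


def mapping_fits(columns, limit=ASSUMED_PARAM_VALUE_LIMIT):
    """Return True if the mapping string is no longer than the limit."""
    return len(build_column_mapping(columns)) <= limit


if __name__ == "__main__":
    # Hypothetical column names standing in for a wide DynamoDB table.
    columns = [f"some_long_attribute_name_{i:03d}" for i in range(120)]
    mapping = build_column_mapping(columns)
    print(f"mapping length: {len(mapping)} characters")
    print(f"fits within assumed limit: {mapping_fits(columns)}")
```

If the mapping turns out to be too long, one option consistent with the question's own observation is to split the columns across several external tables, each mapping only a subset of the attributes.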
