Creating a Hive table with ~40K columns


Question

I'm trying to create a fairly large table in Hive: ~3 million rows and ~40K columns. To begin, I'm creating an empty table and then inserting the data into it.

However, I hit an error when trying this:

Unable to acquire IMPLICIT, SHARED lock default after 100 attempts. FAILED: Error in acquiring locks: Locks on the underlying objects cannot be acquire. retry after some time

The query is pretty straightforward:

create external table database.dataset (
var1 decimal(10,2),
var2 decimal(10,2),
...
var40000 decimal(10,2)
) location 'hdfs://nameservice1/root/user1/project1';
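
A statement of this size is normally generated rather than typed. Below is a minimal Python sketch of how the DDL above could be produced, assuming the columns really are named var1 through var40000 and all share the type decimal(10,2). The helper name build_ddl is hypothetical and not part of Hive.

```python
def build_ddl(table, n_cols, location, col_type="decimal(10,2)"):
    """Return a CREATE EXTERNAL TABLE statement with columns var1..var<n_cols>.

    Assumption: every column has the same type and follows the var<i> naming
    pattern used in the question.
    """
    cols = ",\n".join(f"var{i} {col_type}" for i in range(1, n_cols + 1))
    return f"create external table {table} (\n{cols}\n) location '{location}';"


ddl = build_ddl(
    "database.dataset",
    40000,
    "hdfs://nameservice1/root/user1/project1",
)

# The size of the statement is worth knowing when chasing length limits.
print(f"statement length: {len(ddl)} characters")
```

Calling build_ddl with a smaller n_cols gives a quick way to find the column count at which the statement starts to fail.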

Has anybody seen this error before? Cloudera says there is no limit on the number of columns, but I'm clearly hitting some system limitation here.

Additionally, I can create a smaller Hive table in the specified location.

Solution

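I ran across this blog post, which appears to identify and fix the problem: http://gbif.blogspot.com/2014/03/lots-of-columns-with-hive-and-hbase.html

Short answer: according to the post, there is a limit on the number of characters Hive will store for a parameter value in its metastore, and a very wide table definition can exceed it. You can lift that limit by changing the column type in the metastore database. Note that the statement below uses PostgreSQL syntax and is run against the metastore database, not in Hive itself: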

alter table "SERDE_PARAMS" alter column "PARAM_VALUE" type text;
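
To get a feel for why a character limit matters at this scale, here is a rough Python sketch that measures how long a comma-separated list of 40,000 column names becomes. The 4000-character figure is an assumption used for illustration only; check the actual definition of the PARAM_VALUE column in your own metastore schema. Whether such a list is stored in that column for a plain external table like the one in the question is also unverified.

```python
# Assumed limit, for illustration only. Check the PARAM_VALUE column
# definition in your own metastore before relying on this number.
ASSUMED_LIMIT = 4000


def joined_length(n_cols, prefix="var"):
    """Length of the comma-separated name list prefix1,prefix2,...,prefix<n_cols>."""
    return len(",".join(f"{prefix}{i}" for i in range(1, n_cols + 1)))


length = joined_length(40000)
if length > ASSUMED_LIMIT:
    print(f"{length} characters exceeds the assumed limit of {ASSUMED_LIMIT}")
else:
    print(f"{length} characters fits within the assumed limit of {ASSUMED_LIMIT}")
```

If the limit does apply, any value derived from the full column list would overflow it by a wide margin, which is consistent with smaller tables working while the 40K-column table fails.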

This is untested, as I went with a different tool to handle the data (for the problem above) since Hive was failing for unknown reasons. If you come across something similar, please try this out and post an update.
