Hive locks entire database when running select on one table
Problem description
Hive 0.13 takes a SHARED lock on the entire database when running a select statement on any table in that database (I see a node like LOCK-0000000000 as a child of the database node in ZooKeeper). Because the shared lock covers the whole schema, CREATE/DELETE statements on other tables in the database are blocked until the original query finishes and the lock is released.
Does anybody know a way around this? The following link suggests turning concurrency off, but we can't do that: we are replacing the entire table, and we have to make sure that no select statement is accessing the table before we replace its contents.
http://mail-archives.apache.org/mod_mbox/hive-user/201408.mbox/%3C0eba01cfc035$3501e4f0$9f05aed0$@com%3E
use mydatabase;
select count(*) from large_table limit 1; -- this table is very large and hive.support.concurrency=true
In another Hive shell, while the first query is still executing:
use mydatabase;
create table sometable (id string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE ;
The problem is that the "create table" does not execute until the first query (the select) has finished.
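One way to check which locks are actually held while the select is running is Hive's SHOW LOCKS statement, issued from a third session. This is a minimal sketch, assuming the same database and table names as in the reproduction above and that hive.support.concurrency is enabled; the output format depends on the lock manager in use, so none is shown here.

```sql
-- Run in a third Hive session while the select is still executing.
-- Assumes hive.support.concurrency=true, as in the reproduction above.
use mydatabase;

-- Locks held at the database level (this form is available in Hive 0.13.0 and later).
SHOW LOCKS DATABASE mydatabase;

-- Locks held on the table being read; EXTENDED adds detail about the lock holder.
SHOW LOCKS large_table EXTENDED;
```

If the first statement lists a SHARED lock on mydatabase for the duration of the select, that would match the LOCK-0000000000 node seen in ZooKeeper.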
Update: We are using Cloudera's distribution of Hive CDH-5.2.1-1 and we are seeing this issue.
I don't think Hive 0.13 behaves that way. Please check your ResourceManager and make sure you have enough memory available when you run multiple Hive queries.
As you know, a Hive query typically triggers a MapReduce job, and if YARN doesn't have enough resources, the new job waits until the previously running job completes. Please approach your issue from a memory point of view.
All the best!