How do I access HBase table in Hive & vice-versa?
Question
As a developer, I've created an HBase table for our project by importing data from an existing MySQL table using a sqoop job. The problem is that our data analyst team is familiar with MySQL syntax, which means they can query a HIVE table easily. For them, I need to expose the HBase table in HIVE. I don't want to duplicate data by populating it again in HIVE; duplicated data might also cause consistency issues in the future.
Can I expose the HBase table in HIVE without duplicating data? If yes, how do I do it? Also, if I insert/update/delete data in my HBase table, will the updated data appear in HIVE without any issues?
Sometimes our data analytics team creates tables and populates data in HIVE. Can I expose them to HBase? If yes, how?
HBase-Hive Integration:
Creating an external table in Hive for an HBase table allows the HBase data to be queried in Hive without duplicating it. Because the Hive table is only a mapping onto the HBase table, you can update or delete data in the HBase table and see the modified data in Hive too.
Example:
Suppose you have an HBase table with column families id, name and email.
Sample external table command for Hive:
CREATE EXTERNAL TABLE hivehbasetable(key INT, id INT, username STRING, password STRING, email STRING)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,id:id,name:username,name:password,email:email")
TBLPROPERTIES ("hbase.table.name" = "hbasetable");
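Once the external table exists, analysts can query it like any other Hive table. The query below is only a sketch: it assumes the hivehbasetable definition above has been created, and the filter value is an arbitrary placeholder.

```sql
-- Assumes the external table hivehbasetable from the statement above exists.
-- The id value 42 is a placeholder chosen for illustration.
SELECT key, username, email
FROM hivehbasetable
WHERE id = 42
LIMIT 10;
```

Each query reads from HBase at execution time, so rows inserted, updated, or deleted in hbasetable should be reflected the next time the query runs.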
For more information on Hive-HBase integration, look here
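For the reverse direction (data the analysts created in Hive, exposed to HBase), one possible approach is to create a Hive table that is stored in HBase through the same storage handler and fill it from the existing Hive table. This is a sketch under assumptions: analyst_results, analyst_hbase, analyst_hbase_tbl and the column family cf are hypothetical names, and analyst_results is assumed to have id, username and email columns.

```sql
-- Hypothetical names throughout: analyst_results is an existing Hive table,
-- analyst_hbase is a new Hive table backed by the HBase table analyst_hbase_tbl,
-- and cf is an assumed column family.
CREATE TABLE analyst_hbase(key INT, username STRING, email STRING)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:username,cf:email")
TBLPROPERTIES ("hbase.table.name" = "analyst_hbase_tbl");

-- Copy the analysts' rows into the HBase-backed table.
INSERT OVERWRITE TABLE analyst_hbase
SELECT id, username, email FROM analyst_results;
```

Unlike the external-table mapping, this copies the data into HBase once, so later changes to analyst_results would need another load. Whether such a table may be declared without EXTERNAL can depend on the Hive version, so check the documentation for the release in use.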