HBase Schema Nested Entity


Question

Does anyone have an example of how to create an HBase table with a nested entity?

Example

UserName (string)
SSN (string)
+ Books (collection)

The Books collection would look like this, for example:

Books

isbn
title
etc...

I cannot find a single example of how to create a table like this. I see many people talk about it, and about how it is a best practice in certain scenarios, but I cannot find an example of how to do it anywhere.

Thanks...

Solution

Nested entities aren't an official feature of HBase; the term is just how some people describe one usage pattern. The pattern relies on the fact that "columns" in HBase are really just a big map (a set of key/value pairs), which lets you model a dimension of cardinality inside the row by adding one column per "row" of the nested entity.

Schema-wise, you don't need to do much on the table itself: when you create a table in HBase, you specify only the name and column family (and associated properties), like so (in the hbase shell):

hbase:001:0> create 'UsersWithBooks', 'cf1'

After that, what you put in it, column-wise, is up to you. You could insert values like:

hbase:002:0> put 'UsersWithBooks', 'userid1234', 'cf1:username', 'my username'
hbase:003:0> put 'UsersWithBooks', 'userid1234', 'cf1:ssn', 'my ssn'
hbase:004:0> put 'UsersWithBooks', 'userid1234', 'cf1:book_id_12345', '<isbn>12345</isbn><title>mary had a little lamb</title>'
hbase:005:0> put 'UsersWithBooks', 'userid1234', 'cf1:book_id_67890', '<isbn>67890</isbn><title>the importance of being earnest</title>'
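
To picture what those puts produce, it can help to model the row as a sorted map from column name to value. The sketch below is a plain in-memory Java stand-in and does not use the HBase client API; the class NestedRowModel and its methods are hypothetical names chosen for illustration. In the real Java client, Result.getFamilyMap returns a comparable NavigableMap keyed by column qualifier bytes.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

// In-memory stand-in for one HBase table: row key -> (column name -> value).
// Assumption: this only mirrors the sorted-map data model; it is not HBase code.
final class NestedRowModel {
    private final NavigableMap<String, NavigableMap<String, String>> rows = new TreeMap<>();

    // Analogous to the shell's put: the row is created on its first write.
    void put(String rowKey, String column, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(column, value);
    }

    // Returns the columns of a row sorted by name, or an empty map if the row is absent.
    NavigableMap<String, String> getRow(String rowKey) {
        return rows.getOrDefault(rowKey, new TreeMap<>());
    }

    public static void main(String[] args) {
        NestedRowModel table = new NestedRowModel();
        table.put("userid1234", "cf1:username", "my username");
        table.put("userid1234", "cf1:ssn", "my ssn");
        // One column per nested entity: the book id is part of the column name.
        table.put("userid1234", "cf1:book_id_12345",
                "<isbn>12345</isbn><title>mary had a little lamb</title>");
        table.put("userid1234", "cf1:book_id_67890",
                "<isbn>67890</isbn><title>the importance of being earnest</title>");

        // Columns come back in sorted order, so the book columns sit next to each other.
        table.getRow("userid1234")
             .forEach((column, value) -> System.out.println(column + " = " + value));
    }
}
```

Because the map is sorted by column name, the two book columns are listed before cf1:ssn and cf1:username, which is what makes reading a contiguous range of nested entities cheap.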

The column names are entirely up to you, and there is no hard limit on how many you can have (within reason; see the HBase Reference Guide for more on this). Of course, with this approach you have to do your own legwork to put values in and get them out. You would probably do that with the Java client, in a more sophisticated way than these shell commands, which are just for explanatory purposes. And while you can efficiently scan just a portion of the columns for a given row key (using a column pagination filter), you can't do much with the contents of the cells other than pull them and parse them elsewhere.
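
As a rough illustration of that read path, the sketch below selects the book columns by prefix, pages through them with an offset and a limit, and parses each cell on the client. It runs against a plain in-memory sorted map rather than HBase; NestedBookReader, Book, parse and readBooks are hypothetical names, and the cell format is simply the ad hoc one from the shell example. With the real Java client, the server-side equivalents would be filters such as ColumnPrefixFilter and ColumnPaginationFilter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Assumption: a row is modelled as a sorted map of column name -> cell value.
final class NestedBookReader {
    // Hypothetical value type for one nested entity.
    static final class Book {
        final String isbn;
        final String title;
        Book(String isbn, String title) { this.isbn = isbn; this.title = title; }
    }

    private static final Pattern BOOK_CELL =
            Pattern.compile("<isbn>(.*?)</isbn><title>(.*?)</title>");

    // Parses one cell value; HBase itself treats the value as opaque bytes.
    static Book parse(String cellValue) {
        Matcher m = BOOK_CELL.matcher(cellValue);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unexpected cell format: " + cellValue);
        }
        return new Book(m.group(1), m.group(2));
    }

    // Returns up to `limit` books from columns starting with `prefix`,
    // skipping the first `offset` matching columns.
    static List<Book> readBooks(NavigableMap<String, String> row,
                                String prefix, int offset, int limit) {
        List<Book> books = new ArrayList<>();
        int seen = 0;
        // tailMap starts at the first column >= prefix in sort order.
        for (Map.Entry<String, String> cell : row.tailMap(prefix, true).entrySet()) {
            if (!cell.getKey().startsWith(prefix)) break; // left the prefix range
            if (seen++ < offset) continue;                // still skipping
            if (books.size() == limit) break;             // page is full
            books.add(parse(cell.getValue()));
        }
        return books;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> row = new TreeMap<>();
        row.put("cf1:username", "my username");
        row.put("cf1:book_id_12345", "<isbn>12345</isbn><title>mary had a little lamb</title>");
        row.put("cf1:book_id_67890", "<isbn>67890</isbn><title>the importance of being earnest</title>");
        for (Book b : readBooks(row, "cf1:book_id_", 0, 10)) {
            System.out.println(b.isbn + ": " + b.title);
        }
    }
}
```

Note that all interpretation of the cell happens in parse, on the client. A structured encoding such as JSON or a binary format may be easier to maintain than the tag-style string used here, but that choice is outside what HBase itself cares about.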

Why would you do this? Probably only if you wanted atomicity around all the nested rows for one parent row, since HBase makes writes to a single row atomic. It's not very common; your best bet is probably to start by modeling them as separate tables, and to move to this approach only if you really understand the tradeoffs.
