Is it possible to store graphs in HBase? If so, how do you model the database to support a graph structure?


Question


I have been experimenting with graphs to analyze big data. It's been working great and has been really fun, but I'm wondering what to do as the data gets bigger and bigger.

Let me know if there's any other solution, but I thought of trying HBase because it scales horizontally and I can use Hadoop to run analytics on the graph (most of my code is already written in Java). However, I'm unsure how to structure a graph in a NoSQL database. I know each node can be an entry in the database, but I'm not sure how to model edges and attach properties to nodes and edges (node names, attributes, PageRank, edge weights, etc.).

Seeing how HBase/Hadoop are modeled after Bigtable and MapReduce, I suspect there is a way to do this, but I'm not sure how. Any suggestions?

Also, does what I'm trying to do make sense, or are there better solutions for big-data graphs?

Answer

You can store an adjacency list in HBase/Accumulo in a column-oriented fashion. I'm more familiar with Accumulo (HBase terminology might be slightly different), so you might use a schema similar to:

SrcNode(RowKey) EdgeType(CF):DestNode(CFQ) Edge/Node Properties(Value)

Where CF=ColumnFamily and CFQ=ColumnFamilyQualifier
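To make the layout concrete, here is a minimal sketch in plain Java. It does not use the HBase or Accumulo client APIs; the class `AdjacencyTable` and its methods are hypothetical names, and nested sorted maps stand in for the row key, column family, and qualifier levels of the schema above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** In-memory stand-in for a wide-column table: row key -> column family -> qualifier -> value. */
class AdjacencyTable {
    private final Map<String, Map<String, Map<String, String>>> rows = new TreeMap<>();

    /** Stores one edge as one cell: row = source node, CF = edge type, qualifier = destination node. */
    void putEdge(String src, String edgeType, String dest, String properties) {
        rows.computeIfAbsent(src, k -> new TreeMap<>())
            .computeIfAbsent(edgeType, k -> new TreeMap<>())
            .put(dest, properties);
    }

    /** Returns the destinations of all edges of the given type leaving src, in sorted order. */
    List<String> neighbors(String src, String edgeType) {
        Map<String, String> cells =
            rows.getOrDefault(src, Map.of()).getOrDefault(edgeType, Map.of());
        return new ArrayList<>(cells.keySet());
    }

    /** Returns the stored edge properties, or null if the edge does not exist. */
    String edgeProperties(String src, String edgeType, String dest) {
        return rows.getOrDefault(src, Map.of()).getOrDefault(edgeType, Map.of()).get(dest);
    }
}
```

With this layout, all outgoing edges of a node live in one row, so reading a node's neighbors is a single row lookup. One caveat when porting this to HBase rather than Accumulo: HBase column families are declared when the table is created, so using the edge type as the family only works well with a small, fixed set of edge types.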

You might also store node/vertex properties as separate rows using something like:

Node(RowKey) PropertyType(CF):PropertyValue(CFQ) PropertyValue(Value)

The PropertyValue could be stored in either the CFQ or the Value.
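The two placements behave differently, which a small sketch can show. This is again plain Java with hypothetical names (`NodePropertyTable`, `putInValue`, `putInQualifier`), not a real client API; each row maps a "family:qualifier" column name to a cell value.

```java
import java.util.Map;
import java.util.TreeMap;

/** In-memory stand-in for node property rows: node id -> "family:qualifier" -> cell value. */
class NodePropertyTable {
    private final Map<String, Map<String, String>> rows = new TreeMap<>();

    /** Keeps the property value in the cell value; the qualifier is empty, so a new write replaces the old value. */
    void putInValue(String node, String propertyType, String value) {
        rows.computeIfAbsent(node, k -> new TreeMap<>()).put(propertyType + ":", value);
    }

    /** Keeps the property value in the qualifier; the cell value is empty, so one type can hold several values. */
    void putInQualifier(String node, String propertyType, String value) {
        rows.computeIfAbsent(node, k -> new TreeMap<>()).put(propertyType + ":" + value, "");
    }

    /** Returns all columns stored for the node, sorted by column name. */
    Map<String, String> row(String node) {
        return rows.getOrDefault(node, Map.of());
    }
}
```

Putting the value in the Value suits single-valued properties such as a PageRank score, while putting it in the CFQ suits multi-valued properties such as tags, because each distinct value becomes its own column.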

From a graph-processing perspective, as mentioned by @Arnon Rotem-Gal-Oz, you could look at Apache Giraph, which is an implementation of Google's Pregel. Pregel is the method Google uses for large graph processing.
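To give a feel for the vertex-centric model that Pregel describes, here is a rough single-machine sketch of PageRank written as supersteps. It is not the Giraph API; the class `PregelStylePageRank` and its `run` method are hypothetical, and the sketch assumes every vertex appears as a key in the adjacency map.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PregelStylePageRank {
    /** Runs a fixed number of supersteps over an adjacency list (vertex -> outgoing neighbors). */
    static Map<String, Double> run(Map<String, List<String>> outEdges, int supersteps, double damping) {
        int n = outEdges.size();
        Map<String, Double> rank = new HashMap<>();
        for (String v : outEdges.keySet()) {
            rank.put(v, 1.0 / n); // every vertex starts with an equal share
        }
        for (int step = 0; step < supersteps; step++) {
            // Message phase: each vertex sends rank / outDegree along every outgoing edge.
            Map<String, Double> inbox = new HashMap<>();
            for (Map.Entry<String, List<String>> e : outEdges.entrySet()) {
                List<String> targets = e.getValue();
                if (targets.isEmpty()) {
                    continue; // simplification: vertices without outgoing edges send nothing
                }
                double share = rank.get(e.getKey()) / targets.size();
                for (String t : targets) {
                    inbox.merge(t, share, Double::sum);
                }
            }
            // Compute phase: each vertex updates its rank from the messages it received.
            Map<String, Double> next = new HashMap<>();
            for (String v : outEdges.keySet()) {
                next.put(v, (1 - damping) / n + damping * inbox.getOrDefault(v, 0.0));
            }
            rank = next;
        }
        return rank;
    }
}
```

The adjacency map here has the same shape as one row per source node in the schema above, which is why an adjacency-list layout maps naturally onto this style of processing.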

Support for using HBase/Accumulo as input to Giraph was submitted recently (7 Mar 2012) as a new feature request to Giraph: HBase/Accumulo Input and Output formats (GIRAPH-153).
