Storing multiple graphs in Neo4J


Question



    I have an application that stores relationship information in a MySQL table (contact_id, other_contact_id, strength, recorded_at). This is fine if all I need to do is show a contact's relationships or generate a list of mutual contacts for two contacts.

    But now I need to generate stats like: 'what was the total number of 2-way connections of strength 3 or better in January 2011' or (assuming that each contact is part of a group) 'which group has the most connections to other groups' etc.

    I quickly found that the SQL for generating these stats became unwieldy.

    So I wrote a script that generates a graph in memory for any given date. I could then run whatever stat I wanted against that graph. This is much easier to understand and, in general, much more performant, except for the graph-generation step.

    My next thought was to cache those graphs so I could call on them whenever I needed to run a new stat (or generate a later graph: e.g. for today's graph I take yesterday's graph and apply any changes that happened since yesterday). I tried memcached, which worked great until the graphs grew larger than 1 MB.

    So now I'm thinking about using a graph database like Neo4J.

    Only problem is, I don't have just one graph. Or I do, but it is one that changes over time and I need to be able to query it with different reference times.

    So, can I:

    • store multiple graphs in Neo4J and retrieve/interact with them separately? I would then create and store a separate social graph for each date.

    or

    • add valid-from and valid-to timestamps to each edge and filter the graph appropriately: so if I wanted a graph for "May 1st" I would only follow the newest edge between two nodes that was created before "May 1st" (and if all the edges were created after May 1st then those nodes wouldn't be connected).

    I'm pretty new to graph databases so any help/pointers/hints will be appreciated.

    Solution

    Right now you can store only one graph database in a single Neo4j instance, but that one database can contain as many different subgraphs as you like. You only have to keep that in mind when doing global operations (like index queries), but there you can use compound queries that also include timestamp properties to limit the results.

    One way of doing that is, as you said, adding temporal information to the edges to represent the structure of the graph at a given date; you can then traverse the graph as it was at that time.
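As a rough illustration of the timestamped-edge approach, here is a minimal in-memory Python sketch. It is not Neo4j API code. It assumes each edge record mirrors the MySQL columns from the question (contact_id, other_contact_id, strength, recorded_at); the sample data and the function names `snapshot` and `count_mutual` are hypothetical. For every directed pair it keeps only the newest edge recorded before the reference date, then counts two-way connections at or above a strength threshold.

```python
from datetime import date

# Hypothetical edge records: (contact_id, other_contact_id, strength, recorded_at)
EDGES = [
    (1, 2, 3, date(2011, 1, 5)),
    (2, 1, 4, date(2011, 1, 6)),
    (1, 2, 1, date(2011, 2, 1)),   # later update weakens 1 -> 2
    (2, 3, 5, date(2011, 1, 10)),
    (3, 2, 2, date(2011, 1, 12)),
]

def snapshot(edges, as_of):
    """Return {(src, dst): strength} using the newest edge recorded before as_of."""
    newest = {}
    for src, dst, strength, recorded_at in edges:
        if recorded_at >= as_of:
            continue  # edge did not exist yet at the reference date
        current = newest.get((src, dst))
        if current is None or recorded_at > current[1]:
            newest[(src, dst)] = (strength, recorded_at)
    return {pair: value[0] for pair, value in newest.items()}

def count_mutual(graph, min_strength):
    """Count unordered pairs connected in both directions at min_strength or better."""
    pairs = set()
    for (src, dst), strength in graph.items():
        back = graph.get((dst, src))
        if strength >= min_strength and back is not None and back >= min_strength:
            pairs.add(frozenset((src, dst)))
    return len(pairs)
```

Calling `count_mutual(snapshot(EDGES, date(2011, 2, 1)), 3)` would answer a question like "how many two-way connections of strength 3 or better existed at the end of January 2011" for this toy data. In a graph database the same filter would be applied to relationship properties during traversal rather than to a Python list.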

    Reference node has a different meaning in Neo4j.

    Using category nodes per day (linking them to each other and aggregating them into higher-level timespans) is a more graph-native way of categorizing nodes than indexed properties. (Effectively these are in-graph indices that you can easily include in your traversals and graph queries.)
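The idea of per-day category nodes can be sketched in plain Python as a toy in-memory structure. This is only an illustration of the shape of such an in-graph index, not Neo4j code; the class name `TimeTree` and its methods are hypothetical. Day nodes hold the items recorded on that day, month nodes aggregate their day nodes, and a "next" link lets you step from one day to the following one.

```python
from collections import defaultdict
from datetime import date

class TimeTree:
    """Toy in-graph time index: month nodes -> day nodes -> attached items."""

    def __init__(self):
        self.days = defaultdict(list)    # day node -> items recorded that day
        self.months = defaultdict(set)   # month node (year, month) -> its day nodes

    def attach(self, day, item):
        """Link an item to its day node and the day node to its month node."""
        self.days[day].append(item)
        self.months[(day.year, day.month)].add(day)

    def items_in_month(self, year, month):
        """Walk from a month node through its day nodes, in date order."""
        result = []
        for day in sorted(self.months.get((year, month), ())):
            result.extend(self.days[day])
        return result

    def next_day(self, day):
        """Follow the 'next' link: the earliest day node after the given day."""
        later = [d for d in self.days if d > day]
        return min(later) if later else None
```

A query for "everything recorded in January 2011" then becomes a walk from one month node, instead of a scan over every edge with a date comparison.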

    You don't have to duplicate the nodes as long as you are only interested in different temporal structures. If your nodes also differ over time (e.g. changing properties), you could either duplicate them, effectively creating different subgraphs, or create a connected list of history nodes on each node that contain just the changes (or full snapshots, depending on your requirements).
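A connected list of history nodes that store only the changes might look roughly like the following in-memory Python sketch. The names `HistoryNode`, `Contact`, `record`, and `properties_at` are hypothetical, and the sketch assumes changes are recorded in chronological order. To read a node's properties at a reference date, it walks the chain and replays the changes that were valid by then, oldest first.

```python
from datetime import date

class HistoryNode:
    """One entry in a node's history chain; stores only the properties that changed."""

    def __init__(self, valid_from, changes, previous=None):
        self.valid_from = valid_from
        self.changes = changes
        self.previous = previous  # link to the older history node

class Contact:
    def __init__(self):
        self.latest = None  # head of the history chain (newest first)

    def record(self, valid_from, **changes):
        """Prepend a history node; assumes calls arrive in chronological order."""
        self.latest = HistoryNode(valid_from, changes, self.latest)

    def properties_at(self, as_of):
        """Rebuild the property map as it stood on as_of."""
        chain = []
        node = self.latest
        while node is not None:
            if node.valid_from <= as_of:
                chain.append(node.changes)
            node = node.previous
        props = {}
        for changes in reversed(chain):  # apply oldest changes first
            props.update(changes)
        return props
```

Storing full snapshots instead of deltas would trade storage for simpler reads: `properties_at` would only need to find the newest history node at or before the reference date.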

    Your domain sounds like a very good fit for a graph database. If you have more detailed questions, feel free to join the Neo4j mailing list.
