Huge Graph


Question


Hi,
I have an application that needs to read records from a database and build a graph of users from them.
Each node has some attributes and is connected to other nodes by weighted edges; the weights can change depending on the type of analysis. The number of nodes is extremely large (more than 10 million). I tried representing these graphs as hashes and saving them with Storable, but when I try to load them the system runs out of memory.
I would like to know the best way to represent graphs in this kind of application. Should they be stored as disk files, with each node in a separate file? In that case I would need a good algorithm for distributing the files across a directory hierarchy. Or can they be kept in a single file without using too much memory?
Please enlighten me. Thanks.

Recommended answer

I think the question is not about how to store the data (a single file versus many files) but about how to load it, and how much of it. I am sure all 10 million nodes won't be needed at the same time, so you need some way to determine which nodes are currently useful. Then develop a system that dynamically loads only the necessary nodes. You may also need a way to predict which data is likely to be used next and cache it in advance.

-Saurabh
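
The on-demand loading idea in the answer above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the poster's implementation: it uses Python with SQLite as the on-disk store, and the `GraphStore` class, the `edges` table, and the cache size are all hypothetical names and values chosen for the example. Only the adjacency lists that are actually requested are read from disk, and an LRU cache bounds how many stay in memory.

```python
import sqlite3
from functools import lru_cache


class GraphStore:
    """Hypothetical store: edges live in SQLite, neighbours load on request."""

    def __init__(self, path=":memory:", cache_size=10_000):
        # Pass a file path instead of ":memory:" to keep the graph on disk.
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS edges ("
            "src INTEGER, dst INTEGER, weight REAL, PRIMARY KEY (src, dst))"
        )
        # Bound the number of adjacency lists held in memory at once.
        self.neighbours = lru_cache(maxsize=cache_size)(self._load_neighbours)

    def add_edge(self, src, dst, weight):
        self.db.execute(
            "INSERT OR REPLACE INTO edges VALUES (?, ?, ?)", (src, dst, weight)
        )
        # Simplest way to avoid serving stale cached adjacency lists.
        self.neighbours.cache_clear()

    def _load_neighbours(self, node):
        # Reads only the rows for one node; the rest of the graph stays on disk.
        rows = self.db.execute(
            "SELECT dst, weight FROM edges WHERE src = ? ORDER BY dst", (node,)
        )
        return tuple(rows)


store = GraphStore()
store.add_edge(1, 2, 0.5)
store.add_edge(1, 3, 1.5)
print(store.neighbours(1))  # first call hits the database
print(store.neighbours(1))  # second call is served from the cache
```

Clearing the whole cache on every write is the crudest possible invalidation; if edge weights change often during analysis, evicting only the affected node would be a natural refinement. Prefetching nodes that are likely to be needed next, as the answer suggests, is not shown here.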


I was part of a WPF project a couple of years ago, and we implemented an ad-hoc filter that the user could use to target more specific data. That made the queries take less time, and the returned data was more manageable.

Part of that filter was a setting that dictated how many records to return. If the filter produced more than the specified number of results, a message would pop up telling the user that the query had matched more records than requested (and how many were found), and only the number they had specified would be shown. This prompted them to define a more targeted filter.

For instance, if they specified a single search criterion, such as a last name of "Smith", and set the number of results to 5, the database would invariably find many more than 5 "Smith" records, but only five records would appear in the tree control. The filter stayed on the screen, so they could add another search criterion to narrow the returned results further.
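
A capped query of this kind might look something like the sketch below. It is written in Python against SQLite purely for illustration; the original project was WPF, and the `users` table, its columns, and the `search_capped` function are hypothetical names invented for this example. The function returns at most the requested number of rows together with the total match count, so the caller can tell the user how many records were left out.

```python
import sqlite3


def search_capped(db, last_name, max_results):
    """Return (rows, total): at most max_results rows plus the full match count."""
    # Hypothetical schema: users(id, first_name, last_name).
    total = db.execute(
        "SELECT COUNT(*) FROM users WHERE last_name = ?", (last_name,)
    ).fetchone()[0]
    rows = db.execute(
        "SELECT id, first_name, last_name FROM users "
        "WHERE last_name = ? ORDER BY id LIMIT ?",
        (last_name, max_results),
    ).fetchall()
    return rows, total


# Small in-memory data set: eight "Smith" records and one "Jones".
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)
db.executemany(
    "INSERT INTO users (first_name, last_name) VALUES (?, ?)",
    [(f"User{i}", "Smith") for i in range(8)] + [("Ann", "Jones")],
)

rows, total = search_capped(db, "Smith", 5)
if total > len(rows):
    print(
        f"Found {total} matches; showing the first {len(rows)}. "
        "Add another criterion to narrow the search."
    )
```

Running the count as a separate query keeps the example simple; on a very large table it may be cheaper to fetch one row more than the limit and report only that more results exist.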

