Big graph in memory

Question

I want to record all ports used within huge pcaps. There are 65535 ports available, and each port can talk to every other port: 65535 x 65535 links in total.

The matrix will be very sparse (many 0 entries). Additionally, I don't think the edges need to be directed, so Port1->Port2 can be counted together with Port2->Port1 (which reduces the number of values to 65535 * 65536 / 2). How would you store this using Python? In numpy? What would the estimated memory consumption be?

Afterwards, I want to find the port with the greatest sum and pop() it (the whole row and column). That is, I want to find, e.g., that Port1 was used 500 times (100 times from Port2 to Port1, 300 times from Port3 to Port1, 100 times from Port4 to Port1)...

In graph terms, I want to have 65535 nodes that can be connected to each other. Then I want to find the node with the highest sum of values on its connected edges. Afterwards, I want to pop that node (and delete the corresponding edges, which will decrease the sums of the other nodes).

Thanks!

Recommended answer

In Python, and depending on how sparse the data really is, a dict-of-dicts will handle this quite well.

connections = { ..., 8080: { 4545:17, 20151:3, ...}, ...}
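One way such a structure might be filled is sketched below. It assumes the pcap has already been parsed into (source port, destination port) pairs; the `make_connections` and `record` helpers and the sample `pairs` list are hypothetical names, not part of the original answer. Each undirected edge is stored under both endpoints, which doubles the entries but keeps lookups simple.

```python
from collections import defaultdict


def make_connections():
    # Outer dict maps port -> inner dict; inner dict maps peer port -> count.
    return defaultdict(lambda: defaultdict(int))


def record(connections, a, b):
    """Count one packet between ports a and b, ignoring direction."""
    connections[a][b] += 1
    if a != b:  # avoid double-counting a port talking to itself
        connections[b][a] += 1


# Hypothetical sample data standing in for parsed pcap records.
pairs = [(8080, 4545), (4545, 8080), (8080, 20151)]

connections = make_connections()
for src, dst in pairs:
    record(connections, src, dst)
```

Only port pairs that actually occur take up memory, so the cost grows with the number of distinct edges rather than with 65535 x 65535.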

If I have understood correctly what you are doing, then the count of connections to port p is

count = sum(connections[p].values())
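Building on that sum, finding the port with the greatest total could look like the sketch below. The `busiest_port` function is a hypothetical helper, and the sample data reuses the numbers from the question (Port1 used 500 times). It assumes every edge is stored under both endpoints and that the dict is not empty, since `max` raises `ValueError` on an empty sequence.

```python
def busiest_port(connections):
    """Return (port, total) for the port with the largest edge-weight sum."""
    totals = ((port, sum(peers.values())) for port, peers in connections.items())
    return max(totals, key=lambda item: item[1])


# Sample data from the question: Port1 talks to ports 2, 3 and 4.
connections = {
    1: {2: 100, 3: 300, 4: 100},
    2: {1: 100},
    3: {1: 300},
    4: {1: 100},
}

port, total = busiest_port(connections)
print(port, total)
```

This scans every stored edge once per call, which is usually acceptable for sparse data; caching per-port totals would be a possible refinement if the search runs many times.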

Removing port p is

del connections[p]
for conn in connections.values():
    conn.pop(p, None)  # remove the edge to p if this port has one
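If every edge is stored under both endpoints, the removal only needs to visit the neighbours of p instead of every port. The sketch below illustrates that variant; `pop_port` is a hypothetical helper, and it relies on the symmetric-storage assumption (it raises `KeyError` if a neighbour has no entry of its own).

```python
def pop_port(connections, p):
    """Remove port p and all of its edges; return the total p had."""
    peers = connections.pop(p)
    for q in peers:
        if q != p:
            # Symmetric storage assumed: q's inner dict holds the edge back to p.
            connections[q].pop(p, None)
    return sum(peers.values())


connections = {
    1: {2: 100, 3: 300, 4: 100},
    2: {1: 100, 3: 7},
    3: {1: 300, 2: 7},
    4: {1: 100},
}

removed_total = pop_port(connections, 1)
print(removed_total, connections)
```

After the call, the totals of the remaining ports shrink automatically, because their edges to the removed port are gone.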

If you want to save memory by storing only half of the pairs, simplicity suffers greatly.
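To illustrate the trade-off, here is a minimal sketch of a half-storage layout, assuming each undirected edge is keyed by a sorted `(low, high)` tuple in a single `Counter`. The names `edges`, `record`, and `port_total` are hypothetical. Each edge is stored once, but the total for a port now requires a scan over all edges rather than one inner-dict lookup.

```python
from collections import Counter

# One entry per undirected edge, keyed by (smaller port, larger port).
edges = Counter()


def record(a, b):
    """Count one packet between ports a and b, ignoring direction."""
    edges[(min(a, b), max(a, b))] += 1


def port_total(p):
    """Sum the counts of all edges touching port p (scans every edge)."""
    return sum(n for (a, b), n in edges.items() if p in (a, b))


# Hypothetical sample data standing in for parsed pcap records.
record(8080, 4545)
record(4545, 8080)
record(8080, 20151)

print(port_total(8080))
```

Removing a port likewise means scanning every key for that port, which is the loss of simplicity the answer refers to.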
