Writing and reading a dictionary in a specific format (Python)

Question

Sorry, another newbie query :| This builds upon the suggestion given here: optimizing

I need to be able to build a dictionary incrementally, i.e. one key: value pair at a time inside a for loop. To be specific, the dictionary would look something like this (N keys, each value being a list of lists, where each smaller inner list has 3 elements):

dic_score ={key1:[ [,,], [,,], [,,] ...[,,] ], key2:[ [,,], [,,], [,,] ..[,,] ] ..keyN:[[,,], [,,], [,,] ..[,,]]}

This dic is generated by the following pattern, a nested for loop.

dic_score = {}
for Gnodes in G.nodes():          # Gnodes iterates over 10000 values
    Gvalue = someoperation(Gnodes)
    for Hnodes in H.nodes():      # Hnodes iterates over 10000 values
        Hvalue = someoperation(Hnodes)
        score = some_score(Gvalue, Hvalue)   # placeholder for "SomeOperation on (Gvalue, Hvalue)"
        dic_score.setdefault(Gnodes, []).append([Hnodes, score, -1])

I then need to sort these lists, but the answer for that was given here, optimizing (using a generator expression in place of the inner loop is an option).

[Note that the dic would contain 10000 keys, each associated with a list of 10000 of the smaller lists.]

Since the loop counts are big, the generated dictionary is huge and I am running out of memory.

How can I write each key: value pair (a list of lists) to a file as soon as it is generated, so that I don't need to hold the entire dictionary in memory? I then want to be able to read the dictionary back in the same format, i.e. something like dic_score_after_reading[key] should return the list of lists I am looking for.

I am hoping that doing this writing and reading per key: value pair would considerably ease the memory requirements. Is there a better data structure for this? Should I be considering a database, perhaps something like Buzhug, which would give me the flexibility to access and iterate over the lists associated with each key?

I am currently using cPickle to dump the entire dictionary and then reading it back via load(), but cPickle crashes while dumping that much data in one go.

Thanks!

Recommended answer

You could look into using the ZODB in combination with the included BTrees implementation.

What that gives you is a mapping-like structure that writes individual entries separately to the object store. You'd need to use savepoints or plain transactions to flush data out to the storage, but you can handle huge amounts of data this way.
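For readers who want to try the per-key idea before installing anything, here is a minimal sketch using the standard-library shelve module. Note that this swaps in shelve for illustration; it is not the ZODB/BTrees approach recommended above, and it may not scale as well. The names some_score, build_scores and the integer node lists are hypothetical stand-ins for the question's someoperation calls and graph nodes.

```python
import os
import shelve
import tempfile


def some_score(g, h):
    # Hypothetical stand-in for the question's "SomeOperation on (Gvalue, Hvalue)".
    return abs(g - h)


def build_scores(path, g_nodes, h_nodes):
    """Write one key's list of [Hnode, score, -1] to disk at a time."""
    # flag="n" always creates a new, empty shelf at this path.
    with shelve.open(path, flag="n") as shelf:
        for g in g_nodes:
            # Only this key's inner lists are held in memory, sorted by score.
            rows = sorted(
                ([h, some_score(g, h), -1] for h in h_nodes),
                key=lambda row: row[1],
            )
            shelf[str(g)] = rows  # shelve keys must be strings
    # Leaving the with-block closes the shelf and flushes it to disk.


# Small demo sizes; the question uses 10000 x 10000.
path = os.path.join(tempfile.mkdtemp(), "dic_score")
build_scores(path, range(5), range(4))

# Read back by key, much like dic_score_after_reading[key] in the question.
with shelve.open(path, flag="r") as dic_score_after_reading:
    first = dic_score_after_reading["2"]
```

Each value is pickled separately when it is assigned, so memory use is bounded by one key's list rather than the whole dictionary. The trade-off is that keys have to be strings, so non-string node identifiers would need a consistent string encoding.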
