very large dictionary

This post discusses how to handle a very large dictionary in Python; the recommended answers below may be a useful reference for anyone facing the same problem.

Problem description


Hello,

I tried to load a 6.8G large dictionary on a server that has 128G of
memory. I got a memory error. I used Python 2.5.2. How can I load my
data?

Simon

Recommended answer

On Fri, 01 Aug 2008 00:46:09 -0700, Simon Strobl wrote:

I tried to load a 6.8G large dictionary on a server that has 128G of
memory. I got a memory error. I used Python 2.5.2. How can I load my
data?




What does "load a dictionary" mean? Was it saved with the `pickle`
module?

How about using a database instead of a dictionary?

Ciao,
Marc 'BlackJack' Rintsch
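As a concrete sketch of the database route BlackJack suggests, the standard-library sqlite3 module can hold the bigram counts on disk and serve single-key lookups; the table layout and names here are illustrative, not from the thread:

```python
import sqlite3

# An on-disk table replaces the in-memory dict; ":memory:" here is only for
# the demo -- with real data you would pass a filename such as "bigrams.db".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bigrams (gram TEXT PRIMARY KEY, count INTEGER)")
conn.executemany(
    "INSERT INTO bigrams VALUES (?, ?)",
    [(", djy", 75), (", djz", 57), (", djzoom", 165)],
)
conn.commit()

# Each lookup touches only one row instead of a multi-gigabyte structure in RAM.
row = conn.execute(
    "SELECT count FROM bigrams WHERE gram = ?", (", djz",)
).fetchone()
print(row[0])  # 57
```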


What does "load a dictionary" mean?

I had a file bigrams.py with a content like below:

bigrams = {
", djy" : 75 ,
", djz" : 57 ,
", djzoom" : 165 ,
", dk" : 28893 ,
", dk.au" : 854 ,
", dk.b." : 3668 ,
....

}

In another file I said:

from bigrams import bigrams
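Importing a module like this forces Python to compile and execute a multi-gigabyte dict literal in one go, which is where the memory error likely comes from. A sketch of one common alternative, assuming the data can be regenerated once: serialize the dict with the standard-library pickle module, so loading it skips the byte-code compiler entirely (file location here is illustrative):

```python
import os
import pickle
import tempfile

bigrams = {", djy": 75, ", djz": 57, ", djzoom": 165}  # stand-in for the real data

# Dump the dict once to a binary file.
path = os.path.join(tempfile.mkdtemp(), "bigrams.pkl")
with open(path, "wb") as f:
    pickle.dump(bigrams, f, protocol=pickle.HIGHEST_PROTOCOL)

# Loading a pickle builds the dict directly, with no giant source literal
# to compile the way `from bigrams import bigrams` requires.
with open(path, "rb") as f:
    loaded = pickle.load(f)

print(loaded[", djz"])  # 57
```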

How about using a database instead of a dictionary?




If there is no other way to do it, I will have to learn how to use
databases in Python. I would prefer to be able to use the same type of
scripts with data of all sizes, though.
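For keeping the same dict-style scripts at any data size, the standard-library shelve module is one option the thread doesn't name: it gives a persistent, dict-like mapping backed by a file. A minimal sketch, with an illustrative file location:

```python
import os
import shelve
import tempfile

# A shelf behaves like a dict but keeps its values on disk, so the same
# dict-style code works whether the data is tiny or huge.
path = os.path.join(tempfile.mkdtemp(), "bigrams_shelf")  # illustrative location

with shelve.open(path) as db:
    db[", djy"] = 75
    db[", djz"] = 57

# Reopen later; values are read from disk only when a key is accessed.
with shelve.open(path) as db:
    count = db[", djz"]

print(count)  # 57
```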


Simon Strobl:

I had a file bigrams.py with a content like below:
bigrams = {
", djy" : 75 ,
", djz" : 57 ,
", djzoom" : 165 ,
", dk" : 28893 ,
", dk.au" : 854 ,
", dk.b." : 3668 ,
...
}
In another file I said:
from bigrams import bigrams




Probably there's a limit on the module size here. You can try to
change your data format on disk, creating a text file like this:
", djy" 75
", djz" 57
", djzoom" 165
....
Then in a module you can create an empty dict and read the lines of the
data with:
for line in somefile:
    part, n = line.rsplit(" ", 1)
    somedict[part.strip('"')] = int(n)

Otherwise you may have to use a BigTable, a DB, etc.
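Bearophile's text-file approach, filled out into a runnable sketch (the StringIO stands in for the real data file, which isn't part of the thread):

```python
import io

# Stand-in for the text file described above; with real data you would use
# open("bigrams.txt") in place of the StringIO.
somefile = io.StringIO('", djy" 75\n", djz" 57\n", djzoom" 165\n')

somedict = {}
for line in somefile:
    part, n = line.rsplit(" ", 1)       # split off the trailing count
    somedict[part.strip('"')] = int(n)  # drop the surrounding quotes

print(somedict[", djz"])  # 57
```

Because the dict is filled line by line, only the finished dict occupies memory; no multi-gigabyte source literal ever has to be compiled.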


If there is no other way to do it, I will have to learn how to use
databases in Python. I would prefer to be able to use the same type of
scripts with data of all sizes, though.




I understand, I don't know if there are documented limits for the
dicts of the 64-bit Python.

Bye,
bearophile


That concludes this post on very large dictionaries. We hope the recommended answers are helpful, and thank you for supporting IT屋!
