How to store the data of a large map<string, int>?


Problem description

Hi, all.
I have a dictionary-like file which has the following format:
first 4
column 7
is 9
a 23
word 134
....

Every line has two columns. The first column is always an English
word, and the second is an integer number. The file is a text file,
and contains tens of thousands of lines. My program has to read this
file, and I use the container map<string, int> to store the data.
Every time my program runs, the file is read. But since the file is
too large, the speed is very slow. Is there any other efficient way to
organize the data of the file to make it fast to read?
Any help is appreciated. Thank you.

Recommended answer

How are you reading the data and where is your current bottleneck?

--
Ian Collins.
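
One way to answer that question is to time the two halves of the load separately. The sketch below is only an illustration: `LoadTiming` and `time_load` are hypothetical names, and it assumes the file contents have already been read into a string so that disk I/O is excluded from both measurements.

```cpp
#include <chrono>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

// Hypothetical result type: how long parsing alone takes versus
// parsing plus insertion into std::map.
struct LoadTiming {
    double parse_ms;    // stream extraction only
    double total_ms;    // stream extraction plus map insertion
    std::size_t pairs;  // number of "word number" pairs read
};

// Hypothetical helper: parses `text` twice and times each pass.
LoadTiming time_load(const std::string& text) {
    using Clock = std::chrono::steady_clock;
    using Millis = std::chrono::duration<double, std::milli>;
    LoadTiming t{};
    std::string word;
    int key = 0;

    // Pass 1: parse only and discard the values.
    std::istringstream parse_only(text);
    auto start = Clock::now();
    while (parse_only >> word >> key) {
        ++t.pairs;
    }
    t.parse_ms = Millis(Clock::now() - start).count();

    // Pass 2: parse and insert into the map.
    std::istringstream parse_insert(text);
    std::map<std::string, int> key_map;
    start = Clock::now();
    while (parse_insert >> word >> key) {
        key_map[word] = key;
    }
    t.total_ms = Millis(Clock::now() - start).count();
    return t;
}
```

If `parse_ms` is close to `total_ms`, the time is going into text extraction rather than into the container, and changing the map is unlikely to help much.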


I think you have to convert your text file into a binary format, built as an
indexed dictionary file.

You can use a structure like this to serialize your data into the dictionary file:

struct Foo {
    unsigned int word_len;  // length of the word in bytes
    char* word;             // the word's characters (write these, not the pointer value)
    int key;                // the integer associated with the word
};

Then map each Foo object to an integral value, as with hashing, to build an
index file so that you can search it quickly.

You can look at StarDict, an open-source dictionary; it will give you
some hints.


On Aug 6, 11:25 am, Ian Collins <ian-n...@hotmail.com> wrote:

How are you reading the data and where is your current bottleneck?

--
Ian Collins.



I just use the file stream to read the data, simply like this:

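#include <fstream>
#include <map>
#include <string>
using namespace std;

int main()
{
    fstream data_file("data.txt");
    string word;
    int key;
    map<string, int> key_map;
    // Loop on the extraction itself: testing !eof() before reading
    // can run one extra iteration after the final read fails.
    while (data_file >> word >> key)
    {
        key_map[word] = key;
    }
}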

