Problem With Parsing Huge Char* Buffer

This article describes a problem encountered when parsing a huge char* buffer, which may be a useful reference for anyone facing the same issue.

Problem Description





Hi,
I have some huge CSV files and am trying to read the files, parse them, and write the data into one database.
The CSV format is as below:

EUR/USD	20100103 21:27:59.694	1.43067	1.43097
EUR/USD	20100103 21:27:59.732	1.43075	1.43095
EUR/USD	20100103 21:28:08.152	1.43078	1.43099
EUR/USD	20100103 21:28:16.897	1.43076	1.43102
EUR/USD	20100103 21:28:28.757	1.43071	1.43101
EUR/USD	20100103 21:29:07.659	1.43071	1.43106



And my C++ code is:

void Read(char* fileName){
    FILE * pFile;
    long lSize;
    char * buffer;
    size_t result;

    pFile = fopen (fileName , "rb" );
    if (pFile==NULL) {fputs ("File error",stderr); exit (1);}

    fseek (pFile , 0 , SEEK_END);
    lSize = ftell (pFile);
    cout<<lSize<<"\n";
    rewind (pFile);

    // allocate memory to contain the whole file:
    buffer = (char*) malloc (sizeof(char)*lSize);
    if (buffer == NULL) {fputs ("Memory error",stderr); exit (2);}

    result = fread (buffer,1,lSize,pFile);
    if (result != lSize) {fputs ("Reading error",stderr); exit (3);}

    char * pch;
    pch = strtok (buffer,"\n");

    while (pch != NULL) {
        df<<pch<<"\n";
        Parsing(pch);
        pch = strtok (NULL, "\n");
    }

    free (buffer);
    fclose (pFile);
}

void Parsing(char* tmpp){
    char * pch;
    pch = strtok (tmpp,",");

    char temp[256];
    strcpy(temp, "INSERT INTO dukas  VALUES('" );
    while (pch != NULL) {
        cout<<pch<<"\n";
        pch = strtok (NULL, ",");
    }
}



The problem is that only the first line of the file is read, and it never moves on to the second line.
Are there any bugs in this script?
Regards,

Solution

Why on earth do you want to load the whole file into memory at once, when you tokenize it a few lines below? It is pure nonsense and an extreme waste of resources. Simply read it row by row. There are hundreds of samples on the web. And by the way, that format is not CSV.
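For illustration, here is a minimal sketch of the row-by-row approach suggested above, using std::ifstream and std::getline. The function name ReadLineByLine and the field names are placeholders, and it assumes each line holds five whitespace-separated fields (symbol, date, time, bid, ask) as in the sample data; none of this is from the original answer.

#include <fstream>
#include <iostream>
#include <sstream>
#include <string>

// Sketch only: read the file one row at a time instead of loading it all.
void ReadLineByLine(const char* fileName) {
    std::ifstream in(fileName);
    if (!in) {
        std::cerr << "File error\n";
        return;
    }

    std::string line;
    while (std::getline(in, line)) {           // one row at a time
        std::istringstream fields(line);
        std::string symbol, date, time, bid, ask;
        if (fields >> symbol >> date >> time >> bid >> ask) {
            // Build the INSERT statement (or bind parameters) here,
            // instead of keeping the whole file in memory.
            std::cout << symbol << " " << bid << " " << ask << "\n";
        }
    }
}

With this approach memory usage stays constant regardless of file size, and each row can be parsed and written to the database as soon as it is read.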


