C code: fprintf prints to a file fewer times than expected


Problem description

I ran 10 copies of the same code (changing only some parameters) simultaneously in 10 different folders, each program running on a single core.
In each program I have a for loop. The number of iterations, NumSamples, can be 50000, 500000 or 5000000; it depends on the execution time of a single iteration, with more iterations performed in the quicker cases. In each iteration I compute a certain quantity (a double variable) and then save it to a file with (inside the for block):
fprintf(fp_TreeEv, "%f\n", TreeEv);
where TreeEv is the name of the variable computed at each cycle.
To be sure that the code saves the variable right after the fprintf call, I disabled buffering right after opening the file, with:

fp_TreeEv = fopen("data.txt", "a");
setbuf(fp_TreeEv, NULL);

The program ends without error. I also know that all NumSamples iterations were completed during the execution (I print a variable that is initialized to 0 at the beginning of the for loop and increases by one at each cycle).
When I open the txt file at the end of the execution, I see fewer rows (data) than expected (NumSamples), for example 4996187 instead of 5000000. (I also checked that the csv file is missing the same amount of data with tr -d -c , < filename.csv | wc -c, as suggested by Barmar.)
What could be the source of the problem?

I copy below the for loop of my code (inv_trasf is just a function that generates random numbers):

    char NomeFiletxt [64];
    char NomeFilecsv [64];
    sprintf(NomeFilecsv, "TreeEV%d.csv", N);
    sprintf(NomeFiletxt, "TreeEVCol%d.txt", N);
    FILE* fp_TreeEv;
    fp_TreeEv=fopen(NomeFilecsv, "a");
    FILE* fp_TreeEvCol;
    fp_TreeEvCol=fopen(NomeFiletxt, "a");
    setbuf(fp_TreeEv, NULL);
    setbuf(fp_TreeEvCol, NULL);

    for(ciclo=0; ciclo<NumSamples; ciclo++){

        sum = 0;
        sum_2=0;
        k_max_int=0;
        for(i=0; i<N; i++){
            deg_seq[i]=inv_trasf(k_max_double_round); 
            sum+=deg_seq[i];
            sum_2+=(deg_seq[i]*deg_seq[i]);

            if(deg_seq[i]>k_max_int){
                k_max_int = deg_seq[i];
            }   
        }


        if((sum%2)!=0){
            do{
                v=rand_int(N);
            }while(deg_seq[v]==k_max_int);
            deg_seq[v]++;
            sum++;
            sum_2+=(deg_seq[v]*deg_seq[v]-(deg_seq[v]-1)*(deg_seq[v])-1);

        }
        TreeEV = ((double)sum_2/sum)-1.;
        fprintf(fp_TreeEv, "%f,", TreeEV);
        fprintf(fp_TreeEvCol, "%f\n", TreeEV);



        CycleCount += 1;
    }
    fclose(fp_TreeEv);
    fclose(fp_TreeEvCol);

Could the problem lie in the SSD, which at some point failed to keep up with the programs and could not recover the data (because of the NULL buffer)?
Only the programs with a longer execution time (per single iteration) correctly save all the expected data. From man setbuf: "If the argument buf is NULL, only the mode is affected; a new buffer will be allocated on the next read or write operation."
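
For reference, setbuf(fp, NULL) is shorthand for calling setvbuf with the _IONBF mode. The loop above never checks what fprintf returns, so a failed write would go unnoticed. The following is only a minimal sketch of how such failures could be counted; the function name write_unbuffered and its parameters are hypothetical and not part of the original program.

```c
#include <stdio.h>

/* Hypothetical helper: write n values to a freshly opened stream in
   unbuffered mode and return how many fprintf calls reported an error
   (0 means every call succeeded). */
long write_unbuffered(FILE *fp, const double *values, long n)
{
    /* Same effect as setbuf(fp, NULL): switch the stream to unbuffered mode.
       This must happen before any other operation on the stream. */
    setvbuf(fp, NULL, _IONBF, 0);

    long failed = 0;
    for (long i = 0; i < n; i++) {
        /* fprintf returns a negative value when the write fails */
        if (fprintf(fp, "%f\n", values[i]) < 0)
            failed++;
    }
    return failed;
}
```

A non-zero result, or a non-zero return from fclose, would indicate that some data never reached the file.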

[EDIT]
I noticed that inside one of the "corrupted" txt files there is one invalid value:

I checked that it is the only value with a different number of digits. To do so, I first removed all the "." characters from the txt file with tr -d \. < TreeEVCol102400.txt > test.txt and then sorted the new file with sort -g test.txt > t.dat. After that it is easy to check the top/end of the t.dat file for values with more/fewer digits.
I also checked that it is the only value in the file with at least 2 "." with:
grep -ni '\.*[0-9]*\.[0-9]*\.' TreeEVCol102400.txt
I checked that each corrupted file has just one invalid value of this kind.
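
The same check could also be done from C. This is a rough sketch, not an exact equivalent of the grep pattern above: it counts every line that contains more than one "." character, wherever the dots appear. The function name count_multi_dot_lines is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: count the lines in `path` that contain more than
   one '.' character. Returns -1 if the file cannot be opened. */
long count_multi_dot_lines(const char *path)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    long bad = 0;   /* number of suspicious lines found so far */
    int dots = 0;   /* number of '.' seen on the current line */
    int c;
    while ((c = getc(fp)) != EOF) {
        if (c == '.') {
            dots++;
        } else if (c == '\n') {
            if (dots > 1)
                bad++;
            dots = 0;
        }
    }
    if (dots > 1)   /* last line without a trailing newline */
        bad++;

    fclose(fp);
    return bad;
}
```

Calling count_multi_dot_lines("TreeEVCol102400.txt") would then report how many values in that file look like two numbers merged together.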

Recommended answer

When you set 'unbuffered' mode on a FILE, each character might [1] be written individually to the underlying device/file. In that case, if multiple processes are appending to the same file and two of them try to write a number at the same time, their digits might get interleaved, as your "corrupted" value shows. So setting _IONBF mode on a file is generally not what you want.

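[1] It is not really possible to completely unbuffer a file: there will still be disk buffers in the operating system, and there may still be a (small) stdio buffer to handle certain corner cases. But in general, each character may be written to the file individually.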
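
One possible alternative to _IONBF, assuming every record ends with a newline, is line-buffered mode: stdio then hands over the buffer each time a '\n' is written, so a short line typically reaches the file as a whole rather than character by character. The sketch below is an illustration only; append_values, count_lines and the generated values are hypothetical and not taken from the original program.

```c
#include <stdio.h>

/* Hypothetical helper: append `count` values to `path`, one per line,
   using line buffering. Returns 0 on success, -1 on any I/O error. */
int append_values(const char *path, int count)
{
    FILE *fp = fopen(path, "a");
    if (fp == NULL)
        return -1;

    /* _IOLBF: the buffer is flushed whenever a newline is written */
    if (setvbuf(fp, NULL, _IOLBF, BUFSIZ) != 0) {
        fclose(fp);
        return -1;
    }

    for (int i = 0; i < count; i++) {
        double value = i * 0.5;   /* placeholder for the computed quantity */
        if (fprintf(fp, "%f\n", value) < 0) {
            fclose(fp);
            return -1;
        }
    }
    /* fclose can also fail if the final flush does not succeed */
    return fclose(fp) == 0 ? 0 : -1;
}

/* Hypothetical helper: count the '\n' characters in `path`,
   or return -1 if the file cannot be opened. */
long count_lines(const char *path)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    long lines = 0;
    int c;
    while ((c = getc(fp)) != EOF) {
        if (c == '\n')
            lines++;
    }
    fclose(fp);
    return lines;
}
```

Comparing count_lines against the expected number of iterations after the run gives a quick check that nothing was lost. Calling fflush(fp) after each fprintf on a normally buffered stream would be another way to get a similar effect.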
