Machine dependent _write failures with EINVAL error code


Problem description

There is some lengthy background before the actual question, but it is worth explaining in order to weed out some red herrings.

Our application, developed in Microsoft Visual C++ (2005), uses a third-party library (whose source code we luckily happen to have) to export a compressed file used in another third-party application. The library is responsible for creating the exported file, managing the data and compression, and generally handling all errors. Recently, we began getting feedback that on certain machines our application would crash while writing to the file. Based on some initial exploration, we were able to determine the following:


  • The crashes happened on a variety of hardware setups and operating systems (although our customers are restricted to XP / 2000)
  • The crashes always happened on the same sets of data; however, they did not occur on all sets of data
  • For a set of data that caused a crash, the crash was not reproducible on all machines, even ones with similar characteristics (operating system, amount of RAM, etc.)
  • The bug only manifested itself when the application was run from the installation directory: not when built from Visual Studio, run in debug mode, or even run from other directories that the user had access to
  • The issue occurred whether the file was being constructed on a local or a mapped drive

Upon investigating the problem, we traced the issue to the following block of code (slightly modified to remove some macros):

while (size>0) {
    /* Retry the write if it was interrupted (EINTR) */
    do {
        nbytes = _write(file->fd, buf, size);
    } while (-1==nbytes && EINTR==errno);
    if (-1==nbytes) /* error */
        throw("file write failed");
    assert(nbytes>0);
    assert((size_t)nbytes<=size);
    /* Advance past the bytes that were written */
    size -= (size_t)nbytes;
    addr += (haddr_t)nbytes;
    buf = (const char*)buf + nbytes;
}

Specifically, _write is returning error code 22, or EINVAL. According to MSDN, _write returning EINVAL implies that the buffer (buf in this case) is a null pointer. However, some simple checks around this function verified that this was not the case in any of the calls made to it.

We do, however, call this method with some very large sets of data: upwards of 250 MB in a single call, depending on the input data. When we imposed an artificial limit on the amount of data passed to this method, the issue appeared to be resolved. This, however, smacks of a code fix for a problem that is machine dependent / permissions dependent / dependent on the phase of the moon. So now the questions:


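  1. Is anyone aware of a limit on the amount of data _write can handle in a single call? Or, barring _write, on any file I/O command supported by Visual C++?

  2. Since this does not occur on all machines, or even on every call of a sufficient size (one call with 250 MB will work, another will not), is anyone aware of user, machine, or group policy settings, or folder permissions, that would affect this?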

UPDATE: A few other points, from the posts so far:


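  • We do handle the case where the large buffer allocation fails. For performance reasons in the third-party application that reads the file we are creating, we want all the data written out in one big block (although given this error, that may not be possible)

  • We have checked the initial value of size in the routine above, and it is the same as the size of the buffer that was allocated. Also, when the EINVAL error code is raised, size is equal to 0 and buf is not a null pointer, which makes me think that this is not the cause of the problem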

Another update:

An example of a failure is below, with some handy printfs added to the code sample above.

while (size>0) {
    if (NULL == buf)
    {
        printf("Buffer is null\n");
    }
    do {
        nbytes = _write(file->fd, buf, size);
    } while (-1==nbytes && EINTR==errno);
    if (-1==nbytes) /* error */
    {
        if (NULL == buf)
        {
            printf("Buffer is null post write\n");
        }
        printf("Error number: %d\n", errno);
        /* Caution: &buf is the address of the pointer variable, not of the
           buffer it points to, and %d is not a pointer format. Kept as-is
           because the output shown below was produced by this line. */
        printf("Buffer address: %d\n", &buf);
        printf("Size: %d\n", size);
        throw("file write failed");
    }
    assert(nbytes>0);
    assert((size_t)nbytes<=size);
    size -= (size_t)nbytes;
    addr += (haddr_t)nbytes;
    buf = (const char*)buf + nbytes;
}

On a failure, this prints:

Error number: 22
Buffer address: 1194824
Size: 89702400

Note that no bytes were successfully written and that the buffer has a valid address (no NULL pointer checks were triggered, before or after the _write call).
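For anyone adding similar diagnostics, here is a minimal sketch of how the same three values could be reported with matching format specifiers. The helper name `format_write_failure` is hypothetical and not part of the original code, and the sketch assumes a C runtime that provides `snprintf` (older Visual C++ versions spell it `_snprintf`).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not from the original post): formats the state of a
 * failed write into `out`. Returns what snprintf returns, i.e. the number of
 * characters the full message needs. */
int format_write_failure(char *out, size_t cap,
                         int err, const void *buf, size_t size)
{
    assert(out != NULL && cap > 0);
    /* %p takes the buffer pointer itself (as void *), not &buf.
     * size is cast to unsigned long because older MSVC runtimes lack %zu;
     * this assumes the value fits in an unsigned long. */
    return snprintf(out, cap, "errno=%d buf=%p size=%lu",
                    err, (void *)buf, (unsigned long)size);
}
```

Called with `errno`, `buf` and `size` at the point of failure, this reports the address of the data being written rather than the address of the local pointer variable.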

Final update

Unfortunately, we were overcome by events and were not able to conclusively solve this. We did find some interesting (and maybe even disturbing) facts:

  1. The errors only occurred on machines with slower write times on their hard disks. Two PCs with the exact same hardware specs but different RAID configurations (RAID 0 versus RAID 1) would give different results: the RAID 0 machine would process the data correctly; the RAID 1 machine would fail. Similarly, older PCs with slower hard drives would fail, while newer PCs with faster hard drives (but similar processors and memory) would work.

  2. The write size mattered. When we limited the amount of data passed to _write to 64 MB, all but one file succeeded. When we limited it to 32 MB, all the files succeeded. We took a performance hit in the library we were using (a limitation of that library, independent of _write or the problem we were seeing), but it was our only "software" fix.
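To put the size limits above in perspective, the number of _write calls a capped write needs is a ceiling division. The sketch below is only an illustration: `chunks_needed` is a hypothetical helper, and it assumes "MB" means 2^20 bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of write calls needed to move `total` bytes
 * when each call is capped at `limit` bytes (ceiling division). */
size_t chunks_needed(size_t total, size_t limit)
{
    assert(limit > 0);
    return total / limit + (total % limit != 0);
}
```

Under that assumption, the 89,702,400-byte write from the failure log takes 3 calls with a 32 MB cap (33,554,432 bytes) and 2 calls with a 64 MB cap, while a 250 MB buffer takes 8 calls at 32 MB.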

Unfortunately, I never got a good answer as to why EINVAL was being returned in the first place (we were about to call Microsoft about it, but we first had to get the business to sign off on the expense of a tech support call). As far as we were able to find, this behavior is not documented anywhere in the C library API.

If anyone does find a good answer for this, please post it here and I'll mark it as the answer. I'd love to get a conclusion to this saga, even if it no longer directly applies to me.

Recommended answer

We had a very similar problem, which we managed to reproduce quite easily. We first compiled the following program:

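#include <stdlib.h>
#include <stdio.h>
#include <io.h>
#include <sys/stat.h>
#include <fcntl.h>

int main(int argc, char *argv[])
{
  int len = 70000000;  /* size of the single write, in bytes */
  int handle, byteswritten;
  void *buf;

  if (argc < 2) {
    printf("Usage: %s <output file>\n", argv[0]);
    return 1;
  }
  handle = _creat(argv[1], _S_IWRITE | _S_IREAD);
  if (handle == -1) {
    perror("creat");
    return 1;
  }
  _setmode(handle, _O_BINARY);
  buf = malloc(len);  /* contents do not matter, only the size */
  if (buf == NULL) {
    printf("Allocation failed.\n");
    _close(handle);
    return 1;
  }
  byteswritten = _write(handle, buf, len);
  if (byteswritten == len)
    printf("Write successful.\n");
  else
    printf("Write failed.\n");
  free(buf);
  _close(handle);
  return 0;
}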

Now, let's say you are working on the computer mycomputer and that C:\inbox maps to a shared folder \\mycomputer\inbox. You will then observe the following effect:

C:\>a.exe C:\inbox\x
Write successful.

C:\>a.exe \\mycomputer\inbox\x
Write failed.

Note that if len is changed to 60000000, there is no problem...

Based on the web page support.microsoft.com/kb/899149, we think it is a "limitation of the operating system" (the same effect has been observed with fwrite). Our workaround is to retry the write in 63 MB pieces if it fails. This problem has apparently been corrected in Windows Vista.

I hope this helps! Simon
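The workaround described in the answer (attempt the full write, then fall back to pieces of at most 63 MB) might look roughly like the sketch below. It is not the answerer's actual code. The `write_fn` callback, `write_with_fallback`, and the `fake_sink` stand-in are all hypothetical names introduced here so the logic can be exercised without the affected hardware. The sketch also assumes that a failed call writes nothing, as the failure log in the question suggests.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: writes up to `len` bytes and returns the number
 * written, or -1 on error (the same contract as _write). On Windows it could
 * wrap _write(fd, buf, (unsigned int)len). */
typedef long (*write_fn)(void *ctx, const void *buf, size_t len);

#define PIECE_LIMIT ((size_t)63 * 1024 * 1024) /* 63 MB, as in the answer */

/* Try one large write first; if that call fails outright, retry the data in
 * pieces of at most `limit` bytes. Returns 0 on success, -1 on failure.
 * Assumption: a call that returns -1 has written nothing. */
int write_with_fallback(write_fn wr, void *ctx,
                        const void *buf, size_t size, size_t limit)
{
    const char *p = (const char *)buf;
    long n;

    assert(wr != NULL && limit > 0);

    n = wr(ctx, p, size);
    if (n >= 0) {               /* full or partial success: skip what was written */
        p += n;
        size -= (size_t)n;
    }
    while (size > 0) {          /* fallback, or remainder of a partial write */
        size_t piece = size < limit ? size : limit;
        n = wr(ctx, p, piece);
        if (n <= 0)
            return -1;          /* a piece failed or made no progress: give up */
        p += n;
        size -= (size_t)n;
    }
    return 0;
}

/* Demonstration stand-in for the real writer: rejects any single call larger
 * than `max_ok` bytes (mimicking the reported failure) and counts bytes. */
struct fake_sink { size_t max_ok; size_t total; int calls; };

long fake_write(void *ctx, const void *buf, size_t len)
{
    struct fake_sink *s = (struct fake_sink *)ctx;
    (void)buf;
    s->calls++;
    if (len > s->max_ok)
        return -1;
    s->total += len;
    return (long)len;
}
```

With a real file descriptor, the callback would forward to _write and `limit` would be `PIECE_LIMIT`. Trying the large write first keeps the single-block behavior on machines where it works, which matters here because the question notes that the consuming application benefits from large contiguous writes.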
