Machine dependent _write failures with EINVAL error code


Problem Description


This has some lengthy background before the actual question, however, it bears some explaining to hopefully weed out some red herrings.

Our application, developed in Microsoft Visual C++ (2005), uses a 3rd party library (whose source code we luckily happen to have) to export a compressed file used in another 3rd party application. The library is responsible for creating the exported file, managing the data and compression, and generally handling all errors. Recently, we began getting feedback that on certain machines, our application would crash during writes to the file. Based on some initial exploration, we were able to determine the following:

  • The crashes happened on a variety of hardware setups and Operating Systems (although our customers are restricted to XP / 2000)
  • The crashes would always happen on the same set of data; however they would not occur on all sets of data
  • For a set of data that caused a crash, the crash is not reproducible on all machines, even with similar characteristics, i.e., operating system, amount of RAM, etc.
  • The bug would only manifest itself when the application was run in the installation directory - not when built from Visual Studio, run in debug mode, or even run in other directories that the user had access to
  • The issue occurs whether the file is being constructed on a local or a mapped drive

Upon investigating the problem, we found the issue to be in the following block of code (slightly modified to remove some macros):

 while (size > 0) {
    do {
        nbytes = _write(file->fd, buf, size);
    } while (-1 == nbytes && EINTR == errno);
    if (-1 == nbytes) /* error */
        throw("file write failed");
    assert(nbytes > 0);
    assert((size_t)nbytes <= size);
    size -= (size_t)nbytes;
    addr += (haddr_t)nbytes;
    buf = (const char *)buf + nbytes;
}

Specifically, the _write is returning error code 22, or EINVAL. According to MSDN, _write returning EINVAL implies that the buffer (buf in this case) is a null pointer. However, some simple checks around this function verified that this was not the case in any of the calls made to it.

We do, however, call this method with some very large sets of data - upwards of 250MB in a single call, depending on the input data. When we imposed an artificial limit on the amount of data that went to this method, we appear to have resolved the issue. This, however, smacks of a code fix for a problem that is machine dependent / permissions dependent / dependent on the phase of the moon. So now the questions:

  1. Is anyone aware of a limit on the amount of data _write can handle in a single call? Or - barring _write - any file I/O command supported by Visual C++?
  2. Since this does not occur on all machines - or even on every sufficiently large call (one call with 250 MB will work, another will not) - is anyone aware of user, machine, or group policy settings, or folder permissions, that would affect this?

UPDATE: A few other points, from the posts so far:

  • We do handle the cases where the large buffer allocation fails. For performance reasons in the 3rd party application that reads the file we're creating, we want to write all the data out in one big block (although given this error, it may not be possible)
  • We have checked the initial value of size in the routine above, and it is the same as the size of the buffer that was allocated. Also, when the EINVAL error code is raised, size is equal to 0, and buf is not a null pointer - which makes me think that this isn't the cause of the problem.

Another Update:

An example of a failure is below, produced by adding some handy printfs to the code sample above.

     while (size > 0) {
        if (NULL == buf)
        {
            printf("Buffer is null\n");
        }
        do {
            nbytes = _write(file->fd, buf, size);
        } while (-1 == nbytes && EINTR == errno);
        if (-1 == nbytes) /* error */
        {
            if (NULL == buf)
            {
                printf("Buffer is null post write\n");
            }
            printf("Error number: %d\n", errno);
            /* note: &buf is the address of the local pointer variable, not of the data it points to */
            printf("Buffer address: %d\n", &buf);
            printf("Size: %d\n", size);
            throw("file write failed");
        }
        assert(nbytes > 0);
        assert((size_t)nbytes <= size);
        size -= (size_t)nbytes;
        addr += (haddr_t)nbytes;
        buf = (const char *)buf + nbytes;
    }

On a failure, this will print out:

Error number: 22
Buffer address: 1194824
Size: 89702400

Note that no bytes were successfully written and that the buffer has a valid address (and no NULL pointer checks were triggered, pre or post _write).
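As a sanity check, the reported error number can be mapped back to its message; EINVAL is 22 on both the Microsoft CRT and POSIX systems. This small helper is my own addition, not part of the original diagnostics:

```c
#include <errno.h>
#include <string.h>

/* Map an errno value to its human-readable message; e.g. for EINVAL
 * typical C runtimes report "Invalid argument". */
static const char *describe_errno(int err)
{
    return strerror(err);
}
```

Printing describe_errno(errno) alongside the raw number makes logs from the field considerably easier to triage.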

LAST UPDATE

Unfortunately, we were overcome by events and were not able to conclusively solve this. We were able to find some interesting (and maybe even disturbing) facts:

  1. The errors only occurred on machines with slower write times on their hard disks. Two PCs with the exact same hardware specs, but with different RAID configurations (RAID 0 versus RAID 1), would get different results: the RAID 0 machine would process the data correctly, while the RAID 1 machine would fail. Similarly, older PCs with slower hard drives would also fail; newer PCs with faster hard drives - but similar processors / memory - would work.

  2. The write size mattered. When we limited the amount of data passed to _write to 64 MB, all but one file succeeded. When we limited it to 32 MB, all the files succeeded. We took a performance hit in the library we were using - a limitation of that library, independent of _write or the problem we were seeing - but it was our only "software" fix.
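The 32 MB cap we settled on can be sketched as a wrapper around the original write loop. This is a hypothetical illustration rather than the library's actual code: the write_fn callback, the function name, and the CHUNK_MAX constant are all my own, with the callback standing in for _write so the chunking logic is easy to exercise:

```c
#include <errno.h>
#include <stddef.h>

/* Cap the amount handed to any single underlying write call at 32 MB,
 * the size that succeeded for every file in practice. */
#define CHUNK_MAX (32u * 1024u * 1024u)

/* 'writer' stands in for _write (same fd / buffer / count shape). */
typedef int (*write_fn)(int fd, const void *buf, unsigned int count);

static int write_chunked(write_fn writer, int fd, const void *buf, size_t size)
{
    const char *p = (const char *)buf;
    while (size > 0) {
        unsigned int chunk = size > CHUNK_MAX ? CHUNK_MAX : (unsigned int)size;
        int nbytes;
        do {
            nbytes = writer(fd, p, chunk);
        } while (-1 == nbytes && EINTR == errno);
        if (-1 == nbytes)
            return -1;              /* caller inspects errno */
        p += nbytes;
        size -= (size_t)nbytes;
    }
    return 0;
}
```

With this in place the caller still issues one logical write, but no single underlying call ever sees more than 32 MB.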

Unfortunately, I never got a good answer (and we were about to call Microsoft on this, but we had to get business to sign off on the expense of a tech support call) as to why the EINVAL was being returned in the first place. It isn't - from what we were able to find - documented anywhere in the C library API.

If anyone does find a good answer for this, please post it on here and I'll mark it as the answer. I'd love to get a conclusion for this saga, even if it no longer directly applies to me.

Solution

We had a very similar problem which we managed to reproduce quite easily. We first compiled the following program:

#include <stdlib.h>
#include <stdio.h>
#include <io.h>
#include <sys/stat.h>
#include <fcntl.h>

int main(int argc, char *argv[])
{
    int len = 70000000;
    int handle = creat(argv[1], S_IWRITE | S_IREAD);
    setmode(handle, _O_BINARY);
    void *buf = malloc(len);
    int byteswritten = write(handle, buf, len);
    if (byteswritten == len)
        printf("Write successful.\n");
    else
        printf("Write failed.\n");
    close(handle);
    return 0;
}

Now, let's say you are working on the computer mycomputer and that C:\inbox maps to a shared folder \\mycomputer\inbox. Then observe the following effect:

C:\>a.exe C:\inbox\x
Write successful.

C:\>a.exe \\mycomputer\inbox\x
Write failed.

Note that if len is changed to 60000000, there is no problem...

Based on this web page, support.microsoft.com/kb/899149, we think it is a "limitation of the operating system" (the same effect has been observed with fwrite). Our workaround is to cut the write into 63 MB pieces if it fails. This problem has apparently been corrected in Windows Vista.
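For anyone trying this outside MSVC, the same one-big-write experiment can be sketched with POSIX calls (an assumption on my part; the function name and parameterized length are my additions rather than anything from the original program). On a local filesystem and a modern OS it succeeds even for large blocks:

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Write one zero-filled block of 'len' bytes to 'path' in a single
 * write() call; returns 1 on a full write, 0 otherwise. */
static int write_one_block(const char *path, size_t len)
{
    int ok = 0;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return 0;
    char *buf = calloc(1, len);
    if (buf != NULL) {
        ok = (write(fd, buf, len) == (ssize_t)len);
        free(buf);
    }
    close(fd);
    return ok;
}
```

Pointing the path at a mapped network drive on an affected pre-Vista machine is where the behavior diverges.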

I hope this helps! Simon
