Why would redirection work where piping fails?

Problem description

    In theory, these two command-lines should be equivalent:

    1

    type tmp.txt | test.exe
    

    2

    test.exe < tmp.txt
    

    I have a process involving #1 that, for many years, worked just fine; at some point within the last year, we started to compile the program with a newer version of Visual Studio, and it now fails due to malformed input (see below). But #2 succeeds (no exception and we see expected output). Why would #2 succeed where #1 fails?

    I've been able to reduce test.exe to the program below. Our input file has exactly one tab per line and uniformly uses CR/LF line endings. So this program should never write to stderr:

    #include <cstring>   // strchr
    #include <iostream>
    #include <string>
    
    int __cdecl main(int argc, char** argv)
    {
        std::istream* pIs = &std::cin;
        std::string line;
    
        int lines = 0;
        while (!(pIs->eof()))
        {
            if (!std::getline(*pIs, line))
            {
                break;
            }
    
            const char* pLine = line.c_str();
            int tabs = 0;
            while (pLine)
            {
                pLine = strchr(pLine, '\t');
                if (pLine)
                {
                    // move past the tab
                    pLine++;
                    tabs++;
                }
            }
    
            if (tabs > 1)
            {
                std::cerr << "We lost a linebreak after " << lines << " good lines.\n";
                lines = -1;
            }
    
            lines++;
        }
    
        return 0;
    }
    

    When run via #1, I get the following output, with the same numbers every time (in each case, it's because getline has returned two concatenated lines with no intervening linebreak); when run via #2, there's (correctly) no output:

    We lost a linebreak after 8977 good lines.
    We lost a linebreak after 1468 good lines.
    We lost a linebreak after 20985 good lines.
    We lost a linebreak after 6982 good lines.
    We lost a linebreak after 1150 good lines.
    We lost a linebreak after 276 good lines.
    We lost a linebreak after 12076 good lines.
    We lost a linebreak after 2072 good lines.
    We lost a linebreak after 4576 good lines.
    We lost a linebreak after 401 good lines.
    We lost a linebreak after 6428 good lines.
    We lost a linebreak after 7228 good lines.
    We lost a linebreak after 931 good lines.
    We lost a linebreak after 1240 good lines.
    We lost a linebreak after 2432 good lines.
    We lost a linebreak after 553 good lines.
    We lost a linebreak after 6550 good lines.
    We lost a linebreak after 1591 good lines.
    We lost a linebreak after 55 good lines.
    We lost a linebreak after 2428 good lines.
    We lost a linebreak after 1475 good lines.
    We lost a linebreak after 3866 good lines.
    We lost a linebreak after 3000 good lines.
    

    Solution

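    This turns out to be a known issue (https://connect.microsoft.com/VisualStudio/feedback/details/1902345/regression-fread-on-a-pipe-drops-some-newlines):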

    The bug is in fact in the lower-level _read function, which the stdio library functions (including both fread and fgets) use to read from a file descriptor.

    The bug in _read is as follows: If…

    1. you are reading from a text mode pipe,
    2. you call _read to read N bytes,
    3. _read successfully reads N bytes, and
    4. the last byte read is a carriage return (CR) character,

    then the _read function will complete the read successfully but will return N-1 instead of N. The CR or LF character at the end of the result buffer is not counted in the return value.

    In the specific issue reported in this bug, fread calls _read to fill the stream buffer. _read reports that it filled N-1 bytes of the buffer and the final CR or LF character is lost.

    The bug is fundamentally timing-sensitive because whether _read can successfully read N bytes from the pipe depends on how much data has been written to the pipe. Changing the buffer size or changing when the buffer is flushed may reduce the likelihood of the problem, but it won’t necessarily work around the problem in 100% of cases.

    There are several possible workarounds:

    1. use a binary pipe and do text mode CRLF => LF translation manually on the reader side. This is not particularly difficult to do (scan the buffer for CRLF pairs; replace them with a single LF).
    2. call ReadFile with _osfhnd(fh), bypassing the CRT’s I/O library on the reader side entirely (though this would also require manual text mode translation, since the OS won’t do text mode translation for you)

    We have fixed this bug for the next update to the Universal CRT. Note that the Universal CRT is an operating system component and is serviced independently from the Visual C++ libraries. The next update to the Universal CRT will probably be around the same timeframe as the Windows 10 Anniversary Update this summer.
