Piping into SET /P fails due to uninitialised data pointer?


Problem description

Suppose we have a text file sample.txt:

one
two
...

Now we want to remove the first line:

two
...

A quick way to do that is to use input redirection, set /P and findstr (note 1) (I know there are other ways using more or for /F, but let us forget about them for now):

@echo off
< "sample.txt" (
    set /P =""
    findstr "^"
)

The output is going to be as expected.

However, why is the output empty when I replace the input redirection < with type and a pipe | :

@echo off
type "sample.txt" | (
    set /P =""
    findstr "^"
)

When I replace set /P ="" with pause > nul, the output is what I expect: the input file is output, but with the first character of the first line missing (as it is consumed by pause). But why does set /P seem to consume everything instead of only the first line, as it does with the redirection < approach? Is that a bug?

To me it looks like set /P fails to adequately initialise the reading pointer to the piped data.

I observed this strange behaviour on Windows 7 and on Windows 10.


It gets even stranger: when the script containing the pipe is called multiple times, for instance by a loop like for /L %I in (1,1,1000) do @pipe.bat, and the input file contains about fifteen lines or more, sometimes (a few times out of a thousand) a fragment of the input file is returned; that fragment is exactly the same each time, and it seems that there are always 80 bytes missing at the beginning.


1) findstr hangs if the last line is not terminated by a line break, so let us assume that one is present.

Solution

When retrieving data, set /p tries to fill a 1023-character buffer with data from stdin (as much as is available). Once this read operation has ended, the buffer is searched for the first end of line, and once it has been found (or the end of the buffer has been reached), the SetFilePointer API is called to reposition the input stream pointer just after the end of the line that was read. This way, the next read operation starts retrieving data after that line.

This works flawlessly when a disk file is associated with the input stream, but as Microsoft states in the SetFilePointer documentation:

The hFile parameter must refer to a file stored on a seeking device; for example, a disk volume. Calling the SetFilePointer function with a handle to a non-seeking device such as a pipe or a communications device is not supported, even though the SetFilePointer function may not return an error. The behavior of the SetFilePointer function in this case is undefined.

What happens is that, although no error is generated, the call to reposition the read pointer fails when stdin is associated with a pipe: the pointer is not moved back, and the 1023 bytes (or however many bytes were available to read) remain consumed.
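The read-then-seek-back pattern described above can be reproduced outside cmd.exe. The following Python sketch is only a model under stated assumptions, not the code of cmd.exe: the helper names (`read_first_line`, `remaining_after_first_line`, `demo`) are hypothetical, it seeks relative to the current position rather than calling SetFilePointer, and it is meant for a POSIX system, where seeking on a pipe fails with an explicit error (on Windows, per the documentation quoted above, the call may not even report one).

```python
import os
import tempfile

BUF_SIZE = 1023  # buffer size the answer reports for set /P


def read_first_line(fd: int) -> bytes:
    """Read up to BUF_SIZE bytes, keep the first line, then try to hand
    the surplus back by moving the file position backwards."""
    chunk = os.read(fd, BUF_SIZE)
    newline = chunk.find(b"\n")
    end = newline + 1 if newline != -1 else len(chunk)
    surplus = len(chunk) - end
    try:
        os.lseek(fd, -surplus, os.SEEK_CUR)
    except OSError:
        # Not a seekable handle (e.g. a pipe): the surplus bytes stay consumed.
        pass
    return chunk[:end]


def remaining_after_first_line(fd: int) -> bytes:
    """Consume one line as above, then return whatever the next reader gets."""
    read_first_line(fd)
    rest = b""
    while True:
        chunk = os.read(fd, 4096)
        if not chunk:
            return rest
        rest += chunk


def demo():
    data = b"one\r\ntwo\r\nthree\r\n"

    # Case 1: a disk file, which supports seeking.
    with tempfile.TemporaryFile() as f:
        f.write(data)
        f.flush()
        os.lseek(f.fileno(), 0, os.SEEK_SET)
        from_file = remaining_after_first_line(f.fileno())

    # Case 2: a pipe, which does not support seeking.
    read_end, write_end = os.pipe()
    os.write(write_end, data)
    os.close(write_end)
    from_pipe = remaining_after_first_line(read_end)
    os.close(read_end)

    return from_file, from_pipe


if __name__ == "__main__":
    from_file, from_pipe = demo()
    print("left for the next reader (file):", from_file)
    print("left for the next reader (pipe):", from_pipe)
```

With the disk file, the second reader still sees every line after the first; with the pipe, the whole input fits into the first 1023-byte read, so nothing is left for the second reader, which mirrors the empty findstr output in the question.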

Edited in response to Aacini's request

The set command is processed by the eSet function, which calls SetWork to determine which type of set command will be executed.

As it is a set /p, the SetPromptUser function is called, and from this function the ReadBufFromInput function is called:

add     esp, 0Ch
lea     eax, [ebp+var_80C]
push    eax             ; int
push    3FFh            ; int
lea     eax, [ebp+Value]
push    eax             ; int
xor     esi, esi
push    0FFFFFFF6h      ; nStdHandle
mov     word ptr [ebp+Value], si
call    edi             ; GetStdHandle(x)
push    eax             ; hFile
call    _ReadBufFromInput@16 ; ReadBufFromInput(x,x,x,x)

It requests 3FFh (1023) characters from the standard input handle (0FFFFFFF6h = -10 = STD_INPUT_HANDLE).
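The two constants can be checked with a few lines of Python. This is only an arithmetic aid; the helper name `to_signed32` is hypothetical and is not part of any Windows API.

```python
def to_signed32(value: int) -> int:
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    value &= 0xFFFFFFFF
    if value & 0x80000000:
        return value - 0x100000000
    return value


# The value pushed before the GetStdHandle call in the listing above.
STD_INPUT_HANDLE = to_signed32(0xFFFFFFF6)

# The requested character count pushed in the same listing.
REQUESTED_CHARS = 0x3FF

if __name__ == "__main__":
    print(STD_INPUT_HANDLE, REQUESTED_CHARS)
```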

ReadBufFromInput uses the GetFileType API to determine whether it should read from the console or from a file:

; Attributes: bp-based frame

; int __stdcall ReadBufFromInput(HANDLE hFile, int, int, int)
_ReadBufFromInput@16 proc near

hFile= dword ptr  8

; FUNCTION CHUNK AT .text:4AD10D3D SIZE 00000006 BYTES

mov     edi, edi
push    ebp
mov     ebp, esp
push    [ebp+hFile]     ; hFile
call    ds:__imp__GetFileType@4 ; GetFileType(x)
and     eax, 0FFFF7FFFh
cmp     eax, 2
jz      loc_4AD10D3D
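The type check in the listing above can be modelled in a few lines of Python. The constant values are the ones documented for GetFileType; the function name `takes_console_path` is hypothetical, and what the branch target does is not shown in the listing, so the sketch only models the comparison itself.

```python
# File type values documented for the GetFileType API.
FILE_TYPE_DISK = 0x0001
FILE_TYPE_CHAR = 0x0002    # character device, typically a console
FILE_TYPE_PIPE = 0x0003
FILE_TYPE_REMOTE = 0x8000  # flag bit, cleared by the "and eax, 0FFFF7FFFh"


def takes_console_path(file_type: int) -> bool:
    """Model of 'and eax, 0FFFF7FFFh / cmp eax, 2 / jz ...':
    the branch is taken only for character devices."""
    return (file_type & 0xFFFF7FFF) == FILE_TYPE_CHAR


if __name__ == "__main__":
    for name, value in [("disk", FILE_TYPE_DISK),
                        ("char", FILE_TYPE_CHAR),
                        ("pipe", FILE_TYPE_PIPE)]:
        print(name, takes_console_path(value))
```

A disk file and a pipe both fail the comparison, so both are handled by the same file-reading code even though only one of them supports seeking.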

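As in this case it is a pipe (GetFileType returns 3, which is FILE_TYPE_PIPE and not the value 2 that the code compares against), the console branch is not taken and execution goes on to the ReadBufFromFile function: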

; Attributes: bp-based frame

; int __stdcall ReadBufFromFile(HANDLE hFile, LPWSTR lpWideCharStr, DWORD cchWideChar, LPDWORD lpNumberOfBytesRead)
_ReadBufFromFile@16 proc near

var_C= dword ptr -0Ch
cchMultiByte= dword ptr -8
NumberOfBytesRead= dword ptr -4
hFile= dword ptr  8
lpWideCharStr= dword ptr  0Ch
cchWideChar= dword ptr  10h
lpNumberOfBytesRead= dword ptr  14h

This function calls the ReadFile API function to retrieve the indicated number of characters:

push    ebx             ; lpOverlapped
push    [ebp+lpNumberOfBytesRead] ; lpNumberOfBytesRead
mov     [ebp+var_C], eax
push    [ebp+cchWideChar] ; nNumberOfBytesToRead
push    edi             ; lpBuffer
push    [ebp+hFile]     ; hFile
call    ds:__imp__ReadFile@20 ; ReadFile(x,x,x,x,x)

The returned buffer is scanned for an end of line, and once one is found, the pointer in the input stream is moved to just after the position where it was found:

.text:4AD06A15 loc_4AD06A15:                           
.text:4AD06A15                 cmp     [ebp+NumberOfBytesRead], 3
.text:4AD06A19                 jl      short loc_4AD06A2D
.text:4AD06A1B                 mov     al, [esi]
.text:4AD06A1D                 cmp     al, 0Ah
.text:4AD06A1F                 jz      loc_4AD06BCF
.text:4AD06A25
.text:4AD06A25 loc_4AD06A25:                           
.text:4AD06A25                 cmp     al, 0Dh
.text:4AD06A27                 jz      loc_4AD06D14
.text:4AD06A2D
.text:4AD06A2D loc_4AD06A2D:                           
.text:4AD06A2D                 movzx   eax, byte ptr [esi]
.text:4AD06A30                 cmp     byte ptr _DbcsLeadCharTable[eax], bl
.text:4AD06A36                 jnz     loc_4AD12018
.text:4AD06A3C                 dec     [ebp+NumberOfBytesRead]
.text:4AD06A3F                 inc     esi
.text:4AD06A40
.text:4AD06A40 loc_4AD06A40:                           
.text:4AD06A40                 cmp     [ebp+NumberOfBytesRead], ebx
.text:4AD06A43                 jg      short loc_4AD06A15

.text:4AD06BCF loc_4AD06BCF:                          
.text:4AD06BCF                 cmp     byte ptr [esi+1], 0Dh
.text:4AD06BD3                 jnz     loc_4AD06A25
.text:4AD06BD9                 jmp     loc_4AD06D1E

.text:4AD06D14 loc_4AD06D14:                           
.text:4AD06D14                 cmp     byte ptr [esi+1], 0Ah
.text:4AD06D18                 jnz     loc_4AD06A2D
.text:4AD06D1E
.text:4AD06D1E loc_4AD06D1E:                          
.text:4AD06D1E                 mov     eax, [ebp+var_C]
.text:4AD06D21                 mov     [esi+2], bl
.text:4AD06D24                 sub     esi, edi
.text:4AD06D26                 inc     esi
.text:4AD06D27                 inc     esi
.text:4AD06D28                 push    ebx             ; dwMoveMethod
.text:4AD06D29                 push    ebx             ; lpDistanceToMoveHigh
.text:4AD06D2A                 mov     [ebp+cchMultiByte], esi
.text:4AD06D2D                 add     esi, eax
.text:4AD06D2F                 push    esi             ; lDistanceToMove
.text:4AD06D30                 push    [ebp+hFile]     ; hFile
.text:4AD06D33                 call    ds:__imp__SetFilePointer@16 ; SetFilePointer(x,x,x,x)
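The scan loop in the listing above can be paraphrased in Python as follows. This is a reading of the disassembly, not the original source: the function names are hypothetical, the DBCS lead-byte branch (loc_4AD12018) is left out, and the interpretation of var_C as the stream position before the read is an assumption, since the code that sets it is not shown.

```python
from typing import Optional


def find_line_end(buf: bytes) -> Optional[int]:
    """Return the offset just past the first CR LF (or LF CR) pair,
    or None if the loop shown in the listing finds no such pair."""
    remaining = len(buf)
    i = 0
    while remaining > 0:
        # cmp [NumberOfBytesRead], 3 / jl: the pair is only tested
        # while at least three bytes remain in the buffer.
        if remaining >= 3:
            pair = buf[i:i + 2]
            if pair == b"\n\r" or pair == b"\r\n":
                return i + 2  # sub esi, edi / inc esi / inc esi
        i += 1
        remaining -= 1
    return None


def seek_target(start_pos: int, buf: bytes) -> Optional[int]:
    """Absolute position passed as lDistanceToMove (move method 0).
    ASSUMPTION: start_pos plays the role of var_C in the listing."""
    end = find_line_end(buf)
    return None if end is None else start_pos + end


if __name__ == "__main__":
    print(seek_target(0, b"one\r\ntwo\r\nthree\r\n"))
```

For a disk file, the computed target places the stream right at the start of the second line; for a pipe, the same value is passed to SetFilePointer but has no defined effect.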
