Piping into SET /P fails due to uninitialised data pointer?
Problem description
Suppose we have a text file sample.txt:
one
two
...
Now we want to remove the first line:
two
...
A quick way to do that is to use input redirection, set /P and findstr (see footnote 1; I know there are other ways using more or for /F, but let us forget about them for now):
@echo off
< "sample.txt" (
set /P =""
findstr "^"
)
The output is as expected.
However, why is the output empty when I replace the input redirection < by type and a pipe |:
@echo off
type "sample.txt" | (
set /P =""
findstr "^"
)
When I replace set /P ="" by pause > nul, the output is what I expect: the input file is output, but with the first character of the first line missing (as it is consumed by pause). But why does set /P seem to consume everything instead of only the first line, as it does with the redirection < approach? Is that a bug?
To me it looks like set /P fails to adequately initialise the read pointer to the piped data.
I observed this strange behaviour on Windows 7 and on Windows 10.
It gets even weirder: when the script containing the pipe is called multiple times, for instance by a loop like for /L %I in (1,1,1000) do @pipe.bat, and the input file contains about fifteen lines or more, sometimes (a few times out of a thousand) a fragment of the input file is returned. That fragment is exactly the same each time, and it seems that there are always 80 bytes missing at the beginning.
1) findstr hangs if the last line is not terminated by a line break, so let us assume that one is present.
When retrieving data, set /p tries to fill a 1023-character buffer with data from stdin (as much as is available). Once this read operation has ended, the buffer is searched for the first end of line, and once it has been found (or the end of the buffer has been reached), the SetFilePointer API is called to reposition the input stream pointer just after the end of the line that was read. This way the next read operation starts retrieving data after that line.
This works flawlessly when a disk file is associated with the input stream, but as Microsoft states in the SetFilePointer documentation:
The hFile parameter must refer to a file stored on a seeking device; for example, a disk volume. Calling the SetFilePointer function with a handle to a non-seeking device such as a pipe or a communications device is not supported, even though the SetFilePointer function may not return an error. The behavior of the SetFilePointer function in this case is undefined.
What happens is that, although no error is generated, the call to reposition the read pointer fails when stdin is associated with a pipe: the pointer is not moved back, and the 1023 bytes (or however many bytes were available to read) stay consumed.
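The read-ahead-then-seek-back pattern described above can be imitated outside cmd.exe. The following is a minimal Python sketch, not cmd.exe's actual code: the names `read_first_line`, `rest_after_first_line` and `demo` are hypothetical, and it uses POSIX-style `os.lseek`, which reports an error on a pipe, whereas SetFilePointer may not report one according to the documentation quoted above. The effect on the remaining data is the same in both cases.

```python
import os
import tempfile

BUF_SIZE = 1023  # read-ahead size described in the answer


def read_first_line(fd):
    """Read ahead up to BUF_SIZE bytes, then try to seek back to just
    after the first line. Returns the bytes of that first line."""
    try:
        start = os.lseek(fd, 0, os.SEEK_CUR)  # position before the read
    except OSError:
        start = None  # assumption: a non-seekable handle such as a pipe
    chunk = os.read(fd, BUF_SIZE)
    end = chunk.find(b"\n")
    line_len = len(chunk) if end < 0 else end + 1
    if start is not None:
        # Seekable input: give back everything after the first line.
        os.lseek(fd, start + line_len, os.SEEK_SET)
    # Non-seekable input: the read-ahead cannot be given back.
    return chunk[:line_len]


def rest_after_first_line(fd):
    """Consume the first line, then return whatever is still readable."""
    read_first_line(fd)
    parts = []
    while True:
        chunk = os.read(fd, 4096)
        if not chunk:
            break
        parts.append(chunk)
    return b"".join(parts)


def demo():
    data = b"one\r\ntwo\r\nthree\r\n"

    # Case 1: a disk file, comparable to "< sample.txt".
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, data)
        os.lseek(fd, 0, os.SEEK_SET)
        from_file = rest_after_first_line(fd)
    finally:
        os.close(fd)
        os.remove(path)

    # Case 2: a pipe, comparable to "type sample.txt |".
    r, w = os.pipe()
    os.write(w, data)
    os.close(w)
    from_pipe = rest_after_first_line(r)
    os.close(r)

    return from_file, from_pipe


if __name__ == "__main__":
    print(demo())
```

With the disk file, the second reader sees everything after the first line. With the pipe, the whole small input fits into the read-ahead buffer, so nothing is left for the second reader.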
Edited in response to Aacini's request:
The set command is processed by the eSet function, which calls SetWork to determine which type of set command will be executed.
As it is a set /p, the SetPromptUser function is called, and from this function the ReadBufFromInput function is called:
add esp, 0Ch
lea eax, [ebp+var_80C]
push eax ; int
push 3FFh ; int
lea eax, [ebp+Value]
push eax ; int
xor esi, esi
push 0FFFFFFF6h ; nStdHandle
mov word ptr [ebp+Value], si
call edi ; GetStdHandle(x)
push eax ; hFile
call _ReadBufFromInput@16 ; ReadBufFromInput(x,x,x,x)
It requests 3FFh (1023) characters from the standard input handle (0FFFFFFF6h = -10 = STD_INPUT_HANDLE).
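The two constants in the listing can be checked with a few lines of arithmetic. This is only an illustrative sketch; the helper name `to_signed32` is hypothetical.

```python
def to_signed32(value):
    """Interpret a 32-bit unsigned value as a two's-complement signed integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


# 0FFFFFFF6h is the value pushed before the GetStdHandle call.
STD_INPUT_HANDLE = to_signed32(0xFFFFFFF6)

# 3FFh is the requested buffer size in characters.
BUFFER_CHARS = 0x3FF

if __name__ == "__main__":
    print(STD_INPUT_HANDLE, BUFFER_CHARS)
```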
ReadBufFromInput uses the GetFileType API to determine whether it should read from the console or from a file:
; Attributes: bp-based frame
; int __stdcall ReadBufFromInput(HANDLE hFile, int, int, int)
_ReadBufFromInput@16 proc near
hFile= dword ptr 8
; FUNCTION CHUNK AT .text:4AD10D3D SIZE 00000006 BYTES
mov edi, edi
push ebp
mov ebp, esp
push [ebp+hFile] ; hFile
call ds:__imp__GetFileType@4 ; GetFileType(x)
and eax, 0FFFF7FFFh
cmp eax, 2
jz loc_4AD10D3D
As in this case it is a pipe (GetFileType returns 3, FILE_TYPE_PIPE), the code jumps to the ReadBufFromFile function:
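The check on the GetFileType result can be written out as a small Python sketch. The constant values are the documented Win32 ones. The function name `reader_for` and the labels "console" and "file" are hypothetical and only summarise my reading of the `and`/`cmp`/`jz` sequence: a character device takes the console path, and anything else, including a pipe, is treated like a file.

```python
# Win32 GetFileType return values
FILE_TYPE_DISK = 0x0001
FILE_TYPE_CHAR = 0x0002
FILE_TYPE_PIPE = 0x0003
FILE_TYPE_REMOTE = 0x8000  # flag bit cleared by "and eax, 0FFFF7FFFh"


def reader_for(file_type):
    """Mirror the dispatch: mask the remote flag, compare with FILE_TYPE_CHAR."""
    masked = file_type & 0xFFFF7FFF
    return "console" if masked == FILE_TYPE_CHAR else "file"


if __name__ == "__main__":
    for ft in (FILE_TYPE_DISK, FILE_TYPE_CHAR, FILE_TYPE_PIPE):
        print(ft, reader_for(ft))
```

The point of the sketch is that a pipe and a disk file end up on the same code path, which is why the later seek is attempted on a pipe at all.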
; Attributes: bp-based frame
; int __stdcall ReadBufFromFile(HANDLE hFile, LPWSTR lpWideCharStr, DWORD cchWideChar, LPDWORD lpNumberOfBytesRead)
_ReadBufFromFile@16 proc near
var_C= dword ptr -0Ch
cchMultiByte= dword ptr -8
NumberOfBytesRead= dword ptr -4
hFile= dword ptr 8
lpWideCharStr= dword ptr 0Ch
cchWideChar= dword ptr 10h
lpNumberOfBytesRead= dword ptr 14h
This function calls the ReadFile API function to retrieve the indicated number of characters:
push ebx ; lpOverlapped
push [ebp+lpNumberOfBytesRead] ; lpNumberOfBytesRead
mov [ebp+var_C], eax
push [ebp+cchWideChar] ; nNumberOfBytesToRead
push edi ; lpBuffer
push [ebp+hFile] ; hFile
call ds:__imp__ReadFile@20 ; ReadFile(x,x,x,x,x)
The returned buffer is iterated in search of an end of line, and once one is found, the pointer in the input stream is moved to just after the position found:
.text:4AD06A15 loc_4AD06A15:
.text:4AD06A15 cmp [ebp+NumberOfBytesRead], 3
.text:4AD06A19 jl short loc_4AD06A2D
.text:4AD06A1B mov al, [esi]
.text:4AD06A1D cmp al, 0Ah
.text:4AD06A1F jz loc_4AD06BCF
.text:4AD06A25
.text:4AD06A25 loc_4AD06A25:
.text:4AD06A25 cmp al, 0Dh
.text:4AD06A27 jz loc_4AD06D14
.text:4AD06A2D
.text:4AD06A2D loc_4AD06A2D:
.text:4AD06A2D movzx eax, byte ptr [esi]
.text:4AD06A30 cmp byte ptr _DbcsLeadCharTable[eax], bl
.text:4AD06A36 jnz loc_4AD12018
.text:4AD06A3C dec [ebp+NumberOfBytesRead]
.text:4AD06A3F inc esi
.text:4AD06A40
.text:4AD06A40 loc_4AD06A40:
.text:4AD06A40 cmp [ebp+NumberOfBytesRead], ebx
.text:4AD06A43 jg short loc_4AD06A15
.text:4AD06BCF loc_4AD06BCF:
.text:4AD06BCF cmp byte ptr [esi+1], 0Dh
.text:4AD06BD3 jnz loc_4AD06A25
.text:4AD06BD9 jmp loc_4AD06D1E
.text:4AD06D14 loc_4AD06D14:
.text:4AD06D14 cmp byte ptr [esi+1], 0Ah
.text:4AD06D18 jnz loc_4AD06A2D
.text:4AD06D1E
.text:4AD06D1E loc_4AD06D1E:
.text:4AD06D1E mov eax, [ebp+var_C]
.text:4AD06D21 mov [esi+2], bl
.text:4AD06D24 sub esi, edi
.text:4AD06D26 inc esi
.text:4AD06D27 inc esi
.text:4AD06D28 push ebx ; dwMoveMethod
.text:4AD06D29 push ebx ; lpDistanceToMoveHigh
.text:4AD06D2A mov [ebp+cchMultiByte], esi
.text:4AD06D2D add esi, eax
.text:4AD06D2F push esi ; lDistanceToMove
.text:4AD06D30 push [ebp+hFile] ; hFile
.text:4AD06D33 call ds:__imp__SetFilePointer@16 ; SetFilePointer(x,x,x,x)
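For readers who prefer not to trace the listing, here is a rough Python paraphrase of the scan loop. It is a sketch of my reading of the disassembly, not a verified reimplementation. The DBCS lead-byte branch is left out, the function names are hypothetical, and `start_offset` stands in for `var_C`, which I assume holds the stream position from before the read (the listing does not show where it is set).

```python
def scan_line_end(buf):
    """Return the number of bytes up to and including the first two-byte
    line ending (CR LF or LF CR), or None if none is found."""
    remaining = len(buf)  # plays the role of NumberOfBytesRead
    i = 0                 # plays the role of esi - edi
    while remaining > 0:
        # The listing only inspects a byte pair while at least 3 bytes remain.
        if remaining >= 3:
            cur, nxt = buf[i], buf[i + 1]
            if (cur == 0x0A and nxt == 0x0D) or (cur == 0x0D and nxt == 0x0A):
                return i + 2  # matches "sub esi, edi / inc esi / inc esi"
        remaining -= 1
        i += 1
    return None


def seek_target(start_offset, buf):
    """Absolute offset that would be passed to SetFilePointer as
    lDistanceToMove (move method 0, i.e. from the start of the file)."""
    consumed = scan_line_end(buf)
    return None if consumed is None else start_offset + consumed


if __name__ == "__main__":
    print(seek_target(0, b"one\r\ntwo\r\n"))
```

On a disk file this offset is where the next reader continues. On a pipe the same call is made, but the data that was already read is not returned to the stream.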