PowerShell, file system provider, Get-ChildItem filtering... where are the official docs?
Problem description
As mentioned in another question, if you run a Get-ChildItem -filter ... command, you are more limited than if you used -include instead of -filter. I'd like to read the official docs for the file system provider's filtering syntax, but after half an hour of searching I still haven't found them. Does anyone know where to look?
Recommended answer
tl;dr -Filter uses .NET's implementation of FsRtlIsNameInExpression, which is documented on MSDN along with basic pattern matching info. The algorithm is unintuitive for compatibility reasons, and you should probably avoid using this feature. Additionally, .NET has numerous bugs in its implementation.
-Filter does not use the filtering system provided by PowerShell--that is, it does not use the filtering system described by Get-Help about_Wildcards. Rather, it passes the filter to the Windows API. Therefore, the filtering works the same as it does in any other program that utilizes the Windows API, such as cmd.exe.
Instead, PowerShell uses a FsRtlIsNameInExpression-like algorithm for -Filter pattern matching. The algorithm is based on old MS-DOS behavior, so it's riddled with caveats that are preserved for legacy purposes. It's typically said to have three common special characters. The exact behavior is complex, but it's more or less like the following:
- * : Matches any number of characters (zero-inclusive)
- ? : Matches exactly one character, excluding the last period in a name
- . : If it is the last period in a pattern, anchors to the last period in the filename, or to the end of the filename if it doesn't have a period; can also match a literal period
Just to make things more complicated, Windows added three additional special characters that behave exactly the same as the old MS-DOS special characters. The original special characters have slightly different behavior now to account for more flexible filesystems.
- " is equivalent to MS-DOS . (DOS_DOT and ANSI_DOS_DOT in ntifs.h)
- < is equivalent to MS-DOS * (DOS_STAR and ANSI_DOS_STAR in ntifs.h)
- > is equivalent to MS-DOS ? (DOS_QM and ANSI_DOS_QM in ntifs.h)
Quite a few sources seem to reverse < and >. Frighteningly, Microsoft confuses them in their .NET implementation, which means they are also reversed in PowerShell. Additionally, all three compatibility wildcards are inaccessible from -Filter, as System.IO.Path mistakenly treats "<> as invalid, non-wildcard characters. (It allows .*?.) This contributes to the notion that -Filter is incomplete, unstable, and buggy. You can see .NET's (buggy) implementation of the algorithm on GitHub.
This is additionally complicated by the algorithm's support for 8.3 compatibility filenames, otherwise known as "short" filenames. (You've probably seen them before; they look something like SOMETH~1.TXT.) A file matches the pattern if either its full filename or its short filename matches. FrankFranchise has more information about this caveat in his answer.
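To illustrate the "either name matches" rule, here is a minimal Python sketch. It is not the Windows algorithm: Python's fnmatch.fnmatchcase stands in for the real matcher as an assumption, the function name matches_either is hypothetical, and the short names are supplied by hand rather than read from the filesystem.

```python
from fnmatch import fnmatchcase


def matches_either(pattern: str, long_name: str, short_name: str) -> bool:
    """Return True if the pattern matches the long name OR the 8.3 short name.

    fnmatchcase is only a stand-in for the Windows matcher (an assumption);
    names are upper-cased first to mimic case-insensitive matching.
    """
    pat = pattern.upper()
    return fnmatchcase(long_name.upper(), pat) or fnmatchcase(short_name.upper(), pat)


if __name__ == "__main__":
    # Illustrative (long name, short name) pairs.
    files = [
        ("Apple1.txt", "APPLE1.TXT"),
        ("Something.txt", "SOMETH~1.TXT"),
        ("SomethingElse.txt", "SOMETH~2.TXT"),
    ]
    for long_name, short_name in files:
        print(long_name, matches_either("*1.txt", long_name, short_name))
```

The point of the sketch is that a pattern such as *1.txt can select Something.txt even though the long name does not end in 1.txt, because the short name SOMETH~1.TXT does.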
The previously linked MSDN article on FsRtlIsNameInExpression has the most up-to-date documentation on Windows filename pattern matching, but it's not particularly verbose. For a more thorough explanation of how matching used to work on MS-DOS and how this affects modern matching, this MSDN blog article is the best source I've found. Here's the basic idea:
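- Every filename is exactly 11 bytes.
- The first 8 bytes store the body of the filename, right-padded with spaces.
- The last 3 bytes store the extension, right-padded with spaces.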
The conversion looks something like this:
                            11
    User           12345678901
    ------------   -----------
    ABC.TXT      > ABC     TXT
    WILDCARD.TXT > WILDCARDTXT
    ABC.???      > ABC     ???
    *.*          > ???????????
    *.           > ????????
    ABC.         > ABC
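The conversion can be sketched in Python. This is a minimal illustration, not the actual MS-DOS code: the function names to_fcb and fcb_match are hypothetical, and the matching rule (compare position by position, with ? matching any character, padding included) is an assumption drawn from the conversion table rather than from a documented specification.

```python
def to_fcb(name: str) -> str:
    """Convert a DOS filename or pattern to its padded 11-character form."""
    body, _, ext = name.upper().partition(".")

    def field(text: str, width: int) -> str:
        out = ""
        for ch in text:
            if ch == "*":
                # '*' fills the remainder of the field with '?'.
                out += "?" * (width - len(out))
                break
            out += ch
        # Truncate to the field width, then right-pad with spaces.
        return out[:width].ljust(width)

    return field(body, 8) + field(ext, 3)


def fcb_match(pattern: str, filename: str) -> bool:
    """Compare the two 11-character forms position by position.

    Assumption: '?' matches any character, including the space padding.
    """
    p, n = to_fcb(pattern), to_fcb(filename)
    return all(pc == "?" or pc == nc for pc, nc in zip(p, n))


if __name__ == "__main__":
    for sample in ["ABC.TXT", "WILDCARD.TXT", "ABC.???", "*.*", "*.", "ABC."]:
        print(f"{sample:<12} > {to_fcb(sample)!r}")
    print(fcb_match("*.", "BANANA"))
    print(fcb_match("*.", "ABC.TXT"))
```

Under these assumptions, a pattern such as *. only matches names whose extension field is entirely padding, which is why it selects files without an extension.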
Extrapolating this to work with modern-day filesystems is an unintuitive process at best. For example, take a directory such as the following:
    Name                 Compat Name
    -----------------------------------------------
    Apple1.txt           APPLE1  .TXT
    Banana               BANANA  .
    Something.txt        SOMETH~1.TXT
    SomethingElse.txt    SOMETH~2.TXT
    TXT.exe              TXT     .EXE
    TXT.eexe             TXT~1   .EEX
    Wildcard.txt         WILDCARD.TXT
I've done quite a bit of testing of these wildcards on Windows 10 and have gotten very inconsistent results, especially with DOS_DOT ("). If you test these on your own from the command prompt, you'll likely need to escape them (e.g., dir ^<^"^< in cmd.exe to emulate MS-DOS *.*).

    Pattern            Matches
    ------------------------------------------------
    *.*                (everything)
    <"<                (everything)
    *                  (everything)
    <                  Banana
    .                  (everything)
    "                  (everything)
    *.                 Banana
    <"                 Banana
    *g.txt             Something.txt
    <g.txt             Something.txt
    <g"txt             (nothing)
    *1.txt             Apple1.txt, Something.txt
    <1.txt             Apple1.txt, Something.txt
    <1"txt             (nothing)
    *xe                TXT.eexe, TXT.exe
    <xe                (nothing)
    *exe               TXT.eexe, TXT.exe
    <exe               TXT.exe
    ??????.???         Apple1.txt, Asdf.tx, Banana, TXT.eexe, TXT.exe
    >>>>>>.>>>         Apple1.txt, Asdf.tx, TXT.eexe, TXT.exe
    >>>>>>">>>         Banana
    ????????.???       (everything)
    >>>>>>>>.>>>       (everything except Banana)
    >>>>>>>>">>>       Banana
    ???????????.???    (everything)
    >>>>>>>>>>>.>>>    (everything except Banana)
    >>>>>>>>>>>">>>    Banana
    ??????             Banana
    >>>>>>             Banana
    ???????????        Banana
    >>>>>>>>>>>        Banana
    ????????????       Banana
    ????               (nothing)
    >>>>               (nothing)
    Banana??.          Banana
    Banana>>.          Banana
    Banana>>"          Banana
    Banana????.        Banana
    Banana>>>>.        Banana
    Banana>>>>"        Banana
    Banana.            Banana
    Banana"            Banana
    *txt               Apple1.txt, Something.txt, SomethingElse.txt, Wildcard.txt
    <txt               Apple1.txt, Something.txt, SomethingElse.txt, Wildcard.txt
    *t                 Apple1.txt, Something.txt, SomethingElse.txt, Wildcard.txt
    <t                 (nothing)
    *txt*              Apple1.txt, Something.txt, SomethingElse.txt, TXT.eexe, TXT.exe, Wildcard.txt
    <txt<              Apple1.txt, Something.txt, SomethingElse.txt, Wildcard.txt
    *txt<              Apple1.txt, Something.txt, SomethingElse.txt, Wildcard.txt
    <txt*              Apple1.txt, Something.txt, SomethingElse.txt, TXT.eexe, TXT.exe, Wildcard.txt
Note: As of writing, WINE's matching algorithm yields significantly different results when testing these quirks. Tested with WINE 1.9.6.
As you can see, the backwards-compatible MS-DOS wildcards are obscure and buggy. Even Microsoft has implemented them incorrectly at least once, and it's unclear whether their current behavior in Windows is intentional. The behavior of " seems completely random, and I expected the results of the last two tests to be swapped.