PowerShell's Out-File adds a newline to the top of the file - Out-File vs. Set-Content

Problem description

I have the following PowerShell:

# Find all .csproj files 
$csProjFiles = get-childitem ./ -include *.csproj -recurse 

# Remove the packages.config include from the csproj files.
$csProjFiles | foreach ($_) {(get-content $_) | 
             select-string -pattern '<None Include="packages.config" />' -notmatch | 
             Out-File $_ -force}

And it seems to work fine. The line with the packages.config is not in the file after I run it.

But after I run it there is an extra newline at the TOP of the file. (Not the bottom.)

I am confused as to how that is getting there. What can I do to get rid of the extra newline char that this generates at the top of the file?

UPDATE:

I swapped out to a different way of doing this:

$csProjFiles | foreach ($_) {$currentFile = $_; (get-content $_) | 
               Where-Object {$_ -notmatch '<None Include="packages.config" />'} | 
               Set-Content $currentFile -force}

It works fine and does not have the extra line at the top of the file. But I wouldn't mind knowing why the top example was adding the extra line.

Solution

  • Out-File and the redirection operators > / >> take arbitrary input objects, convert them to string representations as they would appear in the console (that is, PowerShell's default output formatting is applied), and send those string representations to the output file.
    These string representations often have leading and/or trailing newlines for readability.

  • Set-Content is for input objects that are already strings or should be treated as strings.

    • PowerShell calls .psobject.ToString() on all input objects to obtain the string representation, which in most cases defers to the underlying .NET type's .ToString() method.

The resulting representations are typically not the same, and it's important to know when to choose which cmdlet / operator.
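The difference between the two representations can be made visible in memory, without writing any files. The following is a minimal sketch, assuming that Out-String is an acceptable stand-in for the formatting Out-File applies; the variable names are illustrative only.

```powershell
# A MatchInfo object, as Select-String emits for pipeline input.
$match = 'keep this line' | Select-String -Pattern 'keep'

# Roughly what Out-File and > / >> would write: the for-display representation.
# Out-String is used here only to capture that formatting as a single string.
$formatted = $match | Out-String

# What Set-Content would write: the .psobject.ToString() result.
$stringified = $match.psobject.ToString()

# Print both between brackets so that any surrounding newlines stand out.
"Formatted:   [$formatted]"
"Stringified: [$stringified]"
```

The bracketed output should show extra line breaks around the formatted text, while the stringified text is just the line itself.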

Additionally, the default character encodings differ:

  • Out-File and > / >> default to UTF-16 LE, which PowerShell calls Unicode in the context of the optional -Encoding parameter.
  • Set-Content defaults to your system's legacy "ANSI" code page (a single-byte, extended-ASCII code page), which PowerShell calls Default.

    • Note that the docs as of PSv5.1 mistakenly claim that the default is ASCII.[1]
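Because the defaults depend on the cmdlet (and, as an assumption worth checking on your own system, on the PowerShell version), it can help to inspect the bytes that actually land on disk. The sketch below passes -Encoding explicitly so the result does not depend on any default; the temporary file and variable names are illustrative.

```powershell
$path = [System.IO.Path]::GetTempFileName()

# Explicit UTF-16 LE ("Unicode"): the file starts with the BOM bytes FF FE.
'a' | Out-File -LiteralPath $path -Encoding Unicode
$utf16Bytes = [System.IO.File]::ReadAllBytes($path)

# Explicit ASCII: a non-ASCII character such as U+00E4 is written as '?' (0x3F).
# [char]0xE4 avoids depending on the encoding of the script file itself.
[string][char]0xE4 | Set-Content -LiteralPath $path -Encoding ASCII
$asciiBytes = [System.IO.File]::ReadAllBytes($path)

Remove-Item -LiteralPath $path

'UTF-16 LE starts with: {0:X2} {1:X2}' -f $utf16Bytes[0], $utf16Bytes[1]
'ASCII first byte:      {0:X2}' -f $asciiBytes[0]
```

Running the same check without -Encoding is a quick way to see which default a given session really uses.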

To change the encoding:

  • Ad-hoc change: Use the -Encoding parameter with Out-File or Set-Content to control the output character encoding explicitly.
    You cannot change the encoding used by > / >> ad-hoc, but see below.

  • [PSv3+] Changing the default (use with caution): Use the $PSDefaultParameterValues mechanism (see Get-Help about_Parameters_DefaultValues), which enables setting default values for parameters:

    • Changing the default encoding for Out-File also changes it for > / >> in PSv5.1 or above[2].
      To change it to UTF-8, for instance, use:
      $PSDefaultParameterValues['Out-File:Encoding']='UTF8'

      • Note that in PSv5.0 or below you cannot change what encoding > and >> use.
    • If you change the default for Set-Content, be sure to change it for Add-Content too:
      $PSDefaultParameterValues['Set-Content:Encoding'] = $PSDefaultParameterValues['Add-Content:Encoding'] ='UTF8'

    • You can also use wildcard patterns to represent the cmdlet / advanced function name to apply the default parameter value to; for instance, if you used $PSDefaultParameterValues['*:Encoding']='UTF8', then all cmdlets that have an -Encoding parameter would default to that value, but that is ill-advised, because in some cmdlets the -Encoding refers to the input encoding.

    • There is no single shared prefix among cmdlets that write to files that allows you to target all output cmdlets, but you can define a pattern for each of the verbs:
      $enc = 'UTF8'; $PSDefaultParameterValues += @{ 'Out-*:Encoding'=$enc; 'Set-*:Encoding'=$enc; 'Add-*:Encoding'=$enc; 'Export-*:Encoding'=$enc }

    • Caveat: $PSDefaultParameterValues is defined in the global scope, so any modifications you make to it take effect globally, and affect subsequent commands.
      To limit changes to a script / function's scope and its descendant scopes, use a local $PSDefaultParameterValues variable, which you can either initialize to an empty hashtable to start from scratch ($PSDefaultParameterValues = @{}), or initialize to a clone of the global value ($PSDefaultParameterValues = $PSDefaultParameterValues.Clone()).
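The scoping advice above can be sketched as a small function. This is an illustration only: Write-Utf8Lines is a hypothetical helper name, and the example assumes the session has not already set a default for Out-File's -Encoding parameter.

```powershell
function Write-Utf8Lines {   # hypothetical helper, not a built-in cmdlet
    param([string] $Path, [string[]] $Lines)

    # Assigning creates a local variable that shadows the global one, so the
    # change below applies only to this function and its descendant scopes.
    $PSDefaultParameterValues = $PSDefaultParameterValues.Clone()
    $PSDefaultParameterValues['Out-File:Encoding'] = 'UTF8'

    # No explicit -Encoding here: the local default above supplies it.
    $Lines | Out-File -LiteralPath $Path
}

# Usage: write two lines to a temporary file, read them back, clean up.
$target = [System.IO.Path]::GetTempFileName()
Write-Utf8Lines -Path $target -Lines 'first', 'second'
Get-Content -LiteralPath $target
Remove-Item -LiteralPath $target
```

After the function returns, the global $PSDefaultParameterValues is untouched, so later Out-File calls in the session keep their usual default.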


In the case at hand, the output objects are [Microsoft.PowerShell.Commands.MatchInfo] instances output by Select-String:

  • Using default formatting, as happens with Out-File, they output an empty line above, and two empty lines below (with multiple instances printing in a contiguous block between a single set of the empty lines above and below).

  • If you call .psobject.ToString() on them, as happens with Set-Content, they evaluate to just the matching lines (with no origin-path prefix, given that input was provided via the pipeline rather than as filenames via the -Path / -LiteralPath parameters), with no leading or trailing empty lines.

That said, had you piped to | Select-Object -ExpandProperty Line or simply | ForEach-Object Line in order to explicitly output just the matching lines as strings, both Out-File and Set-Content would have yielded the same result (except for their default encoding).
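As a minimal sketch of that last point, the pipeline below filters a few in-memory sample lines (made up for illustration) and extracts the Line property, so that only plain strings reach whichever cmdlet writes the file.

```powershell
# Sample input standing in for the content of a .csproj file.
$lines = '<Project>', '  <None Include="packages.config" />', '</Project>'

# Select-String emits MatchInfo objects; ForEach-Object Line unwraps them
# into the original [string] lines.
$kept = $lines |
    Select-String -Pattern '<None Include="packages.config" />' -NotMatch |
    ForEach-Object Line

# These are plain strings, so Out-File and Set-Content write the same text.
$kept
```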


P.S.: LotPing's observation is correct: You seem to be confusing the foreach statement with the ForEach-Object cmdlet (which, regrettably, is also known by the built-in alias foreach, causing confusion).

The ForEach-Object cmdlet doesn't need an explicit definition for $_: in the (implied -Process) script block you pass to it, $_ is automatically defined to be the input object at hand.

Your ($_) argument to foreach (ForEach-Object) is effectively ignored: the automatic variable $_, when used outside of special contexts such as script blocks in the pipeline, evaluates to $null, and putting (...) around it makes no difference, so you're effectively passing $null, which is ignored.
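Putting these points together, the command from the question could be written without the stray ($_) argument roughly as follows. This is a sketch rather than a drop-in replacement: Remove-PackagesConfigInclude and its -Root parameter are hypothetical names, and Set-Content will use its default encoding unless -Encoding is added.

```powershell
function Remove-PackagesConfigInclude {   # hypothetical name
    param([string] $Root = '.')

    Get-ChildItem -LiteralPath $Root -Filter *.csproj -Recurse -File | ForEach-Object {
        # Inside this script block, $_ is the current file; save its path,
        # because $_ means "current line" inside Where-Object below.
        $file = $_.FullName

        # The parentheses make Get-Content read the whole file before
        # Set-Content starts writing to the same path.
        (Get-Content -LiteralPath $file) |
            Where-Object { $_ -notmatch '<None Include="packages.config" />' } |
            Set-Content -LiteralPath $file
    }
}
```

Calling Remove-PackagesConfigInclude -Root <folder> then rewrites every .csproj file below that folder in place, so it is worth trying on a copy first.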


[1] Verify that ASCII is not the default as follows: '0x{0:x}' -f $('ä' | Set-Content t.txt; $b=[System.IO.File]::ReadAllBytes("$PWD\t.txt")[0]; ri t.txt; $b) yields 0xe4 on an en-US system, which is the Windows-1252 code point for ä (which coincides with the Unicode codepoint, but the output is a single-byte-encoded file with no BOM).
If you use -Encoding ASCII explicitly, you get 0x3f, the code point for a literal ?, because that is what the ASCII encoding converts all non-ASCII characters to.

[2] PetSerAl found the source-code location that shows that > and >> are effectively aliases for Out-File [-Append], and he points out that redefining Out-File therefore also redefines > / >>; similarly, specifying a default encoding via $PSDefaultParameterValues for Out-File also takes effect for > / >>.
Windows PowerShell v5.1 is the minimum version that works this way.

Tip of the hat to PetSerAl for his help.
