PowerShell Out-File: prevent encoding changes


Question

I'm currently working on some search and replace operations that I'm trying to automate using PowerShell. Unfortunately, I realized yesterday that we have different file encodings in our codebase (UTF8 and ASCII). Because we're doing these search and replace operations in a different branch, I can't change the file encodings at this stage.

When I run the following lines, every file is changed to UCS-2 Little Endian, even though my default PowerShell encoding is set to iso-8859-1 (Western European (Windows)):

$content = Get-Content $_.Path
$content -replace 'myOldText' , 'myNewText' | Out-File $_.Path

Is there a way to prevent PowerShell from changing the file's encoding?

Recommended answer

Out-File uses a default encoding (Unicode, i.e. UTF-16 little-endian, in Windows PowerShell) unless it is overridden with the -Encoding parameter.
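As a minimal sketch of what that parameter changes, the snippet below writes the same string twice and looks at the first bytes on disk. The scratch file names are assumptions made up for this illustration. In Windows PowerShell the first file should start with FF FE (the UTF-16 little-endian byte order mark); newer PowerShell versions default to UTF-8 without a BOM, so the first line of output may differ there.

```powershell
# Hypothetical scratch files, used only for this demonstration.
$defaultFile = Join-Path ([IO.Path]::GetTempPath()) 'outfile-default.txt'
$asciiFile   = Join-Path ([IO.Path]::GetTempPath()) 'outfile-ascii.txt'

'myNewText' | Out-File $defaultFile                 # default encoding of this PowerShell version
'myNewText' | Out-File $asciiFile -Encoding ASCII   # encoding stated explicitly

# Read the raw bytes back to see what was actually written.
$defaultBytes = [IO.File]::ReadAllBytes($defaultFile)
$asciiBytes   = [IO.File]::ReadAllBytes($asciiFile)

'{0:X2} {1:X2}' -f $defaultBytes[0], $defaultBytes[1]   # depends on the PowerShell version
'{0:X2} {1:X2}' -f $asciiBytes[0], $asciiBytes[1]       # 6D 79 ("my"), no BOM
```

The explicit -Encoding value is what keeps the result predictable across PowerShell versions.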

To solve this, I try to determine the original file's encoding by reading its byte order mark (BOM), and then pass the result as the -Encoding parameter value.

Here's an example that processes a list of text file paths: it gets each file's original encoding, processes the content, and writes it back to the file using the original encoding.

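function Get-FileEncoding {
    param ( [string] $FilePath )

    # Read the first four bytes, where a byte order mark (BOM) would be.
    # -Encoding Byte works in Windows PowerShell; PowerShell 6+ uses -AsByteStream instead.
    [byte[]] $byte = Get-Content -Encoding Byte -ReadCount 4 -TotalCount 4 -Path $FilePath

    if ($byte[0] -eq 0xef -and $byte[1] -eq 0xbb -and $byte[2] -eq 0xbf)
        { $encoding = 'UTF8' }
    elseif ($byte[0] -eq 0xff -and $byte[1] -eq 0xfe -and $byte[2] -eq 0 -and $byte[3] -eq 0)
        { $encoding = 'UTF32' }            # UTF-32 LE; must be tested before UTF-16 LE
    elseif ($byte[0] -eq 0 -and $byte[1] -eq 0 -and $byte[2] -eq 0xfe -and $byte[3] -eq 0xff)
        { $encoding = 'BigEndianUTF32' }   # 00 00 FE FF is the big-endian UTF-32 BOM
    elseif ($byte[0] -eq 0xfe -and $byte[1] -eq 0xff)
        { $encoding = 'BigEndianUnicode' } # UTF-16 BE
    elseif ($byte[0] -eq 0xff -and $byte[1] -eq 0xfe)
        { $encoding = 'Unicode' }          # UTF-16 LE
    elseif ($byte[0] -eq 0x2b -and $byte[1] -eq 0x2f -and $byte[2] -eq 0x76)
        { $encoding = 'UTF7' }
    else
        { $encoding = 'ASCII' }            # no BOM found; BOM-less UTF-8 or ANSI files also land here
    return $encoding
}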

# $textFiles is expected to hold the paths of the files to process.
foreach ($textFile in $textFiles) {
    $encoding = Get-FileEncoding $textFile
    $content = Get-Content -Path $textFile -Encoding $encoding
    # Process content here...
    $content | Set-Content -Path $textFile -Encoding $encoding
}

Update: Here is an example of getting the original file's encoding using the StreamReader class. The example reads the first 3 characters of the file so that the CurrentEncoding property is set based on the result of StreamReader's internal BOM detection routine.

http://msdn.microsoft.com/en-us/library/9y86s1a9.aspx


The detectEncodingFromByteOrderMarks parameter detects the encoding by looking at the first three bytes of the stream. It automatically recognizes UTF-8, little-endian Unicode, and big-endian Unicode text if the file starts with the appropriate byte order marks. Otherwise, the UTF8Encoding is used. See the Encoding.GetPreamble method for more information.

http://msdn.microsoft.com/en-us/library/system.text.encoding.getpreamble.aspx
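To make the byte order marks mentioned in that quote concrete, here is a small sketch that asks .NET for the preamble of a few common encodings and prints the bytes in hex. The variable names and the choice of encodings are only for illustration.

```powershell
# Print the byte order mark (preamble) that .NET associates with each encoding.
$encodings = [ordered]@{
    UTF8             = [System.Text.Encoding]::UTF8
    Unicode          = [System.Text.Encoding]::Unicode           # UTF-16 little-endian
    BigEndianUnicode = [System.Text.Encoding]::BigEndianUnicode  # UTF-16 big-endian
    UTF32            = [System.Text.Encoding]::UTF32             # UTF-32 little-endian
    ASCII            = [System.Text.Encoding]::ASCII             # has no preamble
}

$preambles = [ordered]@{}
foreach ($name in $encodings.Keys) {
    $bytes = $encodings[$name].GetPreamble()
    # Format each byte as two hex digits; an empty preamble yields an empty string.
    $preambles[$name] = ($bytes | ForEach-Object { $_.ToString('X2') }) -join ' '
    '{0,-16} {1}' -f $name, $preambles[$name]
}
```

Note that the UTF-32 little-endian preamble (FF FE 00 00) begins with the same two bytes as the UTF-16 little-endian one (FF FE), so any hand-written BOM check has to test the longer sequence first.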

# Assumption: $filePath is not defined in the original answer; a temp-folder path is used here.
$filePath = Join-Path ([IO.Path]::GetTempPath()) 'encoding-test.txt'

$text = @"
This is
my text file
contents.
"@

# Create the text file as UTF-16 big-endian.
[IO.File]::WriteAllText($filePath, $text, [System.Text.Encoding]::BigEndianUnicode)

# Create a stream reader to get the file's encoding ($true enables BOM detection).
$sr = New-Object System.IO.StreamReader($filePath, $true)
[char[]] $buffer = New-Object char[] 3
$sr.Read($buffer, 0, 3) | Out-Null   # reading triggers detection; discard the returned count
$encoding = $sr.CurrentEncoding
$sr.Close()

# Show the detected encoding.
$encoding

# Update the file contents.
$content = [IO.File]::ReadAllText($filePath, $encoding)
$content2 = $content -replace "my", "your"

# Save the updated contents to the file using the detected encoding.
[IO.File]::WriteAllText($filePath, $content2, $encoding)

# Display the result.
Get-Content $filePath
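The StreamReader steps can be wrapped into a reusable function for the search and replace task from the question. This is only a sketch under stated assumptions: the function name Update-TextPreservingEncoding and its parameters are hypothetical, and files without a BOM are assumed to contain ASCII or UTF-8 text (an ANSI file with non-ASCII characters would need its code page supplied explicitly). Because StreamReader reports a UTF-8 encoding for BOM-less files, the sketch checks the bytes on disk and writes without a BOM in that case, so that such files do not gain one.

```powershell
# Hypothetical helper: replace text in one file and keep the encoding that StreamReader detects.
function Update-TextPreservingEncoding {
    param (
        [string] $Path,          # use a full path; .NET does not follow the PowerShell location
        [string] $Pattern,       # regular expression, as used by the -replace operator
        [string] $Replacement
    )

    # $true enables BOM detection; reading the content triggers it.
    $reader = New-Object System.IO.StreamReader($Path, $true)
    try {
        $content  = $reader.ReadToEnd()
        $encoding = $reader.CurrentEncoding
    }
    finally {
        $reader.Dispose()
    }

    # Check whether the file really starts with the detected encoding's BOM.
    $preamble = $encoding.GetPreamble()
    $head = New-Object byte[] ($preamble.Length)
    $stream = [IO.File]::OpenRead($Path)
    try {
        $read = $stream.Read($head, 0, $head.Length)
    }
    finally {
        $stream.Dispose()
    }
    $hasBom = ($preamble.Length -gt 0) -and ($read -eq $preamble.Length) -and
              ([BitConverter]::ToString($head) -eq [BitConverter]::ToString($preamble))

    if (-not $hasBom) {
        # No BOM on disk: write UTF-8 without a BOM (assumption: content is ASCII or UTF-8).
        $encoding = New-Object System.Text.UTF8Encoding($false)
    }

    $updated = $content -replace $Pattern, $Replacement
    [IO.File]::WriteAllText($Path, $updated, $encoding)
    return $encoding
}
```

A call such as Update-TextPreservingEncoding -Path $filePath -Pattern 'myOldText' -Replacement 'myNewText' could then be run for each file; the returned encoding object can be logged to confirm what was detected.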
