How do I change my PowerShell script so that it writes out-file in ANSI - Windows-1252 encoding?

Problem description

I have a banking application script that generates a "filtered" output file by removing error records from a daily input bank file (see "How do I create a Windows Server script to remove error records, AND the previous record to each, from a file with the results written to a NEW file"). The "filtered" output file will be sent to the State to update their system. As a side note, the original input files that we receive from the bank show as Unix 1252 (ANSI Latin 1) in my file editor (UltraEdit), and each record ends only with a line feed.

I sent a couple of test output files, generated from both "clean" (no errors) and "dirty" (containing 4 errors) input files, to the State for testing on their end to make sure all was good before implementation. I was a little concerned, though, because the output files were generated in UTF-16 encoding with CRLF line endings, whereas the input and the current unfiltered output are encoded in Windows-1252. All other output files on this system are Windows-1252 encoded.

Sure enough… I got word back that the encoding is incorrect for the State's system. Their comments were: "The file was encoded UCS-2 Little Endian and needed to be converted to ANSI to run on our system. That was unexpected.

After that the file with no detail transactions would run through our EFT rejects program ok.

It seems that it was processed ok, but we had to do some conversion. Can it be sent in ANSI or needs to be done in UCS 2 Little Endian?"

I have tried unsuccessfully adding -Encoding "Windows-1252" and -Encoding windows-1252 to my Out-File statement, with both returning the message:

Out-File : Cannot validate argument on parameter 'Encoding'. The argument "Windows-1252" does not belong to the set "unknown,string,unicode,bigendianunicode,utf8,utf7,utf32,ascii,default,oem" specified by the ValidateSet attribute. Supply an argument that is in the set and then try the command again.
At C:\EZTRIEVE\PwrShell\TEST2_FilterR02.ps1:47 char:57
+ ... OutputStrings | Out-File $OutputFileFiltered -Encoding "Windows-1252"
+                                                             ~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidData: (:) [Out-File], ParameterBindingValidationException
    + FullyQualifiedErrorId : ParameterArgumentValidationError,Microsoft.PowerShell.Commands.OutFileCommand

I've looked high and low for some help with this for days, but nothing is really clear, and the vast majority of what I found involved converting FROM Windows-1252 TO another encoding. Yesterday, I found a comment somewhere on Stack Overflow saying that "ANSI" is the same as Windows-1252, but so far I have not found anything that shows me how to properly append the Windows-1252 encoding option to my Out-File statement so that PowerShell will accept it. I really need to get this project finished so I can tackle the next several that have been added to my queue. Is there possibly a subparameter that I'm missing that needs to be appended to -Encoding?

This is being tested under Dollar Universe (job scheduler) on a new backup server running Windows Server 2016 Standard with PowerShell 5.1. Our production system runs Dollar Universe on Windows Server 2012 R2, also with PowerShell 5.1 (yes, we are looking for a sufficient upgrade window :-)

As of my last attempt, my PowerShell script is:

[cmdletbinding()]
Param
(
    [string] $InputFilePath
)

# Read the text file
$InputFile = Get-Content $InputFilePath

# Initialize output record counter
$Inrecs = 0
$Outrecs = 0

# Get the time
$Time = Get-Date -Format "MM_dd_yy"

# Set up the output file name
$OutputFileFiltered = "C:\EZTRIEVE\CFIS\DATA\TEST_CFI_EFT_RETURN_FILTERED"

# Initialize the variable used to hold the output
$OutputStrings = @()

# Loop through each line in the file
# Check the line ahead for "R02" and add it to the output
# or skip it appropriately
for ($i = 0; $i -lt $InputFile.Length - 1; $i++)
{
    if ($InputFile[$i + 1] -notmatch "R02")
    {
        # The next record does not contain "R02", increment count and add it to the output
        $Outrecs++
        $OutputStrings += $InputFile[$i]
    }
    else
    {
        # The next record does contain "R02", skip it
        $i++
    }
}

# Add the trailer record to the output
$OutputStrings += $InputFile[$InputFile.Length - 1]

# Write the output to a file
# $OutputStrings | Out-File $OutputFileFiltered
$OutputStrings | Out-File $OutputFileFiltered -Encoding windows-1252

# Display record processing stats:

$Filtered = $Outrecs-$i

Write-Host $i  Input records processed

Write-Host $Filtered  Error records filtered out

Write-Host $Outrecs  Output records written

Recommended answer

Note:

  • You later clarified that you need LF (Unix-style) newlines - see the bottom section.

The next section deals with the question as originally asked and presents solutions that result in files with CRLF (Windows-style) newlines (when run on Windows).

If your system's Language for non-Unicode programs setting (a.k.a. the system locale) happens to have Windows-1252 as the active ANSI code page (e.g., on US-English or Western European systems), use -Encoding Default, because Default refers to that code page in Windows PowerShell (but not in PowerShell Core, which defaults to BOM-less UTF-8 and doesn't support the Default encoding identifier).

Verify with: (Get-ItemPropertyValue HKLM:\SYSTEM\CurrentControlSet\Control\Nls\CodePage ACP) -eq '1252'

... | Out-File -Encoding Default $file

Note:

  • If you are certain that your data is actually composed exclusively of ASCII-range characters (characters with code points in the 7-bit range, which excludes accented characters such as ü), -Encoding Default will work even if your system locale uses an ANSI code page other than Windows-1252, given that all (single-byte) ANSI code pages share all ASCII characters in their 7-bit subrange. You could then also use -Encoding ASCII, but note that if there are non-ASCII characters present after all, they will be transliterated to literal ? characters, resulting in loss of information.
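The information loss mentioned in the note can be seen directly with the .NET encoding classes. This is a minimal sketch using a sample string of my own choosing; the variable names are illustrative and not part of the original script.

```powershell
# Build the sample string 'Zürich' without depending on the script file's own encoding.
$text = 'Z' + [char]0xFC + 'rich'

# ASCII cannot represent U+00FC, so the encoder substitutes a literal '?' (0x3F).
$asciiBytes = [Text.Encoding]::ASCII.GetBytes($text)

# Windows-1252 maps U+00FC to the single byte 0xFC.
$cp1252Bytes = [Text.Encoding]::GetEncoding(1252).GetBytes($text)

# Show both byte sequences as hex for comparison.
'ASCII:        ' + (($asciiBytes  | ForEach-Object { '{0:X2}' -f $_ }) -join ' ')
'Windows-1252: ' + (($cp1252Bytes | ForEach-Object { '{0:X2}' -f $_ }) -join ' ')
```

The second byte is 3F in the ASCII output and FC in the Windows-1252 output; all other bytes are identical, which is why ASCII-only data is safe under either encoding.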

The Set-Content cmdlet actually defaults to the Default encoding in Windows PowerShell (but not PowerShell Core, where the consistent default is UTF-8 without BOM).

While Set-Content's stringification behavior differs from that of Out-File - see this answer - it's actually the better choice if the objects to write to the file already are strings.

Otherwise, you have two options:

  • Use the .NET Framework file I/O functionality directly, where you can use any encoding supported by .NET; e.g.:

  $lines = ...  # array of strings (to become lines in a file)
  # CAVEAT: Be sure to specify an *absolute file path* in $file,
  #         because .NET typically has a different working dir.
  [IO.File]::WriteAllLines($file, $lines, [Text.Encoding]::GetEncoding(1252))

  • Use PowerShell Core, which allows you to pass any supported .NET encoding to the
    -Encoding parameter:

      ... | Out-File -Encoding ([Text.Encoding]::GetEncoding(1252)) $file
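To make the first option concrete, here is a minimal sketch that writes a few lines as Windows-1252 and then inspects the resulting bytes. The sample lines and the temp-file name are assumptions for illustration only; in the question's script, $lines would be $OutputStrings and $file would be $OutputFileFiltered.

```powershell
# Hypothetical sample data; the second line ends in U+00E9 (e with acute accent).
$lines = @('HEADER', ('caf' + [char]0xE9), 'TRAILER')

# Use an absolute path, because .NET resolves relative paths against its own working directory.
$file = Join-Path ([IO.Path]::GetTempPath()) 'cp1252-demo.txt'

$cp1252 = [Text.Encoding]::GetEncoding(1252)
[IO.File]::WriteAllLines($file, [string[]] $lines, $cp1252)

# Inspect the raw bytes: the file starts with 'H' (no BOM) and contains 0xE9 as a single byte.
$bytes = [IO.File]::ReadAllBytes($file)
'First byte: 0x{0:X2}' -f $bytes[0]
'Contains 0xE9: ' + ($bytes -contains 0xE9)

# Round-trip check: decoding with the same code page yields the original lines.
$roundTrip = [IO.File]::ReadAllLines($file, $cp1252)
```

Note that WriteAllLines() terminates each line with the platform's newline sequence, so on Windows this still produces CRLF line endings.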
    

Note that in PSv5.1+ you can actually change the encoding used by the > and >> operators, as detailed in this answer.
However, in Windows PowerShell you are again limited to the encodings supported by Out-File's -Encoding parameter.
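A minimal sketch of that technique, assuming it is done through the $PSDefaultParameterValues preference variable (which presets Out-File's -Encoding parameter, and with it the redirection operators). ASCII is used here only because its effect is easy to detect in both PowerShell editions; in Windows PowerShell you would use 'Default' to get the active ANSI code page. The file name and sample string are illustrative.

```powershell
$file = Join-Path ([IO.Path]::GetTempPath()) 'redirect-demo.txt'

# Preset -Encoding for Out-File; in PSv5.1+ this also applies to > and >>.
$PSDefaultParameterValues['Out-File:Encoding'] = 'ascii'

# The sample string contains U+00FC, which ASCII encodes as a literal '?'.
('a' + [char]0xFC + 'c') > $file

# If the preset was honored, the file starts with 61 3F 63 (no BOM, '?' in the middle).
$bytes = [IO.File]::ReadAllBytes($file)
'{0:X2} {1:X2} {2:X2}' -f $bytes[0], $bytes[1], $bytes[2]

# Remove the preset again so later commands are unaffected.
$PSDefaultParameterValues.Remove('Out-File:Encoding')
```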

Creating text files with LF (Unix-style) newlines on Windows:

PowerShell (invariably) and .NET (by default) use the platform-appropriate newline sequence - as reflected in [Environment]::NewLine - when writing strings as lines to a file. In other words: on Windows you'll end up with files with CRLF newlines, and on Unix-like platforms (PowerShell Core) with LF newlines.

Note that the solutions below assume that the data to write to your file is an array of strings that represent the lines to write, as returned by Get-Content, for instance (where the resulting array elements are the input file's lines without their trailing newline sequence).

To explicitly create a file with LF newlines on Windows (PSv5+):

    $lines = ...  # array of strings (to become lines in a file)
    
    ($lines -join "`n") + "`n" | Set-Content -NoNewline $file

"`n" produces an LF character.

Note:

  • In Windows PowerShell this implicitly uses the active ANSI code page's encoding.

  • In PowerShell Core this implicitly creates a UTF-8 file without BOM. If you want to use the active ANSI code page instead, use:

    -Encoding ([Text.Encoding]::GetEncoding([int] (Get-ItemPropertyValue HKLM:\SYSTEM\CurrentControlSet\Control\Nls\CodePage ACP)))
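To check that the Set-Content approach really produces LF-only line endings, you can count the CR and LF bytes in the result. This is a minimal sketch with hypothetical, ASCII-only sample records, so the byte counts are the same whether the file ends up in an ANSI code page or in UTF-8.

```powershell
# Hypothetical ASCII-only sample records standing in for the real output lines.
$lines = @('REC1', 'REC2', 'TRAILER')
$file = Join-Path ([IO.Path]::GetTempPath()) 'lf-demo.txt'

# Join with LF, add a trailing LF, and stop Set-Content from appending its own newline.
($lines -join "`n") + "`n" | Set-Content -NoNewline $file

# Count carriage returns (0x0D) and line feeds (0x0A) in the raw bytes.
$bytes = [IO.File]::ReadAllBytes($file)
'CR bytes: ' + @($bytes | Where-Object { $_ -eq 0x0D }).Count
'LF bytes: ' + @($bytes | Where-Object { $_ -eq 0x0A }).Count
```

For the three sample records this should report no CR bytes and three LF bytes, one per record.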
    

In PSv4- (PowerShell version 4 or lower), you'll have to use the .NET Framework directly:

    $lines = ...  # array of strings (to become lines in a file)
    
    
    # CAVEAT: Be sure to specify an *absolute file path* in $file,
    #         because .NET typically has a different working dir.
    [IO.File]::WriteAllText($file, ($lines -join "`n") + "`n")
    

Note:

  • In both Windows PowerShell and PowerShell Core this creates a UTF-8 file without BOM.

If you want to use the active ANSI code page instead, pass the following as an additional argument to WriteAllText():

    ([Text.Encoding]::GetEncoding([int] (Get-ItemPropertyValue HKLM:\SYSTEM\CurrentControlSet\Control\Nls\CodePage ACP)))
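Putting the two requirements from the question together (Windows-1252 encoding and LF-only line endings), a minimal sketch could look like the following. The function name Write-Cp1252LfFile, its parameters, and the sample data are hypothetical, and the code page is hard-coded to 1252 on the assumption that the target system expects exactly that encoding regardless of the machine's system locale.

```powershell
# Hypothetical helper: writes the given lines as Windows-1252 with LF-only line endings.
function Write-Cp1252LfFile {
    param(
        [Parameter(Mandatory)] [string]   $Path,   # should be an absolute path (.NET working dir differs)
        [Parameter(Mandatory)] [string[]] $Lines   # one array element per output record
    )
    # Join the records with LF and terminate the last record with LF as well.
    $text = ($Lines -join "`n") + "`n"
    [IO.File]::WriteAllText($Path, $text, [Text.Encoding]::GetEncoding(1252))
}

# Usage with sample data; in the question's script this would be
# Write-Cp1252LfFile -Path $OutputFileFiltered -Lines $OutputStrings
$file = Join-Path ([IO.Path]::GetTempPath()) 'cp1252-lf-demo.txt'
Write-Cp1252LfFile -Path $file -Lines @('HEADER', ('caf' + [char]0xE9), 'TRAILER')

# Raw bytes for inspection: single-byte 0xE9, LF terminators, no CR and no BOM.
$bytes = [IO.File]::ReadAllBytes($file)
```

Hard-coding 1252 instead of reading the ACP value from the registry keeps the output stable if the script is later moved to a server with a different system locale.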
    
