Why does PowerShell file concatenation convert UTF-8 to UTF-16?

Question

I am running the following PowerShell script to concatenate a series of output files into a single CSV file. The input files are named whidataXX.htm (where XX is a two-digit sequence number), and the number of files created varies from run to run.

$metadataPath = "\\ServerPath\foo" 

function concatenateMetadata {
    $cFile = $metadataPath + "whiconcat.csv"
    Clear-Content $cFile
    $metadataFiles = gci $metadataPath
    $iterations = $metadataFiles.Count
    for ($i=0;$i -le $iterations-1;$i++) {
        $iFile = "whidata"+$i+".htm"
        $FileExists = (Test-Path $metadataPath$iFile -PathType Leaf)
        if (!($FileExists))
        {
            break
        }
        elseif ($FileExists)
        {
            Write-Host "Adding " $metadataPath$iFile
            Get-Content $metadataPath$iFile | Out-File $cFile -append
            Write-Host "to" $cfile
        }
    }
} 

The whidataXX.htm files are encoded as UTF-8, but my output file is encoded as UTF-16. When I view the file in Notepad it looks correct, but in a hex editor the byte 00 appears between each character, and when I pull the file into a Java program for processing, it prints to the console with extra spaces between c h a r a c t e r s.

First, is this normal for PowerShell, or is there something in the source files that would cause this?

Second, how would I fix this encoding problem in the code noted above?

Recommended answer

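The Out-* cmdlets (like Out-File) format the data before writing it, and in Windows PowerShell their default output encoding is "Unicode", which means UTF-16 little-endian. So yes, this is normal behavior and is not caused by the source files. (PowerShell 6 and later default to UTF-8 without a BOM.)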
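To see why a hex editor shows 00 between the characters, it may help to compare the bytes that the two encodings produce for the same text. This is a minimal sketch; the sample string 'abc' and the variable names are illustrative and not taken from the question.

```powershell
# Illustrative sample text (assumption, not from the original script).
$text = 'abc'

# Encode the same string as UTF-8 and as UTF-16 LE (.NET calls the latter "Unicode").
$utf8Bytes  = [System.Text.Encoding]::UTF8.GetBytes($text)
$utf16Bytes = [System.Text.Encoding]::Unicode.GetBytes($text)

# Render each byte as two hex digits, separated by spaces.
$utf8Hex  = ($utf8Bytes  | ForEach-Object { $_.ToString('X2') }) -join ' '
$utf16Hex = ($utf16Bytes | ForEach-Object { $_.ToString('X2') }) -join ' '

$utf8Hex    # 61 62 63
$utf16Hex   # 61 00 62 00 63 00
```

Each ASCII character takes one byte in UTF-8 and two bytes in UTF-16, and the second byte is 00. A reader that assumes a single-byte encoding shows those zero bytes as gaps between the characters.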

You can add an -Encoding parameter to Out-File:

Get-Content $metadataPath$iFile | Out-File $cFile -Encoding UTF8 -append

Or switch to Add-Content, which doesn't re-format the data:

Get-Content $metadataPath$iFile | Add-Content $cFile 
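As a rough end-to-end sketch of the same idea, the files can also be read and written in one pipeline with the encoding stated explicitly on both sides. Everything here is an assumption for illustration: the demo folder under the temp directory stands in for the real share, the two sample files are made up, and Set-Content is swapped in for the append loop so the output is written once.

```powershell
# Demo folder under the temp directory (hypothetical; stands in for the real share).
$demoDir = Join-Path ([System.IO.Path]::GetTempPath()) 'whi-demo'
New-Item -ItemType Directory -Path $demoDir -Force | Out-Null

# Two small UTF-8 sample inputs, named like the files in the question.
Set-Content -Path (Join-Path $demoDir 'whidata00.htm') -Value 'a,b' -Encoding UTF8
Set-Content -Path (Join-Path $demoDir 'whidata01.htm') -Value 'c,d' -Encoding UTF8

$outFile = Join-Path $demoDir 'whiconcat.csv'

# Read every matching file as UTF-8, in name order, and write the combined
# lines once as UTF-8 instead of appending file by file.
Get-ChildItem -Path $demoDir -Filter 'whidata*.htm' |
    Sort-Object Name |
    Get-Content -Encoding UTF8 |
    Set-Content -Path $outFile -Encoding UTF8

# For this ASCII sample, a UTF-8 output file contains no zero bytes.
$bytes = [System.IO.File]::ReadAllBytes($outFile)
$hasZeroBytes = $bytes -contains 0
$hasZeroBytes   # False
```

Passing -Encoding UTF8 to Get-Content matters in Windows PowerShell, which otherwise reads UTF-8 files without a BOM using the system's ANSI code page. Note also that -Encoding UTF8 writes a BOM in Windows PowerShell but not in PowerShell 6 and later, which may matter to the consuming Java program.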
