VB6/VBScript: change file encoding to ANSI
Question
I am looking for a way to convert a text file from UTF-8 encoding to ANSI encoding.
How can I achieve this in Visual Basic (VB6) and/or VBScript?
Recommended answer
If your files aren't truly enormous (even 40 MB can be painfully slow), you can do this with the following code in VB6, VBA, or VBScript:
Option Explicit

' ADODB.Stream constants, declared by hand because late binding
' (CreateObject) does not expose the ADO type library.
Private Const adReadAll = -1
Private Const adSaveCreateOverWrite = 2
Private Const adTypeBinary = 1
Private Const adTypeText = 2
Private Const adWriteChar = 0

Private Sub UTF8toANSI(ByVal UTF8FName, ByVal ANSIFName)
    Dim strText
    With CreateObject("ADODB.Stream")
        .Open
        .Type = adTypeBinary
        .LoadFromFile UTF8FName
        ' Reinterpret the loaded bytes as UTF-8 text and read it all.
        .Type = adTypeText
        .Charset = "utf-8"
        strText = .ReadText(adReadAll)
        ' Rewind and truncate the stream, then write the text back out.
        .Position = 0
        .SetEOS
        .Charset = "_autodetect" 'Use current ANSI codepage.
        .WriteText strText, adWriteChar
        .SaveToFile ANSIFName, adSaveCreateOverWrite
        .Close
    End With
End Sub

UTF8toANSI "UTF8-wBOM.txt", "ANSI1.txt"
UTF8toANSI "UTF8-noBOM.txt", "ANSI2.txt"
' WScript.ScriptName exists only under Windows Script Host (VBScript).
MsgBox "Complete!", vbOKOnly, WScript.ScriptName
Note that it handles UTF-8 input files both with and without a BOM.
Using strong typing and early binding will improve performance slightly in VB6, and you won't need to declare those Const values. This isn't an option in script, though.
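As a rough illustration of the early-bound variant, here is a minimal VB6-only sketch. It assumes the project has a reference to one of the "Microsoft ActiveX Data Objects" libraries (the exact version on a given machine varies), which supplies the ADODB.Stream type and the ad* constants, so the Const lines are no longer needed. The logic is otherwise the same as the late-bound routine.

```vb
' VB6 only. Assumes a project reference to a
' "Microsoft ActiveX Data Objects x.x Library".
Option Explicit

Private Sub UTF8toANSI(ByVal UTF8FName As String, ByVal ANSIFName As String)
    Dim stm As ADODB.Stream
    Dim strText As String

    Set stm = New ADODB.Stream
    With stm
        .Open
        .Type = adTypeBinary
        .LoadFromFile UTF8FName
        ' Decode the loaded bytes as UTF-8.
        .Type = adTypeText
        .Charset = "utf-8"
        strText = .ReadText(adReadAll)
        ' Rewind, truncate, and write back in the current ANSI codepage.
        .Position = 0
        .SetEOS
        .Charset = "_autodetect"
        .WriteText strText, adWriteChar
        .SaveToFile ANSIFName, adSaveCreateOverWrite
        .Close
    End With
End Sub
```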
For VB6 programs that need to process very large files, you might be better off using VB6 native I/O against Byte arrays and an API call to convert the data in chunks. This adds the extra messiness of finding character boundaries, though, because UTF-8 uses a variable number of bytes per character. You'd need to scan each data block you read to find a safe ending point for the API translation.
I'd look at MultiByteToWideChar() and WideCharToMultiByte() to get started.
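The boundary scan could look something like the following VB6 sketch. The function name SafeUTF8Length and its parameters are hypothetical, not part of any library. It assumes Buf is a zero-based Byte array holding Count bytes of a chunk that begins on a character boundary, and it returns how many leading bytes can be passed to the conversion API without splitting a character. Malformed input is not repaired; it is simply left for the API to handle.

```vb
' Hypothetical helper: returns the number of bytes at the start of
' Buf(0 To Count - 1) that end on a complete UTF-8 sequence.
Private Function SafeUTF8Length(ByRef Buf() As Byte, ByVal Count As Long) As Long
    Dim i As Long
    Dim Need As Long

    SafeUTF8Length = Count          ' Default: the whole block is safe.

    ' Walk back from the end over continuation bytes (10xxxxxx).
    ' An incomplete sequence has at most 3 bytes present, so only the
    ' last 3 bytes need to be examined.
    i = Count - 1
    Do While i >= 0 And Count - i <= 3
        If (Buf(i) And &HC0) <> &H80 Then Exit Do   ' Found a lead or ASCII byte.
        i = i - 1
    Loop

    ' No lead byte within the last 3 bytes: nothing to trim.
    If i < 0 Or Count - i > 3 Then Exit Function

    ' Work out how many bytes the sequence starting at Buf(i) requires.
    Select Case Buf(i)
        Case Is < &H80:      Need = 1   ' ASCII
        Case &HC0 To &HDF:   Need = 2
        Case &HE0 To &HEF:   Need = 3
        Case &HF0 To &HF7:   Need = 4
        Case Else:           Need = 1   ' Invalid lead byte; leave it to the API.
    End Select

    ' If the final sequence is cut off, stop just before its lead byte.
    If i + Need > Count Then SafeUTF8Length = i
End Function
```

The caller would convert only the first SafeUTF8Length bytes of each chunk and carry the remaining bytes over to the front of the next chunk before reading more data.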
Note that UTF-8 files often "arrive" with LF line delimiters instead of CRLF.
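If the output needs Windows-style line endings, one possible approach is to normalize the text before writing it. The helper below is a hypothetical sketch that works in both VB6 and VBScript. It assumes the input uses LF or CRLF only and does not handle lone CR characters.

```vb
' Hypothetical helper: convert LF line endings to CRLF.
Function NormalizeToCRLF(ByVal strText)
    ' Collapse existing CRLF pairs to LF first so they are not doubled,
    ' then expand every LF to CRLF.
    strText = Replace(strText, vbCrLf, vbLf)
    NormalizeToCRLF = Replace(strText, vbLf, vbCrLf)
End Function
```

In the conversion routine this could be applied to strText between the ReadText and WriteText calls.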