VB6/VBScript: change file encoding to ANSI
Question
I am looking for a way to convert a text file with UTF-8 encoding to ANSI encoding.
How can I achieve this in Visual Basic (VB6) and/or VBScript?
Recommended answer
If your files aren't truly enormous (even a mere 40 MB can be painfully slow), you can do this using the following code in VB6, VBA, or VBScript:
Option Explicit

Private Const adReadAll = -1
Private Const adSaveCreateOverWrite = 2
Private Const adTypeBinary = 1
Private Const adTypeText = 2
Private Const adWriteChar = 0

Private Sub UTF8toANSI(ByVal UTF8FName, ByVal ANSIFName)
    Dim strText

    With CreateObject("ADODB.Stream")
        .Open
        'Load the raw bytes, then reinterpret them as UTF-8 text.
        .Type = adTypeBinary
        .LoadFromFile UTF8FName
        .Type = adTypeText
        .Charset = "utf-8"
        strText = .ReadText(adReadAll)
        'Rewind and truncate the stream before writing the converted text.
        .Position = 0
        .SetEOS
        .Charset = "_autodetect" 'Use current ANSI codepage.
        .WriteText strText, adWriteChar
        .SaveToFile ANSIFName, adSaveCreateOverWrite
        .Close
    End With
End Sub

'Example driver (WScript.ScriptName is only available under Windows Script Host).
UTF8toANSI "UTF8-wBOM.txt", "ANSI1.txt"
UTF8toANSI "UTF8-noBOM.txt", "ANSI2.txt"
MsgBox "Complete!", vbOKOnly, WScript.ScriptName
Note that it will handle UTF-8 input files either with or without a BOM.
Using strong typing and early binding will improve performance by a hair in VB6, and you won't need to declare those Const values. This isn't an option in script, though.
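As a rough sketch of what the early-bound VB6 variant might look like: it assumes the project has a reference to the Microsoft ActiveX Data Objects library (2.5 or later, where the Stream object is available), so the `ad...` constants come from the type library rather than from Const declarations. The variable name `stm` is just illustrative.

```vb
'VB6 only. Assumes Project > References includes
'"Microsoft ActiveX Data Objects 2.5 Library" (or later).
Private Sub UTF8toANSI(ByVal UTF8FName As String, ByVal ANSIFName As String)
    Dim stm As ADODB.Stream
    Dim strText As String

    Set stm = New ADODB.Stream
    With stm
        .Open
        .Type = adTypeBinary
        .LoadFromFile UTF8FName
        .Type = adTypeText
        .Charset = "utf-8"
        strText = .ReadText(adReadAll)
        .Position = 0
        .SetEOS
        .Charset = "_autodetect" 'Use current ANSI codepage.
        .WriteText strText, adWriteChar
        .SaveToFile ANSIFName, adSaveCreateOverWrite
        .Close
    End With
End Sub
```

The logic is unchanged from the late-bound version; only the declarations and the object creation differ.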
For VB6 programs that need to process very large files, you might be better off using VB6 native I/O against Byte arrays and an API call to convert the data in chunks. This adds the extra messiness of finding the character boundaries, though (UTF-8 uses a variable number of bytes per character). You'd need to scan each data block you read to find a safe ending point for an API translation.
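One possible way to find that safe ending point is sketched below. `SafeUtf8Length` is a hypothetical helper name, not part of any library; it assumes the buffer holds `Count` valid bytes starting at its lower bound, and it only checks the last sequence in the block rather than validating the whole buffer.

```vb
'Returns how many of the first Count bytes in Buf can be converted
'without cutting a multi-byte UTF-8 sequence in half.
Private Function SafeUtf8Length(Buf() As Byte, ByVal Count As Long) As Long
    Dim Base As Long
    Dim i As Long
    Dim Need As Long

    Base = LBound(Buf)
    SafeUtf8Length = Count
    If Count = 0 Then Exit Function

    'Walk back over trailing continuation bytes (10xxxxxx), at most 3.
    i = Count - 1
    Do While i > 0 And (Count - 1 - i) < 3
        If (Buf(Base + i) And &HC0) <> &H80 Then Exit Do
        i = i - 1
    Loop

    'Buf(Base + i) should now be a lead byte; it tells the sequence length.
    Select Case Buf(Base + i)
        Case Is < &H80:    Need = 1
        Case &HC0 To &HDF: Need = 2
        Case &HE0 To &HEF: Need = 3
        Case &HF0 To &HF7: Need = 4
        Case Else:         Need = 1 'Malformed input; leave it to the API.
    End Select

    'If the last sequence is incomplete, stop just before its lead byte.
    If Count - i < Need Then SafeUtf8Length = i
End Function
```

The bytes from the returned length up to `Count` would then be carried over to the start of the next block before reading more data.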
I'd look at MultiByteToWideChar() and WideCharToMultiByte() to get started.
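A minimal sketch of how those two calls might be combined in VB6 to convert one block follows. The function name `Utf8BlockToAnsi` is hypothetical, the Declare statements use pointer-style (`ByVal ... As Long`) parameters for 32-bit VB6, and the block is assumed to end on a complete UTF-8 character. Characters that have no equivalent in the ANSI code page are replaced with the system default character.

```vb
Option Explicit

Private Const CP_ACP As Long = 0        'Current ANSI code page.
Private Const CP_UTF8 As Long = 65001

Private Declare Function MultiByteToWideChar Lib "kernel32" ( _
    ByVal CodePage As Long, ByVal dwFlags As Long, _
    ByVal lpMultiByteStr As Long, ByVal cbMultiByte As Long, _
    ByVal lpWideCharStr As Long, ByVal cchWideChar As Long) As Long

Private Declare Function WideCharToMultiByte Lib "kernel32" ( _
    ByVal CodePage As Long, ByVal dwFlags As Long, _
    ByVal lpWideCharStr As Long, ByVal cchWideChar As Long, _
    ByVal lpMultiByteStr As Long, ByVal cbMultiByte As Long, _
    ByVal lpDefaultChar As Long, ByVal lpUsedDefaultChar As Long) As Long

'Converts the first ByteCount bytes of Utf8 to ANSI bytes.
'Returns an unallocated array if there is nothing to convert.
Private Function Utf8BlockToAnsi(Utf8() As Byte, ByVal ByteCount As Long) As Byte()
    Dim Wide As String
    Dim Chars As Long
    Dim Ansi() As Byte
    Dim AnsiLen As Long
    Dim pSrc As Long

    If ByteCount <= 0 Then Exit Function
    pSrc = VarPtr(Utf8(LBound(Utf8)))

    'First call sizes the UTF-16 buffer, second call fills it.
    Chars = MultiByteToWideChar(CP_UTF8, 0, pSrc, ByteCount, 0, 0)
    If Chars = 0 Then Exit Function
    Wide = Space$(Chars)
    MultiByteToWideChar CP_UTF8, 0, pSrc, ByteCount, StrPtr(Wide), Chars

    'Same two-step pattern for UTF-16 to ANSI.
    AnsiLen = WideCharToMultiByte(CP_ACP, 0, StrPtr(Wide), Chars, 0, 0, 0, 0)
    If AnsiLen = 0 Then Exit Function
    ReDim Ansi(0 To AnsiLen - 1)
    WideCharToMultiByte CP_ACP, 0, StrPtr(Wide), Chars, VarPtr(Ansi(0)), AnsiLen, 0, 0

    Utf8BlockToAnsi = Ansi
End Function
```

The returned Byte array could then be written with a binary `Put` to the output file, one block at a time.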
Note that UTF-8 text often "arrives" with LF line delimiters instead of CRLF.
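If CRLF line endings are wanted in the ANSI output, one simple approach is to normalize the text after reading it and before writing it. `NormalizeToCrLf` below is a hypothetical helper, written untyped so that it should work in both VB6 and VBScript; it does not attempt to handle lone CR characters.

```vb
'Converts LF-only line endings to CRLF without doubling existing CRLFs.
Function NormalizeToCrLf(ByVal strText)
    'Collapse any existing CRLF to LF first, then expand every LF to CRLF.
    strText = Replace(strText, vbCrLf, vbLf)
    NormalizeToCrLf = Replace(strText, vbLf, vbCrLf)
End Function
```

In the conversion routine this would be applied to `strText` between the `ReadText` and `WriteText` calls, for example `strText = NormalizeToCrLf(strText)`.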