VB6/VBScript: change file encoding to ANSI

Problem description

I am looking for a way to convert a text file from UTF-8 encoding to ANSI encoding.

How can I achieve this in Visual Basic (VB6) and/or VBScript?

Recommended answer

If your files aren't truly enormous (even a mere 40 MB can be painfully slow), you can do this with the following code in VB6, VBA, or VBScript:

Option Explicit

Private Const adReadAll = -1
Private Const adSaveCreateOverWrite = 2
Private Const adTypeBinary = 1
Private Const adTypeText = 2
Private Const adWriteChar = 0

Private Sub UTF8toANSI(ByVal UTF8FName, ByVal ANSIFName)
    Dim strText

    With CreateObject("ADODB.Stream")
        .Open
        'Load the raw bytes, then reinterpret them as UTF-8 text.
        .Type = adTypeBinary
        .LoadFromFile UTF8FName
        .Type = adTypeText
        .Charset = "utf-8"
        strText = .ReadText(adReadAll)
        'Rewind and truncate the stream so it can be reused for output.
        .Position = 0
        .SetEOS
        .Charset = "_autodetect" 'Use current ANSI codepage.
        .WriteText strText, adWriteChar
        .SaveToFile ANSIFName, adSaveCreateOverWrite
        .Close
    End With
End Sub

UTF8toANSI "UTF8-wBOM.txt", "ANSI1.txt"
UTF8toANSI "UTF8-noBOM.txt", "ANSI2.txt"
MsgBox "Complete!", vbOKOnly, WScript.ScriptName

Note that this handles UTF-8 input files both with and without a BOM.

Using strong typing and early binding will improve performance slightly in VB6, and you won't need to declare those Const values. Neither is an option in script, though.
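
As a rough sketch of what the early-bound VB6 variant might look like: this assumes the project has a reference to the "Microsoft ActiveX Data Objects" library (2.5 or later), which supplies the ADODB.Stream class and the ad* enumeration values, so the Const declarations are dropped. The variable name stm is only illustrative.

```vb
Option Explicit

' Assumes a project reference to "Microsoft ActiveX Data Objects 2.5 Library" or later.
Private Sub UTF8toANSI(ByVal UTF8FName As String, ByVal ANSIFName As String)
    Dim stm As ADODB.Stream
    Dim strText As String

    Set stm = New ADODB.Stream
    With stm
        .Open
        .Type = adTypeBinary
        .LoadFromFile UTF8FName
        .Type = adTypeText
        .Charset = "utf-8"
        strText = .ReadText(adReadAll)
        .Position = 0
        .SetEOS
        .Charset = "_autodetect" 'Use current ANSI codepage.
        .WriteText strText, adWriteChar
        .SaveToFile ANSIFName, adSaveCreateOverWrite
        .Close
    End With
End Sub
```

The logic is unchanged from the script version; only the declarations differ.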

VB6 programs that need to process very large files may be better off using VB6 native I/O on Byte arrays and calling an API to convert the data in chunks. This adds the extra messiness of finding character boundaries, though, because UTF-8 uses a variable number of bytes per character. You would need to scan each block you read to find a safe ending point for the API translation.
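
The boundary scan described above could be sketched roughly as follows. The function name SafeUTF8Length is hypothetical, and the sketch assumes a zero-based Byte array whose first Count bytes hold the data just read. It only checks whether the last character in the block is cut off; it does not validate the UTF-8 as a whole.

```vb
' Returns how many of the first Count bytes of Buf can be converted
' without splitting a multi-byte UTF-8 character at the end of the block.
' Assumes Buf is zero-based. The name and signature are illustrative only.
Private Function SafeUTF8Length(ByRef Buf() As Byte, ByVal Count As Long) As Long
    Dim i As Long
    Dim need As Long

    If Count <= 0 Then
        SafeUTF8Length = 0
        Exit Function
    End If
    SafeUTF8Length = Count

    ' Step back over at most three continuation bytes (10xxxxxx)
    ' to reach the lead byte of the last character in the block.
    i = Count - 1
    Do While i > 0 And i > Count - 4
        If (Buf(i) And &HC0) <> &H80 Then Exit Do
        i = i - 1
    Loop

    ' The lead byte tells us how long the character should be.
    Select Case Buf(i)
        Case Is < &H80: need = 1        ' ASCII
        Case &HC0 To &HDF: need = 2
        Case &HE0 To &HEF: need = 3
        Case &HF0 To &HF7: need = 4
        Case Else: need = 1             ' Stray or invalid byte: leave it to the converter
    End Select

    ' If the last character is incomplete, stop just before its lead byte.
    If i + need > Count Then SafeUTF8Length = i
End Function
```

The bytes from the returned length up to Count would then be carried over and placed at the front of the next block before reading more data.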

I'd look at MultiByteToWideChar() and WideCharToMultiByte() to get started.
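
A minimal sketch of how those two API calls might be declared and combined in VB6 follows. The function name UTF8ChunkToANSI is hypothetical, the Declare statements pass every pointer as a ByVal Long, and the code assumes a zero-based Byte array whose first Count bytes end on a character boundary. Characters with no equivalent in the ANSI code page are replaced by the system default character.

```vb
Option Explicit

Private Const CP_ACP As Long = 0        ' Current ANSI code page
Private Const CP_UTF8 As Long = 65001

Private Declare Function MultiByteToWideChar Lib "kernel32" ( _
    ByVal CodePage As Long, ByVal dwFlags As Long, _
    ByVal lpMultiByteStr As Long, ByVal cbMultiByte As Long, _
    ByVal lpWideCharStr As Long, ByVal cchWideChar As Long) As Long

Private Declare Function WideCharToMultiByte Lib "kernel32" ( _
    ByVal CodePage As Long, ByVal dwFlags As Long, _
    ByVal lpWideCharStr As Long, ByVal cchWideChar As Long, _
    ByVal lpMultiByteStr As Long, ByVal cbMultiByte As Long, _
    ByVal lpDefaultChar As Long, ByVal lpUsedDefaultChar As Long) As Long

' Converts the first Count bytes of UTF8() (zero-based) to ANSI bytes.
' Returns an unallocated array if there is nothing to convert or a call fails.
Private Function UTF8ChunkToANSI(ByRef UTF8() As Byte, ByVal Count As Long) As Byte()
    Dim strWide As String
    Dim cch As Long
    Dim cb As Long
    Dim ansi() As Byte

    If Count <= 0 Then Exit Function

    ' First call: ask how many UTF-16 characters are needed.
    cch = MultiByteToWideChar(CP_UTF8, 0, VarPtr(UTF8(0)), Count, 0, 0)
    If cch = 0 Then Exit Function
    strWide = Space$(cch)
    MultiByteToWideChar CP_UTF8, 0, VarPtr(UTF8(0)), Count, StrPtr(strWide), cch

    ' Same two-step pattern for UTF-16 to ANSI.
    cb = WideCharToMultiByte(CP_ACP, 0, StrPtr(strWide), cch, 0, 0, 0, 0)
    If cb = 0 Then Exit Function
    ReDim ansi(0 To cb - 1)
    WideCharToMultiByte CP_ACP, 0, StrPtr(strWide), cch, VarPtr(ansi(0)), cb, 0, 0

    UTF8ChunkToANSI = ansi
End Function
```

The returned Byte array could then be written to the output file with a binary Put statement.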

Note that UTF-8 text often "arrives" with LF line delimiters instead of CRLF.
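
If the ANSI output is expected to use CRLF, one possible way to normalize the text before writing it is sketched below. NormalizeToCRLF is a hypothetical helper, left untyped so that it works in both VBScript and VB6. Whether this conversion is wanted at all depends on what will consume the file.

```vb
' Converts LF-only line endings to CRLF without doubling existing CRLF pairs.
' Lone CR characters are left untouched.
Function NormalizeToCRLF(ByVal strText)
    ' Collapse CRLF to LF first, then expand every LF to CRLF.
    strText = Replace(strText, vbCrLf, vbLf)
    NormalizeToCRLF = Replace(strText, vbLf, vbCrLf)
End Function
```

In the UTF8toANSI routine above, this could be applied to strText between the ReadText and WriteText calls.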
