Extract text from a text file with Chinese characters using VBA
I have a batch of about 100,000 text files whose contents I would like to extract as strings using VBA. In the past I have done it this way without any problem:
Sub Main()
    Dim PathAndName As String
    Dim TextFile As Integer
    Dim TextString() As String
    Dim i As Long
    ReDim TextString(100000)
    For i = 1 To 100000
        PathAndName = "C:\File_" & i & ".ext"
        TextFile = 1
        Open PathAndName For Input As TextFile
        ' Read the whole file in one call, using the file length as the character count
        TextString(i) = Input(LOF(TextFile), TextFile)
        Close TextFile
    Next i
End Sub
This time, the script fails with error 62, "Input past end of file". The only difference I can spot is that this time the text files contain a few Chinese characters, which I am not actually interested in. That is why I believe they are the source of the problem. The Chinese characters appear on the first line of each file.
Any help is appreciated. Thanks!
I suspect your text files are now in a multibyte encoding, in which one character is encoded in two or three bytes. So LOF(TextFile) returns the byte count, not the character count. But Input(LOF(TextFile), TextFile) needs the character count, since it has to create a String. Asking for more characters than the file contains runs past the end of the file, hence error 62.
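One way to keep using LOF is to treat it as what it is, a byte count. The following is only a sketch: ReadFileViaBytes is a hypothetical helper name, and the StrConv call assumes the files are encoded in the system's ANSI/DBCS code page. If they are UTF-8 or some other encoding, this decoding step will not give correct text.

```vba
' Hypothetical helper: reads a whole file as raw bytes, then decodes it.
Function ReadFileViaBytes(ByVal FilePath As String) As String
    Dim f As Integer
    Dim nBytes As Long
    Dim buf() As Byte

    f = FreeFile
    Open FilePath For Binary Access Read As #f
    nBytes = LOF(f)                 ' LOF is a byte count
    If nBytes > 0 Then
        ReDim buf(0 To nBytes - 1)
        Get #f, , buf               ' fills the array with exactly nBytes bytes
        ' Assumption: the bytes are in the system's ANSI/DBCS code page
        ReadFileViaBytes = StrConv(buf, vbUnicode)
    End If
    Close #f
End Function
```

Under that assumption, comparing FileLen(path) with Len(ReadFileViaBytes(path)) for one of the affected files should show more bytes than characters, which is the mismatch described above.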
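You could read each file line by line until EOF instead:
Sub Main()
    Dim PathAndName As String
    Dim TextFile As Integer
    Dim TextString() As String
    Dim sLine As String
    Dim sTextString As String
    Dim i As Long
    ReDim TextString(100000)
    For i = 1 To 100000
        PathAndName = "C:\File_" & i & ".ext"
        TextFile = 1
        Open PathAndName For Input As TextFile
        sTextString = ""
        ' Read until EOF instead of relying on the byte count from LOF
        Do While Not EOF(TextFile)
            ' Line Input keeps commas and quotes that Input # would treat as delimiters
            Line Input #TextFile, sLine
            ' Line Input strips the line break, so put one back
            sTextString = sTextString & sLine & vbCrLf
        Loop
        TextString(i) = sTextString
        Close #TextFile
    Next i
End Sub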
But the better approach would be to use ADODB.Stream instead of the dinosaur VB file access methods. That is a totally different approach, though, so you should read up on ADODB.Stream yourself first.
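As a starting point, a minimal sketch of the ADODB.Stream approach might look like the following. ReadTextFile is a hypothetical helper name, the stream is created late-bound so no project reference is needed, and the default charset "utf-8" is an assumption: the question does not say which encoding the files use, so pass whatever actually applies (for example "gb2312" or "big5").

```vba
' Hypothetical helper: reads a whole text file with an explicit charset.
Function ReadTextFile(ByVal FilePath As String, _
                      Optional ByVal CharsetName As String = "utf-8") As String
    Dim stm As Object
    Set stm = CreateObject("ADODB.Stream")   ' late binding

    stm.Type = 2                  ' 2 = adTypeText
    stm.Charset = CharsetName     ' assumption: caller knows the file's encoding
    stm.Open
    stm.LoadFromFile FilePath
    ReadTextFile = stm.ReadText   ' default argument reads the whole stream
    stm.Close
End Function
```

Inside the loop from the question this would replace the Open/Input/Close sequence with a single call such as TextString(i) = ReadTextFile(PathAndName). Because the stream decodes the bytes itself, no byte count or character count has to be supplied.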