String length when converting from a character array
I'm having serious problems with string-handling. As my problems are rather hard to describe, I will start with some demo code reproducing them:
Dim s1 As String = "hi"
Dim c(30) As Char
c(0) = "h"
c(1) = "i"
Dim s2 As String = CStr(c)
s2 = s2.Trim()
If Not s1 = s2 Then
    MsgBox(s1 + " != " + s2 + Environment.NewLine + _
           "Anything here won't be printed anyway..." + Environment.NewLine + _
           "s1.length: " + s1.Length.ToString + Environment.NewLine + _
           "s2.length: " + s2.Length.ToString + Environment.NewLine)
End If
The result messagebox looks like this:
The reason that this comparison fails is that s2 has the length 31 (from the original array-size) while s1 has the length 2.
I stumble over this kind of problem quite often when reading string-information out of byte-arrays, for example when handling ID3Tags from MP3s or other encoded (ASCII, UTF8, ...) information with pre-specified length.
Is there any fast and clean way to prevent this problem?
What is the easiest way to "trim" s2 to the string shown by the debugger?
I changed the variable names for clarity:
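Dim myChars(30) As Char
myChars(0) = "h"c    ' the c suffix makes a Char literal; under Option Strict a String
myChars(1) = "i"c    ' cannot be implicitly converted to Char (narrowing conversion)
Dim myStrA As New String(myChars)       ' .NET constructor
Dim myStrB As String = CStr(myChars)    ' VB conversion function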
The short answer is this:
Under the hood, strings are character arrays. The last 2 lines both create a string, one using .NET code, the other a VB function. The thing is that, although the array has 31 elements, only 2 were initialized:
The rest are null/Nothing, which for a Char means Chr(0) or NUL. A .NET String is not itself NUL-terminated, but the native routines behind MessageBox, the debugger and similar display surfaces treat NUL as the end of the text, so only the characters up to the first NUL are shown. Text appended to the string will not display either.
Concepts
Since the strings above are created directly from a char array, the length is that of the original array. NUL is a valid Char, so the trailing NULs get added to the string:
Console.WriteLine(myStrA.Length) ' == 31
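If the number of meaningful characters is already known, one way to avoid the extra NULs is the String constructor overload that takes a start index and a length. The following is only a minimal sketch; the module name LengthDemo and the variable names are made up for illustration:

```vb
Option Strict On
Imports System

Module LengthDemo
    Sub Main()
        Dim myChars(30) As Char              ' 31 elements, all initially Chr(0)
        myChars(0) = "h"c
        myChars(1) = "i"c

        ' Whole array: the 29 trailing NULs become part of the string.
        Dim whole As New String(myChars)
        Console.WriteLine(whole.Length)      ' 31
        Console.WriteLine(whole = "hi")      ' False

        ' Only the first two elements: New String(Char(), startIndex, length).
        Dim firstTwo As New String(myChars, 0, 2)
        Console.WriteLine(firstTwo.Length)   ' 2
        Console.WriteLine(firstTwo = "hi")   ' True
    End Sub
End Module
```

This only helps when the real length is known up front; otherwise the NULs have to be removed after the string is built.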
So, why doesn't Trim remove the NUL characters? MSDN (and Intellisense) tells us:
[Trim] Removes all leading and trailing white-space characters from the current String object.
The trailing null/Chr(0) characters are not white-space like Tab, Lf, Cr or Space; they are control characters.
However, String.Trim has an overload which allows you to specify the characters to remove:
myStrA = myStrA.Trim(Convert.ToChar(0))
' using VB namespace constant
myStrA = myStrA.Trim(Microsoft.VisualBasic.ControlChars.NullChar)
You can specify multiple chars:
' nuls and spaces:
myStrA = myStrA.Trim(Convert.ToChar(0), " "c)
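For the fixed-length byte fields mentioned in the question (ID3 tags and the like), the same idea can be applied right after decoding. The sketch below assumes a zero-padded 30-byte ASCII field; DecodeField is a hypothetical helper, not part of any library. It cuts the text at the first NUL, which roughly matches what the debugger displays and also discards any garbage after the terminator:

```vb
Option Strict On
Imports System
Imports System.Text
Imports Microsoft.VisualBasic

Module FixedFieldDemo
    ' Hypothetical helper: decode a fixed-length field and keep only
    ' the text in front of the first NUL character, if there is one.
    Function DecodeField(data As Byte(), enc As Encoding) As String
        Dim raw As String = enc.GetString(data)
        Dim nulAt As Integer = raw.IndexOf(ControlChars.NullChar)
        If nulAt >= 0 Then
            Return raw.Substring(0, nulAt)
        End If
        Return raw
    End Function

    Sub Main()
        Dim field(29) As Byte                ' assumed 30-byte field, zero-filled
        field(0) = CByte(AscW("h"c))
        field(1) = CByte(AscW("i"c))

        Dim title As String = DecodeField(field, Encoding.ASCII)
        Console.WriteLine(title.Length)      ' 2
        Console.WriteLine(title = "hi")      ' True
    End Sub
End Module
```

Trim(ControlChars.NullChar) is shorter when the padding is known to be only leading or trailing NULs; cutting at the first NUL is the more defensive choice when the bytes after the terminator are not guaranteed to be zero.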
Strings can be indexed / iterated as a char array:
' Valid indexes run from 0 to Length - 1; indexing at Length would throw.
For n As Int32 = 0 To myStrA.Length - 1
    Console.WriteLine("{0} is '{1}'", n, myStrA(n)) ' or myStrA.Chars(n)
Next
0 is 'h'
1 is 'i'
2 is '
(The output window will not even print the trailing CRLF.) However, you cannot assign to an element of the string to change the string data:
myStrA(2) = "!"c
This will not compile because the characters of a String are read-only.
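If a character really needs to be replaced, the usual workaround is to edit a copy and build a new string from it. A minimal sketch of two common options follows; the module name EditDemo and the sample value "hi?" are illustrative only:

```vb
Option Strict On
Imports System
Imports System.Text

Module EditDemo
    Sub Main()
        Dim s As String = "hi?"

        ' Option 1: copy to a Char array, edit the copy, build a new string.
        Dim chars As Char() = s.ToCharArray()
        chars(2) = "!"c
        Dim edited As New String(chars)
        Console.WriteLine(edited)            ' hi!

        ' Option 2: StringBuilder exposes a writable Chars indexer.
        Dim sb As New StringBuilder(s)
        sb(2) = "!"c
        Console.WriteLine(sb.ToString())     ' hi!
    End Sub
End Module
```

In both cases the original string s is left unchanged.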