Opposite behavior of Marshal.SizeOf and sizeof operator for boolean and char data types in C#

Question

I was comparing the Marshal.SizeOf API with the sizeof operator in C#. Their outputs for the char and bool data types are a little surprising. Here are the results:

For bool:

Marshal.SizeOf = 4

sizeof = 1

For char:

Marshal.SizeOf = 1

sizeof = 2
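
A minimal sketch of the comparison that should reproduce the numbers above. The class name `SizeComparison` is only illustrative; `sizeof(bool)` and `sizeof(char)` are compile-time constants in C#, so no unsafe block is needed for these two types.

```csharp
using System;
using System.Runtime.InteropServices;

public static class SizeComparison
{
    public static void Main()
    {
        // Marshal.SizeOf reports the size of the type after marshaling to native code.
        // sizeof reports the size the runtime uses for the managed value.
        Console.WriteLine($"bool: Marshal.SizeOf = {Marshal.SizeOf(typeof(bool))}, sizeof = {sizeof(bool)}");
        Console.WriteLine($"char: Marshal.SizeOf = {Marshal.SizeOf(typeof(char))}, sizeof = {sizeof(char)}");
    }
}
```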

On this link from MSDN I got the following text:

For all other types, including structs, the sizeof operator can be used only in unsafe code blocks. Although you can use the Marshal.SizeOf method, the value returned by this method is not always the same as the value returned by sizeof. Marshal.SizeOf returns the size after the type has been marshaled, whereas sizeof returns the size as it has been allocated by the common language runtime, including any padding.

I do not know much about the technicalities of marshaling, but I assume it has something to do with run-time heuristics when things change. Going by that logic, for bool the size changes from 1 to 4. But for char it goes the other way (from 2 to 1), which threw me. I thought the size for char would also increase, the way it did for bool. Can someone help me understand these conflicting behaviors?

Recommended answer

Sorry, you really do have to consider the technicalities to make sense of these choices. The target language for pinvoke is the C language, a very old language by modern standards, with a lot of history and used on a lot of different machine architectures. It makes very few assumptions about the size of a type; even the number of bits in a byte is not fixed. That made the language very easy to port to the kinds of machines that were common back when C was invented, and to the unusual architectures used in supercomputers and digital signal processors.

C did not originally have a bool type. Logical expressions instead use int, where a value of 0 represents false and any other value represents true. That convention was carried forward into the winapi, which uses a BOOL type that is an alias for int. So 4 was the logical choice. It is not a universal choice, however, and you have to watch out: many C++ implementations use a single byte for bool, and COM Automation chose two bytes.
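
The three conventions can be selected per field with `[MarshalAs]`. The sketch below is illustrative only: the struct names `DefaultBool`, `OneByteBool` and `TwoByteBool` are made up for this example, and the sizes in the comments are what I would expect from the standard marshaler rather than something to rely on without checking.

```csharp
using System;
using System.Runtime.InteropServices;

// No attribute: bool is marshaled as a 4-byte Win32 BOOL.
[StructLayout(LayoutKind.Sequential)]
public struct DefaultBool { public bool Value; }

// 1-byte bool, matching the bool of many C++ implementations.
[StructLayout(LayoutKind.Sequential)]
public struct OneByteBool { [MarshalAs(UnmanagedType.U1)] public bool Value; }

// 2-byte VARIANT_BOOL, as used by COM Automation.
[StructLayout(LayoutKind.Sequential)]
public struct TwoByteBool { [MarshalAs(UnmanagedType.VariantBool)] public bool Value; }

public static class BoolMarshalDemo
{
    public static void Main()
    {
        Console.WriteLine(Marshal.SizeOf(typeof(DefaultBool))); // expected: 4
        Console.WriteLine(Marshal.SizeOf(typeof(OneByteBool))); // expected: 1
        Console.WriteLine(Marshal.SizeOf(typeof(TwoByteBool))); // expected: 2
    }
}
```

Which one is correct depends entirely on what the native declaration on the other side uses.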

C does have a char type; the only guarantee is that it has at least 8 bits. Whether it is signed or unsigned is implementation-defined; most implementations today use signed. Support for an 8-bit byte is universal today on the kind of architectures that can execute managed code, so char is always 8 bits in practice. So 1 was the logical choice.

That doesn't make you happy, and nobody is happy about it: you can't support text written in an arbitrary language with an 8-bit character type. Unicode came about to solve the disaster of the many possible 8-bit encodings that were in use, but it did not have much of an effect on the C and C++ languages. Their committees did add wchar_t (wide character) to the standard but, in keeping with old practices, they did not nail down its size. That made it useless, forcing C++ to later add char16_t and char32_t. It is however always 16 bits in compilers that target Windows, since that is the operating system's choice for characters (aka WCHAR). It is not 16 bits on the various Unix flavors, which favor utf8.

That works well in C# too; you are not stuck with 1-byte characters. Every single type in the .NET framework has an implicit [StructLayout] attribute with a CharSet property. The default is CharSet.Ansi, matching the C language default. You can however easily apply your own and pick CharSet.Unicode. You now get two bytes per character, using the utf16 encoding, and a string is copied as-is since .NET also uses utf16. Making sure that the native code expects strings in that encoding is however up to you.
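
A minimal sketch of the effect of CharSet on the marshaled size of a char field. The struct names `AnsiChar` and `WideChar` are hypothetical and exist only for this example.

```csharp
using System;
using System.Runtime.InteropServices;

// CharSet.Ansi is the default; it is spelled out here for contrast.
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Ansi)]
public struct AnsiChar { public char Value; }

// CharSet.Unicode marshals each char as a 2-byte utf16 code unit.
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
public struct WideChar { public char Value; }

public static class CharSetDemo
{
    public static void Main()
    {
        Console.WriteLine(Marshal.SizeOf(typeof(AnsiChar))); // expected: 1
        Console.WriteLine(Marshal.SizeOf(typeof(WideChar))); // expected: 2
    }
}
```

The same CharSet choice can also be made on a [DllImport] declaration, which governs how string and char arguments of that function are marshaled.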
