Can I turn unsigned char into char and vice versa?


Question

I want to use a function that expects data like this:

void process(char *data_in, int data_len);

So it's really just processing some bytes.

But I'm more comfortable working with "unsigned char" when it comes to raw bytes (it somehow "feels" more right to deal with positive 0 to 255 values only), so my question is:

Can I always safely pass an unsigned char * into this function?

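In other words:

  • Is it guaranteed that I can safely convert (cast) between char and unsigned char at will, without any loss of information?

  • Can I safely convert (cast) between pointers to char and unsigned char at will, without any loss of information?

Bonus: Is the answer the same in C and C++?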

Recommended answer

The short answer is yes, if you use an explicit cast. To explain it in detail, there are three aspects to look at:

1) Legality of the conversion
Converting between signed T* and unsigned T* (for some type T) in either direction is generally possible because the source type can first be converted to void * (this is a standard conversion, §4.10), and the void * can then be converted to the destination type using an explicit static_cast (§5.2.9/13):

static_cast<unsigned char*>(static_cast<void *>(data_in))

This can be abbreviated (§5.2.10/7) as

reinterpret_cast<unsigned char *>(data_in)

because char is a standard-layout type (§3.9.1/7,8 and §3.9/9) and signedness does not change alignment (§3.9.1/1). It can also be written as a C-style cast:

(unsigned char *)(data_in)

Again, this works both ways, from unsigned* to signed* and back. There is also a guarantee that if you apply this procedure one way and then back, the pointer value (i.e. the address it points to) won't have changed (§5.2.10/7).

All of this applies not only to conversions between signed char * and unsigned char *, but also to char */unsigned char * and char */signed char *, respectively. (char, signed char and unsigned char are formally three distinct types, §3.9.1/1.)

To be clear, it doesn't matter which of the three cast methods you use, but you must use one. Merely passing the pointer will not work: the conversion, while legal, is not a standard conversion, so it won't be performed implicitly (the compiler will issue an error if you try).
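As a minimal sketch of the three spellings described above: the body of `process` below is a hypothetical stand-in (the question only gives its signature) that adds up the bytes it receives, and `g_last_sum` and the `pass_with_*` helpers are illustrative names, not part of the original post.

```cpp
// Hypothetical stand-in for the function from the question. The original
// post only shows the signature; this body just sums the bytes it is given
// and records the result in a global so callers can inspect it.
static long g_last_sum = 0;

void process(char *data_in, int data_len) {
    // View the same bytes as unsigned char so each one reads as 0..UCHAR_MAX.
    const unsigned char *bytes = reinterpret_cast<const unsigned char *>(data_in);
    long sum = 0;
    for (int i = 0; i < data_len; ++i) {
        sum += bytes[i];
    }
    g_last_sum = sum;
}

// Spelling 1: go through void * with two static_casts.
long pass_with_static_cast(unsigned char *buf, int len) {
    process(static_cast<char *>(static_cast<void *>(buf)), len);
    return g_last_sum;
}

// Spelling 2: a single reinterpret_cast.
long pass_with_reinterpret_cast(unsigned char *buf, int len) {
    process(reinterpret_cast<char *>(buf), len);
    return g_last_sum;
}

// Spelling 3: a C-style cast.
long pass_with_c_cast(unsigned char *buf, int len) {
    process((char *)buf, len);
    return g_last_sum;
}
```

All three helpers hand the same address to `process`. Calling `process(buf, len)` directly with an `unsigned char *` argument is the form that does not compile in C++.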

2) Well-definedness of the access to the values
What happens if, inside the function, you dereference the pointer, i.e. you perform *data_in to retrieve a glvalue for the underlying character? Is this well-defined and legal? The relevant rule here is the strict-aliasing rule (§3.10/10):

If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined:

  • [...]
  • a type that is the signed or unsigned type corresponding to the dynamic type of the object,
  • [...]
  • a char or unsigned char type.

Therefore, accessing a signed char (or char) through an unsigned char* (or char*) and vice versa is not disallowed by this rule; you should be able to do this without problems.
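To illustrate the kind of access this rule permits, here is a small sketch that reads a char buffer through an unsigned char pointer. The function name `count_high_bytes` and the 0x80 threshold are illustrative choices, not something from the original answer.

```cpp
#include <cstddef>

// Counts how many bytes in a char buffer have a value of 0x80 or above when
// read as unsigned char. Reading char objects through an unsigned char
// glvalue is one of the accesses the aliasing rule explicitly allows.
std::size_t count_high_bytes(const char *data, std::size_t len) {
    const unsigned char *bytes = reinterpret_cast<const unsigned char *>(data);
    std::size_t count = 0;
    for (std::size_t i = 0; i < len; ++i) {
        if (bytes[i] >= 0x80) {
            ++count;
        }
    }
    return count;
}
```

The comparison is done on the unsigned view, so it does not depend on whether plain char is signed on the platform.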

3) Resulting values
After dereferencing the type-converted pointer, will you be able to work with the value you get? It's important to bear in mind that the conversion and dereferencing of the pointer described above amounts to reinterpreting (not changing!) the bit pattern stored at the address of the character. So what happens when a bit pattern for a signed character is interpreted as that of an unsigned character (or vice versa)?

When going from unsigned to signed, the typical effect (with 8-bit bytes and two's-complement representation) is that values from 0 to 127 are unchanged and values of 128 and above become negative. The reverse is similar: when going from signed to unsigned, negative values appear as values of 128 or greater.

But this behaviour isn't actually guaranteed by the Standard. The only thing the Standard guarantees is that for all three types, char, unsigned char and signed char, all bits (not necessarily 8, btw) are used for the value representation. So if you interpret one as the other, make a few copies and then store it back to the original location, you can be sure that there will be no information loss (as you required), but you won't necessarily know what the values actually mean (at least not in a fully portable way).
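A rough sketch of both points, with illustrative helper names that are not from the original answer: `round_trip_preserves_bytes` copies a buffer through a char view and back and checks that the bytes are unchanged, while `as_signed_value` shows what a single byte looks like when read as signed char. On a typical platform with 8-bit bytes and two's-complement representation, one would expect 200 to read back as -56, but that specific mapping is the platform-dependent part.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Copies src into a char buffer via a char view, copies that back into an
// unsigned char buffer, and reports whether every byte survived unchanged.
bool round_trip_preserves_bytes(const unsigned char *src, std::size_t len) {
    if (len == 0) {
        return true;  // nothing to copy, nothing to lose
    }
    std::vector<char> as_char(len);
    std::vector<unsigned char> back(len);

    const char *view = reinterpret_cast<const char *>(src);
    for (std::size_t i = 0; i < len; ++i) {
        as_char[i] = view[i];  // copy each byte as a plain char
    }

    const unsigned char *view_back =
        reinterpret_cast<const unsigned char *>(as_char.data());
    for (std::size_t i = 0; i < len; ++i) {
        back[i] = view_back[i];  // copy each byte back as unsigned char
    }

    return std::memcmp(src, back.data(), len) == 0;
}

// Returns the value seen when the same byte is read as signed char.
// The result for bytes above SCHAR_MAX depends on the platform's
// representation of signed values.
int as_signed_value(unsigned char byte) {
    return *reinterpret_cast<signed char *>(&byte);
}
```

The round-trip check is the portable part. The numeric value returned by `as_signed_value` for large bytes is what the paragraph above warns about.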
