About the use of signed integers in C family of languages



When using integer values in my own code, I always try to consider the signedness, asking myself if the integer should be signed or unsigned.

When I'm sure the value will never need to be negative, I then use an unsigned integer.
And I have to say this happens most of the time.

When reading other peoples' code, I rarely see unsigned integers, even if the represented value can't be negative.

So I asked myself: «is there a good reason for this, or do people just use signed integers because they don't care»?

I've searched on the subject, here and in other places, and I have to say I can't find a good reason not to use unsigned integers, when it applies.

I came across those questions: «Default int type: Signed or Unsigned?», and «Should you always use 'int' for numbers in C, even if they are non-negative?» which both present the following example:

for( unsigned int i = foo.Length() - 1; i >= 0; --i ) {}

To me, this is just bad design. Of course, it may result in an infinite loop, with unsigned integers.
But is it so hard to check if foo.Length() is 0, before the loop?

So I personally don't think this is a good reason for using signed integers all the way.

Some people may also say that signed integers may be useful, even for non-negative values, to provide an error flag, usually -1.

Ok, it's good to have a specific value that means «error».
But then, what's wrong with something like UINT_MAX, for that specific value?

I'm actually asking this question because it may lead to some huge problems, usually when using third-party libraries.

In such a case, you often have to deal with signed and unsigned values.

Most of the time, people just don't care about the signedness, and just assign, for instance, an unsigned int to a signed int, without checking the range.

I have to say I'm a bit paranoid with the compiler warning flags, so with my setup, such an implicit cast will result in a compiler error.

For that kind of stuff, I usually use a function or macro to check the range, and then assign using an explicit cast, raising an error if needed.

This just seems logical to me.

As a last example, as I'm also an Objective-C developer (note that this question is not related to Objective-C only):

- ( NSInteger )tableView: ( UITableView * )tableView numberOfRowsInSection: ( NSInteger )section;

For those not fluent with Objective-C, NSInteger is a signed integer.
This method actually retrieves the number of rows in a table view, for a specific section.

The result will never be a negative value (and neither will the section number, by the way).

So why use a signed integer for this?
I really don't understand.

This is just an example, but I just always see that kind of stuff, with C, C++ or Objective-C.

So again, I'm just wondering if people just don't care about that kind of problems, or if there is finally a good and valid reason not to use unsigned integers for such cases.

Looking forward to hearing your answers :)

Solution

When using integer values in my own code, I always try to consider the signedness, asking myself if the integer should be signed or unsigned.

When I'm sure the value will never need to be negative, I then use an unsigned integer. And I have to say this happens most of the time.

To carefully consider which type that is most suitable each time you declare a variable is very good practice! This means you are careful and professional. You should not only consider signedness, but also the potential max value that you expect this type to have.

The reason why you shouldn't use signed types when they aren't needed has nothing to do with performance, but with type safety. There are lots of potential, subtle bugs that can be caused by signed types:

  • The various forms of implicit promotions that exist in C can cause your type to change signedness in unexpected and possibly dangerous ways: the integer promotion rule that is part of the usual arithmetic conversions, the lvalue conversion upon assignment, the default argument promotions used by, for example, variadic functions (VA lists), and so on.

  • When using any form of bitwise operators or similar hardware-related programming, signed types are dangerous and can easily cause various forms of undefined behavior.

By declaring your integers unsigned, you automatically skip past a whole lot of the above dangers. Similarly, by declaring them as large as unsigned int or larger, you get rid of lots of dangers caused by the integer promotions.

Both size and signedness are important when it comes to writing rugged, portable and safe code. This is the reason why you should always use the types from stdint.h and not the native, so-called "primitive data types" of C.


So I asked myself: «is there a good reason for this, or do people just use signed integers because they don't care»?

I don't really think it is because they don't care, nor because they are lazy, even though declaring everything int is sometimes referred to as "sloppy typing" - which means sloppily picked type more than it means too lazy to type.

I rather believe it is because they lack deeper knowledge of the various things I mentioned above. There's a frightening number of seasoned C programmers who don't know how implicit type promotions work in C, nor how signed types can cause poorly-defined behavior when used together with certain operators.

This is actually a very frequent source of subtle bugs. Many programmers find themselves staring at a compiler warning or a peculiar bug which they can make go away by adding a cast. But they don't understand why; they simply add the cast and move on.


for( unsigned int i = foo.Length() - 1; i >= 0; --i ) {}

To me, this is just bad design

Indeed it is.

Once upon a time, down-counting loops would yield more effective code, because the compiler would pick a "branch if zero" instruction instead of a "branch if larger/smaller/equal" instruction - the former is faster. But this was at a time when compilers were really dumb and I don't believe such micro-optimizations are relevant any longer.

So there is rarely ever a reason to have a down-counting loop. Whoever made the argument probably just couldn't think outside the box. The example could have been rewritten as:

for(unsigned int i=0; i<foo.Length(); i++)
{
  unsigned int index = foo.Length() - i - 1;
  thing[index] = something;
}

This code should not have any impact on performance, but the loop itself becomes a whole lot easier to read, while at the same time fixing the bug that your example had.

As far as performance is concerned nowadays, one should probably spend the time pondering which form of data access is most ideal in terms of data cache use, rather than anything else.


Some people may also say that signed integers may be useful, even for non-negative values, to provide an error flag, usually -1.

That's a poor argument. Good API design uses a dedicated error type for error reporting, such as an enum.

Instead of having some hobbyist-level API like

int do_stuff (int a, int b); // returns -1 if a or b were invalid, otherwise the result

you should have something like:

err_t do_stuff (int32_t a, int32_t b, int32_t* result);

// returns ERR_A if a is invalid, ERR_B if b is invalid, ERR_XXX if... and so on
// the result is stored in [result], which is allocated by the caller
// upon errors the contents of [result] remain untouched

The API would then consistently reserve the return value of every function for this error type.

(And yes, many of the standard library functions abuse return types for error handling. This is because it contains lots of ancient functions from a time before good programming practice was invented, and they have been preserved the way they are for backwards-compatibility reasons. So just because you find a poorly-written function in the standard library, you shouldn't run off to write an equally poor function yourself.)


Overall, it sounds like you know what you are doing and giving signedness some thought. That probably means that knowledge-wise, you are actually already ahead of the people who wrote those posts and guides you are referring to.

The Google style guide for example, is questionable. Similar could be said about lots of other such coding standards that use "proof by authority". Just because it says Google, NASA or Linux kernel, people blindly swallow them no matter the quality of the actual contents. There are good things in those standards, but they also contain subjective opinions, speculations or blatant errors.

Instead I would recommend referring to real professional coding standards instead, such as MISRA-C. It enforces lots of thought and care for things like signedness, type promotion and type size, where less detailed/less serious documents just skip past it.

There is also CERT C, which isn't as detailed and careful as MISRA, but at least a sound, professional document (and more focused towards desktop/hosted development).
