isdigit(c) - a char or int type?

Question

I have written the following code to test whether the given input is a digit.

#include<iostream>
#include<ctype.h>
#include<stdio.h>
using namespace std;

int main()
{
    char c;

    cout<<"Please enter a digit: ";
    cin>>c;

    if(isdigit(c)) //int isdigit(int c) or char isdigit(char c)
    {
        cout<<"You entered a digit"<<endl;
    }
    else
    {
        cout<<"You entered a non-digit value"<<endl;
    }
}      

My question is: what type should the input variable be? char or int?

Recommended answer

The situation is unfortunately a bit more complex than the other answers suggest.

First of all: the first part of your code is correct (disregarding multi-byte encodings); to read a single char with cin, you have to use a char variable with the >> operator.

Now, about isdigit: why does it take an int instead of a char?

It all comes from C: isdigit and its companions were designed to be used with functions like getchar(), which read a character from a stream and return an int. That, in turn, lets a single return value carry either a character or an error code: getchar() can return EOF (a macro that expands to an implementation-defined negative constant) to signal that the input stream has ended.
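As a minimal sketch of that pairing, the loop below reads a C stream with getc() and passes each result straight to isdigit. The function name count_digits is invented for this illustration; the key detail is that the variable receiving the result is an int, so it can hold every character value as well as EOF.

```cpp
#include <cctype>
#include <cstdio>

// Counts the decimal digits in a C stream.
// count_digits is a hypothetical name used only for this example.
int count_digits(std::FILE* in)
{
    int count = 0;
    int ch; // int, not char: must hold every unsigned char value plus EOF
    while ((ch = std::getc(in)) != EOF) {
        // getc returns the character as an unsigned char converted to int,
        // so the value can be handed to isdigit without any cast.
        if (std::isdigit(ch)) {
            ++count;
        }
    }
    return count;
}
```

Calling count_digits(stdin) would count the digits typed on standard input until end of file.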

So, the basic idea is: negative = error code; non-negative = actual character code.

Unfortunately, this poses interoperability problems with "regular" chars.

Short digression: char is ultimately just an integral type with a very small range, but a particularly stupid one. On most occasions - when working with bytes or character codes - you'd want it to be unsigned by default; OTOH, for consistency with the other integral types (int, short, long, ...), you could argue that plain char should be signed. The Standard chose the most stupid way: plain char is either signed or unsigned, depending on whatever the compiler implementor decides [1].

So, you have to be prepared for char being either signed or unsigned; in many implementations (the usual x86 compilers, for instance) it is signed by default, which poses a problem with the getchar() arrangement above.
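If you want to see which choice your own compiler made, the standard library exposes it. The sketch below is only an illustration (the function name plain_char_is_signed is made up); it relies on std::numeric_limits and on CHAR_MIN, which is 0 when plain char is unsigned and negative when it is signed.

```cpp
#include <climits>
#include <limits>

// True when plain char is a signed type on this implementation.
// plain_char_is_signed is a hypothetical name used only for this example.
constexpr bool plain_char_is_signed()
{
    return std::numeric_limits<char>::is_signed;
}

// CHAR_MIN is 0 for an unsigned plain char and negative for a signed one,
// so the two ways of asking the question always agree.
static_assert(plain_char_is_signed() == (CHAR_MIN < 0),
              "numeric_limits and CHAR_MIN should agree");
```

Portable code should not depend on the answer; the check is mostly useful for understanding what a given toolchain does.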

If char is signed and is used to read bytes, every byte with the high bit set (that is, a byte that would be >127 if read with an unsigned 8-bit type) turns out negative. This obviously clashes with getchar() using negative values for EOF - an actual "negative" character could be indistinguishable from EOF.
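The overlap can be sketched with three small helpers (all names are invented for this example). signed char is used explicitly so the behaviour does not depend on the signedness of plain char. On implementations where EOF is -1, which is common but not guaranteed, the byte 0xFF widened directly compares equal to EOF, whereas the same byte converted through unsigned char never can, because EOF is negative.

```cpp
#include <cstdio>

// Widening a signed char to int directly sign-extends it, so bytes with
// the high bit set become negative ints.
int widen_directly(signed char c)
{
    return c;
}

// Converting through unsigned char first yields a value in 0..UCHAR_MAX.
int widen_via_unsigned(signed char c)
{
    return static_cast<unsigned char>(c);
}

// True when an int value cannot be told apart from the EOF marker.
bool collides_with_eof(int value)
{
    return value == EOF;
}
```

For a signed char holding -1 (the byte 0xFF on 8-bit-char platforms), widen_directly gives -1, while widen_via_unsigned gives a non-negative value that cannot collide with EOF.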

So, when C functions talk about receiving or returning characters in int variables, the contract is always that the character is a char that has been converted to unsigned char (so that it is never negative; negative values wrap around into the top half of the unsigned range) and then stored in an int. Which brings us back to isdigit, which, along with its companion functions, has this contract as well:

The header <ctype.h> declares several functions useful for classifying and mapping characters. In all cases the argument is an int, the value of which shall be representable as an unsigned char or shall equal the value of the macro EOF. If the argument has any other value, the behavior is undefined.

(C99, §7.4, ¶1)

Long story short: your if should at the very least be:

if(isdigit((unsigned char)c))
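If the cast is needed in more than one place, it can be wrapped once in a small helper. This is just a sketch, and the name is_digit_char is made up for illustration; it uses static_cast, the C++ spelling of the same conversion.

```cpp
#include <cctype>

// Safe to call with any char value, whatever the signedness of plain char.
// is_digit_char is a hypothetical name used only for this example.
bool is_digit_char(char c)
{
    // Convert to unsigned char first so the int handed to isdigit is never
    // negative, as the <ctype.h> contract requires.
    return std::isdigit(static_cast<unsigned char>(c)) != 0;
}
```

With this helper the test in the question would read if(is_digit_char(c)).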

The problem is not just a theoretical one: several widespread C library implementations use the provided value directly as an index into a lookup table, so a negative value indexes outside the table, which may read unrelated memory or segfault your program.

Also, you are not taking into account that the stream may be closed or at end of input, in which case >> returns without touching your variable (which is left uninitialized); you should therefore check that the extraction succeeded before working on c.

  1. Of course this is a bit of an unfair rant; as @Pete Becker noted in the comment below, it's not that they were all morons, but that the standard mostly tried to be compatible with existing implementations, which were probably evenly split between unsigned and signed char. Traces of this split can be found in most modern compilers, which can generally change the signedness of char through command-line options (-fsigned-char/-funsigned-char for gcc/clang, /J in VC++).
