Calculate the number of times each letter appears in a string


Question

I've been playing around with some old code, and I came across a function that I made a while ago that calculates the number of times each alphabetical letter appears in a given string. In my initial function, I would loop through the string 26 times counting the number of times each letter appears as it loops through. However, I knew that was really inefficient, so instead I tried to do this:

int *frequency_table(char *string) { 
    int i;
    char c;
    int *freqCount = NULL;
    freqCount = mallocPtr(freqCount, 26, sizeof(int), "freqCount"); /* mallocs and checks for out of memory */

    for (i = 0; string[i] != '\0'; i++) {
        c = string[i];
        if (isalpha(c)) {
            isupper(c) ? freqCount[c - 65]++ : freqCount[c - 97]++;
        }
    }

    return (freqCount);
}

The code above loops through a string and checks each character. If the character is an alphabetic letter (a-z or A-Z), then I increment the frequency count at a specific index in the freqCount array (where index 0 = a\A, 1 = b\B, ... , 25 = z\Z).

The code seems to be counting fine, but when I print the array, I get the following output:

String: "abcdefghijklmnopqrstuvwxyziii"

a/A     -1276558703
b/B     32754
c/C     -1276558703
d/D     32754
e/E     862570673
f/F     21987
g/G     862570673
h/H     21987
i/I     4
j/J     1
k/K     1
l/L     1
m/M     1
n/N     1
o/O     1
p/P     1
q/Q     1
r/R     1
s/S     1
t/T     1
u/U     1
v/V     1
w/W     1
x/X     1
y/Y     1
z/Z     1

For reference, I'm printing the array in the following manner:

for (i = 0; i < 26; i++) {
     printf("%c/%c     %d\n", i + 97, i + 65, freqCount[i]);
}

I checked to make sure that the pointer was allocated properly, and I know for sure I didn't overwrite this memory location. Maybe I'm missing something, but I really can't figure out why it's printing garbage memory values for a/A through h/H.

Also, if there is a more efficient way to do what I'm trying to do, I'd love to hear it.

Thanks

Recommended answer

There are 2 problems in your code:

  • The array freqCount is uninitialized: memory returned by malloc has indeterminate contents, so the counters do not start at 0.
  • You should avoid passing char values to isalpha, because it causes undefined behavior if string contains negative char values on systems where char is signed by default.
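Both points can be illustrated with a minimal sketch. It assumes the standard calloc is an acceptable replacement for the asker's mallocPtr helper (whose definition is not shown), and the names new_zeroed_table and is_letter are hypothetical, chosen only for this example:

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper: returns a table of 26 counters that all start at 0,
   or NULL if the allocation fails. calloc zero-fills the memory it returns,
   whereas malloc leaves the contents indeterminate. */
int *new_zeroed_table(void) {
    return calloc(26, sizeof(int));
}

/* Hypothetical helper: wraps isalpha for plain char values. The cast to
   unsigned char keeps the argument in the range the <ctype.h> functions
   accept, even on systems where char is signed. Returns 1 or 0. */
int is_letter(char ch) {
    return isalpha((unsigned char)ch) != 0;
}
```

With a table obtained this way, the explicit zeroing loop is no longer needed, and the caller is still responsible for calling free on the result.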

Instead of a ternary operator or an if statement, you can use toupper() to convert lowercase characters to uppercase, and it is more readable to write 'A' or 'a' instead of their hard-coded ASCII values 65 and 97.
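As a small sketch of that suggestion, the index computation can be isolated in one function. The name letter_index is hypothetical, and the code assumes an ASCII-compatible character set in which 'A' through 'Z' are contiguous:

```c
#include <ctype.h>

/* Hypothetical helper: maps a letter to an index in 0..25
   ('a' and 'A' both give 0, 'z' and 'Z' both give 25),
   or returns -1 for any character that is not a letter. */
int letter_index(char ch) {
    unsigned char c = (unsigned char)ch;  /* avoid negative values in <ctype.h> calls */
    int idx;

    if (!isalpha(c)) {
        return -1;
    }
    idx = toupper(c) - 'A';
    /* Guard against locales where isalpha accepts letters outside A-Z. */
    return (idx >= 0 && idx < 26) ? idx : -1;
}
```

Writing 'A' rather than 65 makes the intent visible at the call site and removes the need for separate uppercase and lowercase branches.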

Here is a corrected version:

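#include <ctype.h>   /* isalpha, toupper */
#include <stddef.h>  /* size_t */

int *frequency_table(const char *string) {
    size_t i;

    /* mallocPtr is the asker's helper: it allocates the array with malloc
       and checks for out of memory */
    int *freqCount = NULL;
    freqCount = mallocPtr(freqCount, 26, sizeof(int), "freqCount");

    /* malloc does not zero the memory, so clear every counter first */
    for (i = 0; i < 26; i++) {
        freqCount[i] = 0;
    }
    for (i = 0; string[i] != '\0'; i++) {
        /* unsigned char keeps isalpha and toupper from seeing a negative value */
        unsigned char c = string[i];
        if (isalpha(c)) {
            /* this code assumes ASCII, so 'Z'-'A' == 25 */
            freqCount[toupper(c) - 'A']++;
        }
    }
    return freqCount;
}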
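The question also asks whether there is a more efficient approach. The single pass above is already linear in the length of the string. One possible variation, not part of the original answer, is to let the caller supply the array so that no heap allocation is needed at all. The following is a sketch under that assumption, with the hypothetical names count_letters and LETTER_COUNT:

```c
#include <ctype.h>
#include <stddef.h>

#define LETTER_COUNT 26

/* Hypothetical variant: fills a caller-provided array with letter counts.
   Index 0 holds the count for a/A, index 25 the count for z/Z.
   Assumes an ASCII-compatible character set. */
void count_letters(const char *s, int counts[LETTER_COUNT]) {
    size_t i;

    /* Start every counter at 0 so no stale values survive. */
    for (i = 0; i < LETTER_COUNT; i++) {
        counts[i] = 0;
    }
    for (i = 0; s[i] != '\0'; i++) {
        unsigned char c = (unsigned char)s[i];
        if (isalpha(c)) {
            int idx = toupper(c) - 'A';
            /* Skip letters outside A-Z that some locales may accept. */
            if (idx >= 0 && idx < LETTER_COUNT) {
                counts[idx]++;
            }
        }
    }
}
```

A caller could declare `int counts[LETTER_COUNT];`, call `count_letters("abcdefghijklmnopqrstuvwxyziii", counts);`, and then print the table with the same loop used in the question. There is nothing to free afterwards, at the cost of the caller having to provide storage of the right size.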
